date:20171113

[jira] [Comment Edited] (HIVE-17856) MM tables - IOW is not ACID compliant

2017-11-13 Thread Steve Yeom (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251038#comment-16251038
 ] 

Steve Yeom edited comment on HIVE-17856 at 11/14/17 7:42 AM:
-

Again, using a unit test context along with FILE_OP_LOGGER seems to make it 
easier to classify the failures of the mm_all.q and mm_loaddata.q tests, that 
is, to tell whether each one is a real failure or an indicator of a fixed bug.


was (Author: steveyeom2017):
Again, using a unit test context seems to make it easier to classify the 
failures of the mm_all.q and mm_loaddata.q tests, that is, to tell whether 
each one is a real failure or an indicator of a fixed bug.

> MM tables - IOW is not ACID compliant
> -
>
> Key: HIVE-17856
> URL: https://issues.apache.org/jira/browse/HIVE-17856
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Steve Yeom
>  Labels: mm-gap-1
> Attachments: HIVE-17856.1.patch, HIVE-17856.2.patch, 
> HIVE-17856.3.patch, HIVE-17856.4.patch, HIVE-17856.5.patch, 
> HIVE-17856.6.patch, HIVE-17856.7.patch, HIVE-17856.8.patch
>
>
> The following tests were removed from mm_all during "integration"... I should 
> never have allowed that manner of integration.
> MM logic should have been kept intact until ACID logic could catch up. Alas, 
> here we are.
> {noformat}
> drop table iow0_mm;
> create table iow0_mm(key int) tblproperties("transactional"="true", 
> "transactional_properties"="insert_only");
> insert overwrite table iow0_mm select key from intermediate;
> insert into table iow0_mm select key + 1 from intermediate;
> select * from iow0_mm order by key;
> insert overwrite table iow0_mm select key + 2 from intermediate;
> select * from iow0_mm order by key;
> drop table iow0_mm;
> drop table iow1_mm; 
> create table iow1_mm(key int) partitioned by (key2 int)  
> tblproperties("transactional"="true", 
> "transactional_properties"="insert_only");
> insert overwrite table iow1_mm partition (key2)
> select key as k1, key from intermediate union all select key as k1, key from 
> intermediate;
> insert into table iow1_mm partition (key2)
> select key + 1 as k1, key from intermediate union all select key as k1, key 
> from intermediate;
> select * from iow1_mm order by key, key2;
> insert overwrite table iow1_mm partition (key2)
> select key + 3 as k1, key from intermediate union all select key + 4 as k1, 
> key from intermediate;
> select * from iow1_mm order by key, key2;
> insert overwrite table iow1_mm partition (key2)
> select key + 3 as k1, key + 3 from intermediate union all select key + 2 as 
> k1, key + 2 from intermediate;
> select * from iow1_mm order by key, key2;
> drop table iow1_mm;
> {noformat}
> {noformat}
> drop table simple_mm;
> create table simple_mm(key int) stored as orc tblproperties 
> ("transactional"="true", "transactional_properties"="insert_only");
> insert into table simple_mm select key from intermediate;
> -insert overwrite table simple_mm select key from intermediate;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Commented] (HIVE-17856) MM tables - IOW is not ACID compliant

2017-11-13 Thread Steve Yeom (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251038#comment-16251038
 ] 

Steve Yeom commented on HIVE-17856:
---

Again, using a unit test context seems to make it easier to classify the 
failures of the mm_all.q and mm_loaddata.q tests, that is, to tell whether 
each one is a real failure or an indicator of a fixed bug.

> MM tables - IOW is not ACID compliant
> -
>
> Key: HIVE-17856
> URL: https://issues.apache.org/jira/browse/HIVE-17856
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Steve Yeom
>  Labels: mm-gap-1
> Attachments: HIVE-17856.1.patch, HIVE-17856.2.patch, 
> HIVE-17856.3.patch, HIVE-17856.4.patch, HIVE-17856.5.patch, 
> HIVE-17856.6.patch, HIVE-17856.7.patch, HIVE-17856.8.patch
>
>
> The following tests were removed from mm_all during "integration"... I should 
> never have allowed that manner of integration.
> MM logic should have been kept intact until ACID logic could catch up. Alas, 
> here we are.
> {noformat}
> drop table iow0_mm;
> create table iow0_mm(key int) tblproperties("transactional"="true", 
> "transactional_properties"="insert_only");
> insert overwrite table iow0_mm select key from intermediate;
> insert into table iow0_mm select key + 1 from intermediate;
> select * from iow0_mm order by key;
> insert overwrite table iow0_mm select key + 2 from intermediate;
> select * from iow0_mm order by key;
> drop table iow0_mm;
> drop table iow1_mm; 
> create table iow1_mm(key int) partitioned by (key2 int)  
> tblproperties("transactional"="true", 
> "transactional_properties"="insert_only");
> insert overwrite table iow1_mm partition (key2)
> select key as k1, key from intermediate union all select key as k1, key from 
> intermediate;
> insert into table iow1_mm partition (key2)
> select key + 1 as k1, key from intermediate union all select key as k1, key 
> from intermediate;
> select * from iow1_mm order by key, key2;
> insert overwrite table iow1_mm partition (key2)
> select key + 3 as k1, key from intermediate union all select key + 4 as k1, 
> key from intermediate;
> select * from iow1_mm order by key, key2;
> insert overwrite table iow1_mm partition (key2)
> select key + 3 as k1, key + 3 from intermediate union all select key + 2 as 
> k1, key + 2 from intermediate;
> select * from iow1_mm order by key, key2;
> drop table iow1_mm;
> {noformat}
> {noformat}
> drop table simple_mm;
> create table simple_mm(key int) stored as orc tblproperties 
> ("transactional"="true", "transactional_properties"="insert_only");
> insert into table simple_mm select key from intermediate;
> -insert overwrite table simple_mm select key from intermediate;
> {noformat}




[jira] [Comment Edited] (HIVE-10179) Optimization for SIMD instructions in Hive

2017-11-13 Thread liyunzhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251015#comment-16251015
 ] 

liyunzhang edited comment on HIVE-10179 at 11/14/17 7:28 AM:
-

[~teddy.choi]: I want to ask a question about 
[DoubleColAddRepeatingDoubleColumnBench|https://github.com/apache/hive/blob/master/itests/hive-jmh/src/main/java/org/apache/hive/benchmark/vectorization/VectorizedArithmeticBench.java#L53].
Why should we test {{DoubleColAddRepeatingDoubleColumnBench}}? In my view, 
this test computes col1+col2 where all the elements in col2 are the same. Is 
there any difference between {{DoubleColAddRepeatingDoubleColumnBench}} and 
{{DoubleColAddDoubleColumnBench}} in terms of SIMD instructions?
I added some code to VectorizedArithmeticBench.java, as follows:
{code}
  public static class DoubleColAddDoubleColumnBench extends AbstractExpression {
    @Override
    public void setup() {
      rowBatch = buildRowBatch(new DoubleColumnVector(), 2, getDoubleColumnVector(),
          getDoubleColumnVector());
      expression = new DoubleColAddDoubleColumn(0, 1, 2);
    }
  }
{code}

After testing {{DoubleColAddDoubleColumnBench}} and 
{{DoubleColAddRepeatingDoubleColumnBench}}, I found:
|| ||AVX1||AVX2||perf improvement||
|DoubleColAddDoubleColumnBench|150709|159073|5%|
|DoubleColAddRepeatingDoubleColumnBench|111093|95520|14%|

It is very interesting that there is a great improvement on 
{{DoubleColAddRepeatingDoubleColumnBench}} while there is no obvious 
improvement on {{DoubleColAddDoubleColumnBench}}.
I guess the goal of adding {{DoubleColAddRepeatingDoubleColumnBench}} is to 
test whether SIMD instructions bring a benefit when a constant value is added 
to a vector. Is my understanding right?
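
To make the comparison concrete, here is a minimal plain-Java sketch of the two loop shapes being benchmarked. It is not Hive's code: the class and method names are hypothetical, and it assumes that a "repeating" column is one whose first element stands for every row. Whether the JIT compiler emits SIMD instructions for either loop depends on the JVM and CPU, so the sketch only shows what each variant computes.

```java
import java.util.Arrays;

// Hypothetical sketch, not Hive's DoubleColAddDoubleColumn implementation.
final class AddKernelsSketch {

  // Column + column: both operands are read element by element.
  static void addColCol(double[] a, double[] b, double[] out, int n) {
    for (int i = 0; i < n; i++) {
      out[i] = a[i] + b[i];
    }
  }

  // Column + repeating column: element 0 of the second operand is assumed to
  // stand for every row, so it is read once and reused inside the loop.
  static void addColRepeating(double[] a, double[] repeating, double[] out, int n) {
    final double value = repeating[0];
    for (int i = 0; i < n; i++) {
      out[i] = a[i] + value;
    }
  }

  public static void main(String[] args) {
    double[] a = {1.0, 2.0, 3.0};
    double[] out = new double[a.length];

    addColCol(a, new double[] {10.0, 20.0, 30.0}, out, a.length);
    System.out.println(Arrays.toString(out));

    addColRepeating(a, new double[] {10.0}, out, a.length);
    System.out.println(Arrays.toString(out));
  }
}
```

The second loop reads only one input array, which is one plausible reason the two benchmarks could behave differently, but that is a guess rather than a measured explanation.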


was (Author: kellyzly):
[~teddy.choi]: I want to ask a question about 
[DoubleColAddRepeatingDoubleColumnBench|https://github.com/apache/hive/blob/master/itests/hive-jmh/src/main/java/org/apache/hive/benchmark/vectorization/VectorizedArithmeticBench.java#L53].
Why should we test {{DoubleColAddRepeatingDoubleColumnBench}}? In my view, 
this test computes col1+col2 where all the elements in col2 are the same. Is 
there any difference between {{DoubleColAddRepeatingDoubleColumnBench}} and 
{{DoubleColAddDoubleColumnBench}} in terms of SIMD instructions?
I added some code to VectorizedArithmeticBench.java, as follows:
{code}
  public static class DoubleColAddDoubleColumnBench extends AbstractExpression {
    @Override
    public void setup() {
      rowBatch = buildRowBatch(new DoubleColumnVector(), 2, getDoubleColumnVector(),
          getDoubleColumnVector());
      expression = new DoubleColAddDoubleColumn(0, 1, 2);
    }
  }
{code}

After testing {{DoubleColAddDoubleColumnBench}} and 
{{DoubleColAddRepeatingDoubleColumnBench}}, I found:
|| ||AVX1||AVX2||perf improvement||
|DoubleColAddDoubleColumnBench|159588|158131|0.9%|
|DoubleColAddRepeatingDoubleColumnBench|111093|95520|14%|

It is very interesting that there is a great improvement on 
{{DoubleColAddRepeatingDoubleColumnBench}} while there is no obvious 
improvement on {{DoubleColAddDoubleColumnBench}}.
I guess the goal of adding {{DoubleColAddRepeatingDoubleColumnBench}} is to 
test whether SIMD instructions bring a benefit when a constant value is added 
to a vector. Is my understanding right?

> Optimization for SIMD instructions in Hive
> --
>
> Key: HIVE-10179
> URL: https://issues.apache.org/jira/browse/HIVE-10179
> Project: Hive
>  Issue Type: Improvement
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
>  Labels: optimization
>
> [SIMD|http://en.wikipedia.org/wiki/SIMD] instructions can be found in most 
> current CPUs, such as Intel's SSE2, SSE3, SSE4.x, AVX and AVX2, and it 
> would improve Hive's performance if we could vectorize the mathematical 
> manipulation part of Hive. This umbrella JIRA may contain, but is not 
> limited to, subtasks like:
> # Code schema adaptation: the current JVM is quite strict about the code 
> schema that can be transformed into SIMD instructions during execution. 
> # A new implementation of the mathematical manipulation part of Hive, 
> designed to be optimized for SIMD instructions.




[jira] [Assigned] (HIVE-18056) CachedStore: Have a whitelist/blacklist config to allow selective caching of tables/partitions and allow read while prewarming

2017-11-13 Thread Vaibhav Gumashta (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta reassigned HIVE-18056:
---

Assignee: Vaibhav Gumashta

> CachedStore: Have a whitelist/blacklist config to allow selective caching of 
> tables/partitions and allow read while prewarming
> --
>
> Key: HIVE-18056
> URL: https://issues.apache.org/jira/browse/HIVE-18056
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>





[jira] [Commented] (HIVE-10179) Optimization for SIMD instructions in Hive

2017-11-13 Thread liyunzhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251015#comment-16251015
 ] 

liyunzhang commented on HIVE-10179:
---

[~teddy.choi]: I want to ask a question about 
[DoubleColAddRepeatingDoubleColumnBench|https://github.com/apache/hive/blob/master/itests/hive-jmh/src/main/java/org/apache/hive/benchmark/vectorization/VectorizedArithmeticBench.java#L53].
Why should we test {{DoubleColAddRepeatingDoubleColumnBench}}? In my view, 
this test computes col1+col2 where all the elements in col2 are the same. Is 
there any difference between {{DoubleColAddRepeatingDoubleColumnBench}} and 
{{DoubleColAddDoubleColumnBench}} in terms of SIMD instructions?
I added some code to VectorizedArithmeticBench.java, as follows:
{code}
  public static class DoubleColAddDoubleColumnBench extends AbstractExpression {
    @Override
    public void setup() {
      rowBatch = buildRowBatch(new DoubleColumnVector(), 2, getDoubleColumnVector(),
          getDoubleColumnVector());
      expression = new DoubleColAddDoubleColumn(0, 1, 2);
    }
  }
{code}

After testing {{DoubleColAddDoubleColumnBench}} and 
{{DoubleColAddRepeatingDoubleColumnBench}}, I found:
|| ||AVX1||AVX2||perf improvement||
|DoubleColAddDoubleColumnBench|159588|158131|0.9%|
|DoubleColAddRepeatingDoubleColumnBench|111093|95520|14%|

It is very interesting that there is a great improvement on 
{{DoubleColAddRepeatingDoubleColumnBench}} while there is no obvious 
improvement on {{DoubleColAddDoubleColumnBench}}.
I guess the goal of adding {{DoubleColAddRepeatingDoubleColumnBench}} is to 
test whether SIMD instructions bring a benefit when a constant value is added 
to a vector. Is my understanding right?

> Optimization for SIMD instructions in Hive
> --
>
> Key: HIVE-10179
> URL: https://issues.apache.org/jira/browse/HIVE-10179
> Project: Hive
>  Issue Type: Improvement
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
>  Labels: optimization
>
> [SIMD|http://en.wikipedia.org/wiki/SIMD] instructions can be found in most 
> current CPUs, such as Intel's SSE2, SSE3, SSE4.x, AVX and AVX2, and it 
> would improve Hive's performance if we could vectorize the mathematical 
> manipulation part of Hive. This umbrella JIRA may contain, but is not 
> limited to, subtasks like:
> # Code schema adaptation: the current JVM is quite strict about the code 
> schema that can be transformed into SIMD instructions during execution. 
> # A new implementation of the mathematical manipulation part of Hive, 
> designed to be optimized for SIMD instructions.




[jira] [Commented] (HIVE-18002) add group support for pool mappings

2017-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251010#comment-16251010
 ] 

Hive QA commented on HIVE-18002:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12897416/HIVE-18002.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 11384 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] (batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dbtxnmgr_showlocks] (batchId=77)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[ppd_union_view] (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=102)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc] (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] (batchId=111)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=206)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7800/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7800/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7800/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12897416 - PreCommit-HIVE-Build

> add group support for pool mappings
> ---
>
> Key: HIVE-18002
> URL: https://issues.apache.org/jira/browse/HIVE-18002
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18002.01.patch, HIVE-18002.02.patch, 
> HIVE-18002.02.patch, HIVE-18002.patch
>
>





[jira] [Commented] (HIVE-17489) Separate client-facing and server-side Kerberos principals, to support HA

2017-11-13 Thread Lefty Leverenz (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250954#comment-16250954
 ] 

Lefty Leverenz commented on HIVE-17489:
---

Thanks for the doc, [~mithun].  I did some minor editing, documented the other 
new parameter (*hive.server2.authentication.client.kerberos.principal*), and 
added cross-references between them.

Please review and let me know if the cross-references were a good idea or not.  
(Does HA mean High Availability?)

* [hive.metastore.client.kerberos.principal | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.metastore.client.kerberos.principal]
* [hive.server2.authentication.client.kerberos.principal | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.server2.authentication.client.kerberos.principal]

> Separate client-facing and server-side Kerberos principals, to support HA
> -
>
> Key: HIVE-17489
> URL: https://issues.apache.org/jira/browse/HIVE-17489
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Mithun Radhakrishnan
>Assignee: Thiruvel Thirumoolan
> Fix For: 3.0.0, 2.4.0, 2.2.1
>
> Attachments: HIVE-17489.2-branch-2.patch, HIVE-17489.2.patch, 
> HIVE-17489.2.patch, HIVE-17489.3-branch-2.patch, HIVE-17489.3.patch, 
> HIVE-17489.4-branch-2.patch, HIVE-17489.4.patch
>
>
> On deployments of the Hive metastore where a farm of servers is fronted by a 
> VIP, the hostname of the VIP (e.g. {{mycluster-hcat.blue.myth.net}}) will 
> differ from the hostnames of the actual boxen in the farm (e.g. 
> {{mycluster-hcat-\[0..3\].blue.myth.net}}).
> Such a deployment messes up Kerberos auth, with principals like 
> {{hcat/mycluster-hcat.blue.myth@grid.myth.net}}. Host-based checks will 
> disallow servers behind the VIP from using the VIP's hostname in their 
> principals when accessing, say, HDFS.
> The solution would be to decouple the server-side principal (used to access 
> other services like HDFS as a client) from the client-facing principal (used 
> from Hive-client, BeeLine, etc.).
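
The decoupling described above can be sketched in plain Java as a lookup over two settings. This is only an illustration under stated assumptions: the class is hypothetical, a `Map` stands in for the Hive configuration, the host names and realm are made up, and the fallback from the client-facing key to the server-side key is an assumed behaviour, not something confirmed in this thread. The property name {{hive.metastore.client.kerberos.principal}} is the one documented in the comment above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; a Map stands in for the real Hive configuration object.
final class PrincipalSelectionSketch {
  static final String SERVER_KEY = "hive.metastore.kerberos.principal";
  static final String CLIENT_KEY = "hive.metastore.client.kerberos.principal";

  // Principal a client would address. Assumption: when the client-facing
  // principal is unset, the server-side principal is used instead.
  static String clientFacingPrincipal(Map<String, String> conf) {
    String clientFacing = conf.get(CLIENT_KEY);
    if (clientFacing == null || clientFacing.isEmpty()) {
      return conf.get(SERVER_KEY);
    }
    return clientFacing;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    // Made-up values: one box behind the VIP, and the VIP itself.
    conf.put(SERVER_KEY, "hcat/metastore-0.example.com@EXAMPLE.COM");
    conf.put(CLIENT_KEY, "hcat/metastore-vip.example.com@EXAMPLE.COM");
    System.out.println(clientFacingPrincipal(conf));
  }
}
```

With both keys set, each server can keep a principal tied to its own hostname for outbound access, while clients address the VIP's principal.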




[jira] [Commented] (HIVE-17904) handle internal Tez AM restart in registry and WM

2017-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250926#comment-16250926
 ] 

Hive QA commented on HIVE-17904:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12897421/HIVE-17904.05.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 11383 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] (batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dbtxnmgr_showlocks] (batchId=77)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[ppd_union_view] (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_3] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=102)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc] (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] (batchId=111)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=206)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7799/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7799/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7799/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12897421 - PreCommit-HIVE-Build

> handle internal Tez AM restart in registry and WM
> -
>
> Key: HIVE-17904
> URL: https://issues.apache.org/jira/browse/HIVE-17904
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17904.01.patch, HIVE-17904.02.patch, 
> HIVE-17904.03.patch, HIVE-17904.04.patch, HIVE-17904.05.patch, 
> HIVE-17904.patch, HIVE-17904.patch
>
>
> To be done after the plan update patch is committed. The current code doesn't 
> account very well for an internal AM restart; the registry may have races, 
> and an event needs to be added to WM when some AM resets, at least to make 
> sure we discard the update errors that pertain to the old AM. 




[jira] [Updated] (HIVE-15504) ArrayIndexOutOfBoundsException in GenericUDFTrunc::initialize

2017-11-13 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-15504:

   Resolution: Fixed
 Assignee: Rajesh Balamohan
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to master.

> ArrayIndexOutOfBoundsException in GenericUDFTrunc::initialize
> -
>
> Key: HIVE-15504
> URL: https://issues.apache.org/jira/browse/HIVE-15504
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Trivial
> Fix For: 3.0.0
>
> Attachments: HIVE-15504.1.patch
>
>
> SELECT TRUNC(d_date) FROM test_date_dim throws ArrayIndexOutOfBounds 
> exception.
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException: 1
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFTrunc.initialize(GenericUDFTrunc.java:128)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:139)
> at 
> org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:236)
> at 
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:1102)
> at 
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1357)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
> at 
> org.apache.hadoop.hive.ql.lib.ExpressionWalker.walk(ExpressionWalker.java:76)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
> at 
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:227)
> {noformat}
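
The stack trace above points at an array access with index 1 when only one argument was supplied. A minimal sketch of the general guard, reading an optional second argument only after checking the array length, might look like this. It is not the committed patch: the class and method are hypothetical, and plain strings stand in for the object inspectors a real UDF receives.

```java
import java.util.Optional;

// Hypothetical sketch; strings stand in for a UDF's argument inspectors.
final class SecondArgumentSketch {

  // Returns the second argument if one was supplied, without ever indexing
  // past the end of the array.
  static Optional<String> secondArgument(String[] arguments) {
    if (arguments == null || arguments.length == 0) {
      throw new IllegalArgumentException("at least one argument is required");
    }
    return arguments.length > 1
        ? Optional.ofNullable(arguments[1])
        : Optional.empty();
  }

  public static void main(String[] args) {
    // One argument, as in TRUNC(d_date): no exception, just an empty result.
    System.out.println(secondArgument(new String[] {"d_date"}).isPresent());
    // Two arguments: the second one is returned.
    System.out.println(secondArgument(new String[] {"d_date", "MM"}).get());
  }
}
```

What the function should do when the second argument is absent (reject the call with a clear message, or apply a default) is a separate decision that this sketch leaves open.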




[jira] [Updated] (HIVE-15739) Incorrect exception message in PartExprEvalUtils

2017-11-13 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-15739:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Mark!

> Incorrect exception message in PartExprEvalUtils
> 
>
> Key: HIVE-15739
> URL: https://issues.apache.org/jira/browse/HIVE-15739
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Mark Wagner
>Assignee: Mark Wagner
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-15739.1.patch.txt
>
>
> The check is on partSpec, not partProps:
> {noformat}
> if (partSpec.size() != partKeyTypes.length) {
> throw new HiveException("Internal error : Partition Spec size, " + 
> partProps.size() +
> " doesn't match partition key definition size, " + 
> partKeyTypes.length);
> }
> {noformat}
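
For illustration, the check with a message that reports the size of the collection actually being tested could be sketched as follows. This is a self-contained approximation, not the attached patch: the class is hypothetical and `IllegalStateException` is swapped in for `HiveException` so the example compiles on its own.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; IllegalStateException stands in for HiveException.
final class PartSpecCheckSketch {

  // The message reports partSpec.size(), the same value the condition tests.
  static void checkSizes(Map<String, String> partSpec, String[] partKeyTypes) {
    if (partSpec.size() != partKeyTypes.length) {
      throw new IllegalStateException("Internal error : Partition Spec size, "
          + partSpec.size()
          + " doesn't match partition key definition size, "
          + partKeyTypes.length);
    }
  }

  public static void main(String[] args) {
    Map<String, String> partSpec = new LinkedHashMap<>();
    partSpec.put("ds", "2017-11-13");
    try {
      checkSizes(partSpec, new String[] {"string", "int"});
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```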




[jira] [Commented] (HIVE-15883) HBase mapped table in Hive insert fail for decimal

2017-11-13 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250871#comment-16250871
 ] 

Ashutosh Chauhan commented on HIVE-15883:
-

+1

> HBase mapped table in Hive insert fail for decimal
> --
>
> Key: HIVE-15883
> URL: https://issues.apache.org/jira/browse/HIVE-15883
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-15883.patch
>
>
> CREATE TABLE hbase_table (
> id int,
> balance decimal(15,2))
> ROW FORMAT DELIMITED
> COLLECTION ITEMS TERMINATED BY '~'
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES (
> "hbase.columns.mapping"=":key,cf:balance#b");
> insert into hbase_table values (1,1);
> 
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"tmp_values_col1":"1","tmp_values_col2":"1"}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1783)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row {"tmp_values_col1":"1","tmp_values_col2":"1"}
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
> ... 8 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.serde2.SerDeException: java.lang.RuntimeException: 
> Hive internal error.
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:733)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:97)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
> ... 9 more
> Caused by: org.apache.hadoop.hive.serde2.SerDeException: 
> java.lang.RuntimeException: Hive internal error.
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.serialize(HBaseSerDe.java:286)
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:668)
> ... 15 more
> Caused by: java.lang.RuntimeException: Hive internal error.
> at 
> org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitive(LazyUtils.java:328)
> at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serialize(HBaseRowSerializer.java:220)
> at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serializeField(HBaseRowSerializer.java:194)
> at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serialize(HBaseRowSerializer.java:118)
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.serialize(HBaseSerDe.java:282)
> ... 16 more 




[jira] [Updated] (HIVE-17964) HoS: some spark configs doesn't require re-creating a session

2017-11-13 Thread Rui Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-17964:
--
Attachment: HIVE-17964.2.patch

Update to fix tests. There are some issues with {{spark_job_max_tasks.q}} and 
{{spark_stage_max_tasks.q}}: since we don't check the number of tasks when the 
job first reaches the RUNNING state, it's possible that the check is bypassed 
if the job finishes very quickly. Therefore I use a dummy script that does 
nothing but sleep, so that we make sure the check is enforced.

> HoS: some spark configs doesn't require re-creating a session
> -
>
> Key: HIVE-17964
> URL: https://issues.apache.org/jira/browse/HIVE-17964
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Minor
> Attachments: HIVE-17964.1.patch, HIVE-17964.2.patch
>
>
> I guess the {{hive.spark.}} configs were initially intended for the RSC. 
> Therefore when they're changed, we'll re-create the session for them to take 
> effect. There're some configs not related to RSC that also start with 
> {{hive.spark.}}. We'd better rename them so that we don't unnecessarily 
> re-create sessions, which is usually time consuming.
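
A small sketch of the prefix rule described above, under stated assumptions: the class, the method and both config names in the example are hypothetical, and the only behaviour modelled is "a changed key starting with {{hive.spark.}} forces a new session".

```java
// Hypothetical sketch of the prefix-based decision; not Hive's actual code.
final class SessionRestartSketch {
  static final String RSC_PREFIX = "hive.spark.";

  // A changed config forces session re-creation only if it carries the prefix.
  static boolean requiresNewSession(String changedKey) {
    return changedKey != null && changedKey.startsWith(RSC_PREFIX);
  }

  public static void main(String[] args) {
    // Made-up names: the same setting before and after a rename.
    System.out.println(requiresNewSession("hive.spark.example.non.rsc.setting"));
    System.out.println(requiresNewSession("hive.example.non.rsc.setting"));
  }
}
```

Under this rule, renaming a config that is unrelated to the RSC so that it no longer starts with the prefix is enough to stop it from triggering a session re-creation.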




[jira] [Comment Edited] (HIVE-14497) Fine control for using materialized views in rewriting

2017-11-13 Thread Lefty Leverenz (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15756700#comment-15756700
 ] 

Lefty Leverenz edited comment on HIVE-14497 at 11/14/17 5:04 AM:
-

Doc note:  This needs to be documented with a new section in the DDL wikidoc, 
perhaps after Create/Drop/Alter View.

* [Hive DDL | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-HiveDataDefinitionLanguage]

Added a TODOC2.2 label.

Update 14/Nov/17:  Changed the label to TODOC2.3.


was (Author: le...@hortonworks.com):
Doc note:  This needs to be documented with a new section in the DDL wikidoc, 
perhaps after Create/Drop/Alter View.

* [Hive DDL | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-HiveDataDefinitionLanguage]

Added a TODOC2.2 label.

> Fine control for using materialized views in rewriting
> --
>
> Key: HIVE-14497
> URL: https://issues.apache.org/jira/browse/HIVE-14497
> Project: Hive
>  Issue Type: Sub-task
>  Components: Materialized views
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>  Labels: TODOC2.3
> Fix For: 2.3.0
>
>
> Follow-up of HIVE-14495. Since the number of materialized views in the system 
> might grow very large, and query rewriting using materialized views might be 
> very expensive, we need to include a mechanism to enable/disable materialized 
> views for query rewriting.
> Thus, we should extend the CREATE MATERIALIZED VIEW statement as follows:
> {code:sql}
> CREATE MATERIALIZED VIEW [IF NOT EXISTS] [db_name.]materialized_view_name
>   [BUILD DEFERRED]
>   [ENABLE REWRITE] -- NEW!
>   [COMMENT materialized_view_comment]
>   [
>[ROW FORMAT row_format] 
>[STORED AS file_format]
>  | STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)]
>   ]
>   [LOCATION hdfs_path]
>   [TBLPROPERTIES (property_name=property_value, ...)]
>   AS select_statement;
> {code}




[jira] [Updated] (HIVE-14497) Fine control for using materialized views in rewriting

2017-11-13 Thread Lefty Leverenz (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-14497:
--
Labels: TODOC2.3  (was: TODOC2.2)

> Fine control for using materialized views in rewriting
> --
>
> Key: HIVE-14497
> URL: https://issues.apache.org/jira/browse/HIVE-14497
> Project: Hive
>  Issue Type: Sub-task
>  Components: Materialized views
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>  Labels: TODOC2.3
> Fix For: 2.3.0
>
>
> Follow-up of HIVE-14495. Since the number of materialized views in the system 
> might grow very large, and query rewriting using materialized views might be 
> very expensive, we need to include a mechanism to enable/disable materialized 
> views for query rewriting.
> Thus, we should extend the CREATE MATERIALIZED VIEW statement as follows:
> {code:sql}
> CREATE MATERIALIZED VIEW [IF NOT EXISTS] [db_name.]materialized_view_name
>   [BUILD DEFERRED]
>   [ENABLE REWRITE] -- NEW!
>   [COMMENT materialized_view_comment]
>   [
>[ROW FORMAT row_format] 
>[STORED AS file_format]
>  | STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)]
>   ]
>   [LOCATION hdfs_path]
>   [TBLPROPERTIES (property_name=property_value, ...)]
>   AS select_statement;
> {code}




[jira] [Commented] (HIVE-18054) Make Lineage work with concurrent queries on a Session

2017-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250855#comment-16250855
 ] 

Hive QA commented on HIVE-18054:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12897400/HIVE-18054.1.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 71 failed/errored test(s), 11384 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[ctas_blobstore_to_blobstore] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[ctas_hdfs_to_blobstore] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_blobstore_to_blobstore] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_empty_into_blobstore] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_table] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_directory] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_table] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[orc_format_nonpart] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[orc_format_part] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[orc_nonstd_partitions_loc] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[rcfile_format_nonpart] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[rcfile_format_part] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[rcfile_nonstd_partitions_loc] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[zero_rows_blobstore] (batchId=246)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_index] (batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] (batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[database_drop] (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dbtxnmgr_showlocks] (batchId=77)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[drop_table_with_index] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_auth] (batchId=85)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_auto] (batchId=45)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_auto_empty] (batchId=57)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_auto_file_format] (batchId=54)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_auto_mult_tables] (batchId=84)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_auto_mult_tables_compact] (batchId=35)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_auto_multiple] (batchId=34)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_auto_partitioned] (batchId=11)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_auto_self_join] (batchId=58)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_auto_unused] (batchId=38)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_auto_update] (batchId=72)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_bitmap1] (batchId=41)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_bitmap2] (batchId=84)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_bitmap] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_bitmap_auto_partitioned] (batchId=29)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_bitmap_compression] (batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_bitmap_rc] (batchId=51)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_compact] (batchId=20)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_compact_1] (batchId=35)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_compact_2] (batchId=25)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_compact_3] (batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_compact_binary_search] (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_compression] (batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_creation] (batchId=86)

[jira] [Commented] (HIVE-17809) Implement per pool trigger validation and move sessions across pools

2017-11-13 Thread Lefty Leverenz (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250849#comment-16250849
 ] 

Lefty Leverenz commented on HIVE-17809:
---

Should this be documented in the wiki?

> Implement per pool trigger validation and move sessions across pools
> 
>
> Key: HIVE-17809
> URL: https://issues.apache.org/jira/browse/HIVE-17809
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 3.0.0
>
> Attachments: HIVE-17809.1.patch, HIVE-17809.2.patch, 
> HIVE-17809.3.patch, HIVE-17809.4.patch, HIVE-17809.5.patch, HIVE-17809.6.patch
>
>
> In HIVE-17508, trigger validation is applied to all pools at once. This is a 
> follow-up to implement trigger validation at the per-pool level. 
> It should also implement resolution for multiple applicable actions, as per 
> the RB discussion.




[jira] [Commented] (HIVE-15491) Failures are masked/swallowed in GenericUDTFJSONTuple::process().

2017-11-13 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250824#comment-16250824
 ] 

Ashutosh Chauhan commented on HIVE-15491:
-

+1

> Failures are masked/swallowed in GenericUDTFJSONTuple::process().
> -
>
> Key: HIVE-15491
> URL: https://issues.apache.org/jira/browse/HIVE-15491
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-15491.patch
>
>
> I draw your attention to the following piece of code in 
> {{GenericUDTFJSONTuple::process()}}:
> {code:java}
>   @Override
>   public void process(Object[] o) throws HiveException {
>   ...
>       for (int i = 0; i < numCols; ++i) {
>         if (retCols[i] == null) {
>           retCols[i] = cols[i]; // use the object pool rather than creating a new object
>         }
>         Object extractObject = ((Map)jsonObj).get(paths[i]);
>         if (extractObject instanceof Map || extractObject instanceof List) {
>           retCols[i].set(MAPPER.writeValueAsString(extractObject));
>         } else if (extractObject != null) {
>           retCols[i].set(extractObject.toString());
>         } else {
>           retCols[i] = null;
>         }
>       }
>       forward(retCols);
>       return;
>     } catch (Throwable e) {  <= Yikes.
>       LOG.error("JSON parsing/evaluation exception" + e);
>       forward(nullCols);
>     }
>   }
> {code}
> The error-handling here seems suspect. Judging from the error message, the 
> intention here seems to be to catch JSON-specific errors arising from 
> {{MAPPER.readValue()}} and {{MAPPER.writeValueAsString()}}. By catching 
> {{Throwable}}, this code masks the errors that arise from the call to 
> {{forward(retCols)}}.
> I just ran into this in production. A user with a nearly exhausted HDFS quota 
> attempted to use {{json_tuple}} to extract fields from json strings in his 
> data. The data turned out to have large record counts and the query used over 
> 25K mappers. Every task failed to create a {{RecordWriter}}, thanks to the 
> exhausted quota. But the thrown exception was swallowed in the code above. 
> {{process()}} ignored the failure for the record and proceeded to the next 
> one. Eventually, this resulted in DDoS-ing the name-node.
> I'll have a patch for this shortly.
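
The problem described above can be sketched in isolation. The following is a minimal illustration, not the attached patch: it assumes the remedy is to catch only the parse-specific exception and to keep the forwarding call outside the try block. The class, the toy `parse` method, and `ParseException` are hypothetical stand-ins; no Hive or Jackson classes are used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class NarrowCatchSketch {
  /** Hypothetical stand-in for a JSON-specific parse failure. */
  static final class ParseException extends Exception {
    ParseException(String message) {
      super(message);
    }
  }

  // Toy "parser": accepts only strings that look like a JSON object.
  static String parse(String json) throws ParseException {
    if (json == null || !json.startsWith("{") || !json.endsWith("}")) {
      throw new ParseException("not a JSON object: " + json);
    }
    return json;
  }

  static void process(String json, Consumer<String> forward) {
    String parsed;
    try {
      parsed = parse(json);
    } catch (ParseException e) {
      // Only the parse failure is mapped to a null row.
      parsed = null;
    }
    // forward sits outside the try block, so its failures reach the caller.
    forward.accept(parsed);
  }

  public static void main(String[] args) {
    List<String> rows = new ArrayList<>();
    process("{\"a\":1}", rows::add);
    process("garbage", rows::add);
    System.out.println(rows);
    try {
      process("{}", value -> {
        throw new IllegalStateException("quota exceeded");
      });
    } catch (IllegalStateException e) {
      System.out.println("propagated: " + e.getMessage());
    }
  }
}
```

With this shape, a malformed record still yields a null row, while a failure in the downstream writer stops the task instead of being logged and ignored for every record.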




[jira] [Commented] (HIVE-18050) LlapServiceDriver shoud split HIVE_AUX_JARS_PATH by ':' instead of ','

2017-11-13 Thread Aegeaner (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250825#comment-16250825
 ] 

Aegeaner commented on HIVE-18050:
-

That's right; addAuxJarsToSet should be called with different delimiters.

> LlapServiceDriver shoud split HIVE_AUX_JARS_PATH by ':' instead of ','
> --
>
> Key: HIVE-18050
> URL: https://issues.apache.org/jira/browse/HIVE-18050
> Project: Hive
>  Issue Type: Bug
>  Components: CLI, Clients
>Affects Versions: 2.3.0
>Reporter: Aegeaner
>Assignee: Aegeaner
>  Labels: pull-request-available
>
> LlapServiceDriver should split HIVE_AUX_JARS_PATH by ':' instead of ',', since 
> the hive script has already replaced the commas in this environment variable 
> with colons:
> {code:java}
> elif [ "${HIVE_AUX_JARS_PATH}" != "" ]; then 
>   HIVE_AUX_JARS_PATH=`echo $HIVE_AUX_JARS_PATH | sed 's/,/:/g'`
>   if $cygwin; then
>   HIVE_AUX_JARS_PATH=`cygpath -p -w "$HIVE_AUX_JARS_PATH"`
>   HIVE_AUX_JARS_PATH=`echo $HIVE_AUX_JARS_PATH | sed 's/;/,/g'`
>   fi
>   AUX_CLASSPATH=${AUX_CLASSPATH}:${HIVE_AUX_JARS_PATH}
>   AUX_PARAM="file://$(echo ${HIVE_AUX_JARS_PATH} | sed 's/:/,file:\/\//g')"
> fi
> {code}
> But in the LLAP Service Driver, it's processed as:
> {code:java}
>   private void addAuxJarsToSet(HashSet auxJarSet, String auxJars) {
>     if (auxJars != null && !auxJars.isEmpty()) {
>       // TODO: transitive dependencies warning?
>       String[] jarPaths = auxJars.split(",");
>       for (String jarPath : jarPaths) {
>         if (!jarPath.isEmpty()) {
>           auxJarSet.add(jarPath);
>         }
>       }
>     }
>   }
> {code}
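
A minimal sketch of the approach mentioned in the comment above, assuming the caller passes in the delimiter. The class name `AuxJarPaths` and the three-argument signature are hypothetical and are not taken from the LlapServiceDriver source or the pull request.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Pattern;

final class AuxJarPaths {
  // Hypothetical variant of addAuxJarsToSet that takes the delimiter as a parameter.
  static void addAuxJarsToSet(Set<String> auxJarSet, String auxJars, String delimiter) {
    if (auxJars == null || auxJars.isEmpty()) {
      return;
    }
    // Pattern.quote keeps the delimiter literal, since String.split expects a regex.
    for (String jarPath : auxJars.split(Pattern.quote(delimiter))) {
      if (!jarPath.isEmpty()) {
        auxJarSet.add(jarPath);
      }
    }
  }

  public static void main(String[] args) {
    Set<String> jars = new LinkedHashSet<>();
    // Value as rewritten by the hive script (colon-separated).
    addAuxJarsToSet(jars, "/opt/aux/a.jar:/opt/aux/b.jar", ":");
    // Value as it might come from configuration (comma-separated).
    addAuxJarsToSet(jars, "/opt/aux/c.jar,,/opt/aux/a.jar", ",");
    System.out.println(jars);
  }
}
```

Note that splitting on ':' would also break entries that carry a URI scheme such as file:// or hdfs://, so this sketch assumes plain local paths.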




[jira] [Commented] (HIVE-15504) ArrayIndexOutOfBoundsException in GenericUDFTrunc::initialize

2017-11-13 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250822#comment-16250822
 ] 

Ashutosh Chauhan commented on HIVE-15504:
-

+1

> ArrayIndexOutOfBoundsException in GenericUDFTrunc::initialize
> -
>
> Key: HIVE-15504
> URL: https://issues.apache.org/jira/browse/HIVE-15504
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Rajesh Balamohan
>Priority: Trivial
> Attachments: HIVE-15504.1.patch
>
>
> SELECT TRUNC(d_date) FROM test_date_dim throws ArrayIndexOutOfBounds 
> exception.
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException: 1
> at org.apache.hadoop.hive.ql.udf.generic.GenericUDFTrunc.initialize(GenericUDFTrunc.java:128)
> at org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:139)
> at org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:236)
> at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:1102)
> at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1357)
> at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
> at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
> at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
> at org.apache.hadoop.hive.ql.lib.ExpressionWalker.walk(ExpressionWalker.java:76)
> at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
> at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:227)
> {noformat}
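
The trace shows index 1 being read when the call supplies a single argument. One way to avoid that, sketched here under the assumption that the function needs exactly two arguments, is to validate the argument count before indexing. This is not the attached patch, which may handle the case differently; the class and method names are hypothetical and a standard exception stands in for Hive's own.

```java
final class TruncArgsSketch {
  // Hypothetical stand-in for the argument handling in a UDF's initialize().
  static String formatArgument(String[] arguments) {
    if (arguments.length != 2) {
      // Fail with a descriptive message instead of an ArrayIndexOutOfBoundsException.
      throw new IllegalArgumentException(
          "trunc() requires 2 arguments, got " + arguments.length);
    }
    return arguments[1];
  }

  public static void main(String[] args) {
    System.out.println(formatArgument(new String[] {"d_date", "MM"}));
    try {
      formatArgument(new String[] {"d_date"});
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```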




[jira] [Comment Edited] (HIVE-14878) integrate MM tables into ACID: add separate ACID type

2017-11-13 Thread Lefty Leverenz (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591168#comment-15591168
 ] 

Lefty Leverenz edited comment on HIVE-14878 at 11/14/17 4:21 AM:
-

How shall we track doc issues for this branch?  Some unnumbered branches have 
their own TODOC label and others have a separate JIRA issue for documentation.

Anyway, this should be documented when the branch gets merged into master.

* new table property "transactional_properties"="insert_only"
* new description for configuration property *hive.txn.operational.properties*

Edit 4/Nov/17:  Actually only the table property should be documented, because 
*hive.txn.operational.properties* is for internal use only (see HIVE-14035 
comments).

Edit 13/Nov/17:  Adding a TODOC3.0 label for the new table property 
"transactional_properties"="insert_only" because this was merged to master by 
HIVE-15212.

* [DDL -- table properties | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-listTableProperties]


was (Author: le...@hortonworks.com):
How shall we track doc issues for this branch?  Some unnumbered branches have 
their own TODOC label and others have a separate JIRA issue for documentation.

Anyway, this should be documented when the branch gets merged into master.

* new table property "transactional_properties"="insert_only"
* new description for configuration property *hive.txn.operational.properties*

Edit 4/Nov/17:  Actually only the table property should be documented, because 
*hive.txn.operational.properties* is for internal use only (see HIVE-14035 
comments).

Edit 13/Nov/17:  Adding a TODOC3.0 label for the new table property 
"transactional_properties"="insert_only" because this was merged to master by 
HIVE-15212.

> integrate MM tables into ACID: add separate ACID type
> -
>
> Key: HIVE-14878
> URL: https://issues.apache.org/jira/browse/HIVE-14878
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Wei Zheng
>  Labels: TODOC3.0
> Fix For: hive-14535
>
> Attachments: HIVE-14878.1.patch
>
>





[jira] [Updated] (HIVE-14878) integrate MM tables into ACID: add separate ACID type

2017-11-13 Thread Lefty Leverenz (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-14878:
--
Labels: TODOC3.0  (was: )

> integrate MM tables into ACID: add separate ACID type
> -
>
> Key: HIVE-14878
> URL: https://issues.apache.org/jira/browse/HIVE-14878
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Wei Zheng
>  Labels: TODOC3.0
> Fix For: hive-14535
>
> Attachments: HIVE-14878.1.patch
>
>





[jira] [Comment Edited] (HIVE-14878) integrate MM tables into ACID: add separate ACID type

2017-11-13 Thread Lefty Leverenz (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591168#comment-15591168
 ] 

Lefty Leverenz edited comment on HIVE-14878 at 11/14/17 4:18 AM:
-

How shall we track doc issues for this branch?  Some unnumbered branches have 
their own TODOC label and others have a separate JIRA issue for documentation.

Anyway, this should be documented when the branch gets merged into master.

* new table property "transactional_properties"="insert_only"
* new description for configuration property *hive.txn.operational.properties*

Edit 4/Nov/17:  Actually only the table property should be documented, because 
*hive.txn.operational.properties* is for internal use only (see HIVE-14035 
comments).

Edit 13/Nov/17:  Adding a TODOC3.0 label for the new table property 
"transactional_properties"="insert_only" because this was merged to master by 
HIVE-15212.


was (Author: le...@hortonworks.com):
How shall we track doc issues for this branch?  Some unnumbered branches have 
their own TODOC label and others have a separate JIRA issue for documentation.

Anyway, this should be documented when the branch gets merged into master.

* new table property "transactional_properties"="insert_only"
* new description for configuration property *hive.txn.operational.properties*

Edit 4/Nov/17:  Actually only the table property should be documented, because 
*hive.txn.operational.properties* is for internal use only (see HIVE-14035 
comments).

> integrate MM tables into ACID: add separate ACID type
> -
>
> Key: HIVE-14878
> URL: https://issues.apache.org/jira/browse/HIVE-14878
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Wei Zheng
>  Labels: TODOC3.0
> Fix For: hive-14535
>
> Attachments: HIVE-14878.1.patch
>
>





[jira] [Issue Comment Deleted] (HIVE-14878) integrate MM tables into ACID: add separate ACID type

2017-11-13 Thread Lefty Leverenz (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-14878:
--
Comment: was deleted

(was: No doc needed:  This changes the description of 
*hive.txn.operational.properties* but it doesn't need to be documented because 
it's for internal use only.  (See HIVE-14035 comments, 21-22 Aug. 2016.)

HIVE-17458 changes the description again.)

> integrate MM tables into ACID: add separate ACID type
> -
>
> Key: HIVE-14878
> URL: https://issues.apache.org/jira/browse/HIVE-14878
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Wei Zheng
> Fix For: hive-14535
>
> Attachments: HIVE-14878.1.patch
>
>





[jira] [Commented] (HIVE-15739) Incorrect exception message in PartExprEvalUtils

2017-11-13 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250812#comment-16250812
 ] 

Ashutosh Chauhan commented on HIVE-15739:
-

+1

> Incorrect exception message in PartExprEvalUtils
> 
>
> Key: HIVE-15739
> URL: https://issues.apache.org/jira/browse/HIVE-15739
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Mark Wagner
>Assignee: Mark Wagner
>Priority: Minor
> Attachments: HIVE-15739.1.patch.txt
>
>
> The check is on partSpec, not partProps:
> {noformat}
> if (partSpec.size() != partKeyTypes.length) {
>   throw new HiveException("Internal error : Partition Spec size, " + partProps.size() +
>       " doesn't match partition key definition size, " + partKeyTypes.length);
> }
> {noformat}
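
For comparison, a minimal sketch of the check with the message reporting the collection that is actually compared. It assumes `partSpec` is a map and `partKeyTypes` an array, and uses a standard exception in place of HiveException so that it compiles on its own; the class and method names are hypothetical and this is not the attached patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PartSpecCheckSketch {
  static void checkPartSpec(Map<String, String> partSpec, String[] partKeyTypes) {
    if (partSpec.size() != partKeyTypes.length) {
      // Report the size of the collection that was actually compared.
      throw new IllegalStateException("Internal error : Partition Spec size, " + partSpec.size()
          + " doesn't match partition key definition size, " + partKeyTypes.length);
    }
  }

  public static void main(String[] args) {
    Map<String, String> partSpec = new LinkedHashMap<>();
    partSpec.put("ds", "2017-11-13");
    try {
      checkPartSpec(partSpec, new String[] {"string", "int"});
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```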




[jira] [Commented] (HIVE-17904) handle internal Tez AM restart in registry and WM

2017-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250783#comment-16250783
 ] 

Hive QA commented on HIVE-17904:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12897421/HIVE-17904.05.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 11383 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dbtxnmgr_showlocks] (batchId=77)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[ppd_union_view] (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=156)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc] (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] (batchId=111)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=206)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testApplyPlanQpChanges (batchId=281)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7797/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7797/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7797/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12897421 - PreCommit-HIVE-Build

> handle internal Tez AM restart in registry and WM
> -
>
> Key: HIVE-17904
> URL: https://issues.apache.org/jira/browse/HIVE-17904
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17904.01.patch, HIVE-17904.02.patch, 
> HIVE-17904.03.patch, HIVE-17904.04.patch, HIVE-17904.05.patch, 
> HIVE-17904.patch, HIVE-17904.patch
>
>
> To be done after the plan update patch is committed. The current code doesn't 
> account very well for an internal Tez AM restart; the registry may have races, 
> and an event needs to be added to WM when an AM resets, at least to make sure 
> we discard the update errors that pertain to the old AM. 




[jira] [Commented] (HIVE-14535) add insert-only ACID tables to Hive

2017-11-13 Thread Lefty Leverenz (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250768#comment-16250768
 ] 

Lefty Leverenz commented on HIVE-14535:
---

Doc note:  HIVE-15212 merged branch-14535 to master for release 3.0.0, so 
general documentation for this feature is needed in the wiki.

These configuration properties were added or changed by the merge:

* *hive.mm.avoid.s3.globstatus* (HIVE-14953) -- new config
* *hive.exim.test.mode* (HIVE-15019) -- new config
* *hive.txn.operational.properties* (HIVE-14878) -- description changed; 
internal so no doc needed

Added a TODOC3.0 label.

> add insert-only ACID tables to Hive 
> 
>
> Key: HIVE-14535
> URL: https://issues.apache.org/jira/browse/HIVE-14535
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC3.0
>
> Design doc: 
> https://docs.google.com/document/d/1b3t1RywfyRb73-cdvkEzJUyOiekWwkMHdiQ-42zCllY
> Feel free to comment.
> Update: we ended up going with a sequence-number-based implementation.
> Update #2: this feature has been partially merged with ACID; the new table 
> type is insert_only ACID. It differs from regular ACID in that it supports 
> only inserts, but it has no restrictions on file format or table type 
> (bucketing) and far fewer restrictions on other operations (export/import, 
> list bucketing, etc.).
> Currently, some features that used to work when it was separate are not 
> integrated properly; integrating these features is the remaining work in 
> this JIRA.




[jira] [Commented] (HIVE-15019) handle import for MM tables

2017-11-13 Thread Lefty Leverenz (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250764#comment-16250764
 ] 

Lefty Leverenz commented on HIVE-15019:
---

Doc note:  This adds *hive.exim.test.mode* to HiveConf.java and branch-14535 
has been merged to master for release 3.0.0 by HIVE-15212, so the wiki needs to 
be updated.

Although most test configs aren't documented, this one is different because it 
doesn't begin with "hive.test" and so wouldn't show up in a simple search.  
Therefore I recommend including it in the Test Properties section of 
Configuration Properties, perhaps with a cross-reference from the Transactions 
section (or a new subsection, if one is added for 
*hive.mm.avoid.s3.globstatus*).

* [Configuration Properties -- Test Properties | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-TestProperties]
* [Configuration Properties -- Transactions and Compactor | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-TransactionsandCompactor]

Added a TODOC3.0 label.

> handle import for MM tables
> ---
>
> Key: HIVE-15019
> URL: https://issues.apache.org/jira/browse/HIVE-15019
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC3.0
> Fix For: hive-14535
>
> Attachments: HIVE-15019.WIP.patch, HIVE-15019.patch
>
>





[jira] [Updated] (HIVE-15019) handle import for MM tables

2017-11-13 Thread Lefty Leverenz (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-15019:
--
Labels: TODOC3.0  (was: )

> handle import for MM tables
> ---
>
> Key: HIVE-15019
> URL: https://issues.apache.org/jira/browse/HIVE-15019
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC3.0
> Fix For: hive-14535
>
> Attachments: HIVE-15019.WIP.patch, HIVE-15019.patch
>
>





[jira] [Comment Edited] (HIVE-15212) merge branch into master

2017-11-13 Thread Lefty Leverenz (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248816#comment-16248816
 ] 

Lefty Leverenz edited comment on HIVE-15212 at 11/14/17 3:10 AM:
-

Okay, thanks Sergey.

So far I've only found one configuration parameter added to master by this 
merge (*hive.mm.avoid.s3.globstatus* in HIVE-14953) but there may be a few more.

Update 13/Nov/17:  The merge also added *hive.exim.test.mode* (HIVE-15019) and 
changed the description of *hive.txn.operational.properties* (HIVE-14878) but 
the latter is internal and so doesn't need to be documented.  Most test configs 
aren't documented but perhaps *hive.exim.test.mode* should be because it 
wouldn't show up in a search for "hive.test.*" configs.


was (Author: le...@hortonworks.com):
Okay, thanks Sergey.

So far I've only found one configuration parameter added to master by this 
merge (*hive.mm.avoid.s3.globstatus* in HIVE-14953) but there may be a few more.

> merge branch into master
> 
>
> Key: HIVE-15212
> URL: https://issues.apache.org/jira/browse/HIVE-15212
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 3.0.0
>
> Attachments: HIVE-15212.01.patch, HIVE-15212.02.patch, 
> HIVE-15212.03.patch, HIVE-15212.04.patch, HIVE-15212.05.patch, 
> HIVE-15212.06.patch, HIVE-15212.07.patch, HIVE-15212.08.patch, 
> HIVE-15212.09.patch, HIVE-15212.10.patch, HIVE-15212.11.patch, 
> HIVE-15212.12.patch, HIVE-15212.12.patch, HIVE-15212.13.patch, 
> HIVE-15212.13.patch, HIVE-15212.14.patch, HIVE-15212.15.patch, 
> HIVE-15212.16.patch, HIVE-15212.17.patch, HIVE-15212.18.patch, 
> HIVE-15212.19.patch, HIVE-15212.20.patch, HIVE-15212.21.patch
>
>





[jira] [Commented] (HIVE-14953) don't use globStatus on S3 in MM tables

2017-11-13 Thread Lefty Leverenz (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250738#comment-16250738
 ] 

Lefty Leverenz commented on HIVE-14953:
---

Doc note:  This adds *hive.mm.avoid.s3.globstatus* to HiveConf.java and 
branch-14535 has been merged to master for release 3.0.0 by HIVE-15212, so the 
wiki needs to be updated.

I'm not sure where *hive.mm.avoid.s3.globstatus* belongs in Configuration 
Properties.  Perhaps the Transactions section should have a subsection, 
although so far this is the only new parameter that needs to be documented.

* [Configuration Properties -- Transactions and Compactor | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-TransactionsandCompactor]

Added a TODOC3.0 label.

> don't use globStatus on S3 in MM tables
> ---
>
> Key: HIVE-14953
> URL: https://issues.apache.org/jira/browse/HIVE-14953
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Rajesh Balamohan
>Assignee: Sergey Shelukhin
>  Labels: TODOC3.0
> Fix For: hive-14535
>
> Attachments: HIVE-14953.01.patch, HIVE-14953.patch
>
>
> Need to investigate if recursive get is faster. Also, normal listStatus might 
> suffice because MM code handles directory structure in a more definite manner 
> than old code; so it knows where the files of interest are to be found.




[jira] [Updated] (HIVE-14953) don't use globStatus on S3 in MM tables

2017-11-13 Thread Lefty Leverenz (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-14953:
--
Labels: TODOC3.0  (was: )

> don't use globStatus on S3 in MM tables
> ---
>
> Key: HIVE-14953
> URL: https://issues.apache.org/jira/browse/HIVE-14953
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Rajesh Balamohan
>Assignee: Sergey Shelukhin
>  Labels: TODOC3.0
> Fix For: hive-14535
>
> Attachments: HIVE-14953.01.patch, HIVE-14953.patch
>
>
> Need to investigate if recursive get is faster. Also, normal listStatus might 
> suffice because MM code handles directory structure in a more definite manner 
> than old code; so it knows where the files of interest are to be found.




[jira] [Commented] (HIVE-14878) integrate MM tables into ACID: add separate ACID type

2017-11-13 Thread Lefty Leverenz (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250717#comment-16250717
 ] 

Lefty Leverenz commented on HIVE-14878:
---

No doc needed:  This changes the description of 
*hive.txn.operational.properties* but it doesn't need to be documented because 
it's for internal use only.  (See HIVE-14035 comments, 21-22 Aug. 2016.)

HIVE-17458 changes the description again.

> integrate MM tables into ACID: add separate ACID type
> -
>
> Key: HIVE-14878
> URL: https://issues.apache.org/jira/browse/HIVE-14878
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Wei Zheng
> Fix For: hive-14535
>
> Attachments: HIVE-14878.1.patch
>
>





[jira] [Comment Edited] (HIVE-14035) Enable predicate pushdown to delta files created by ACID Transactions

2017-11-13 Thread Lefty Leverenz (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419834#comment-15419834
 ] 

Lefty Leverenz edited comment on HIVE-14035 at 11/14/17 2:48 AM:
-

Doc note:  Besides the design document, which should be added to the wiki, 
there is a new configuration parameter (*hive.txn.operational.properties*) that 
will need to be documented in the wiki for release 2.2.0.

* [Configuration Properties -- Transactions and Compactor | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-TransactionsandCompactor]

Added a TODOC2.2 label.

Update 13/Nov/17:  *hive.txn.operational.properties* does not need to be 
documented.  Its description is changed by HIVE-14878 and HIVE-17458.


was (Author: le...@hortonworks.com):
Doc note:  Besides the design document, which should be added to the wiki, 
there is a new configuration parameter (*hive.txn.operational.properties*) that 
will need to be documented in the wiki for release 2.2.0.

* [Configuration Properties -- Transactions and Compactor | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-TransactionsandCompactor]

Added a TODOC2.2 label.

> Enable predicate pushdown to delta files created by ACID Transactions
> -
>
> Key: HIVE-14035
> URL: https://issues.apache.org/jira/browse/HIVE-14035
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Saket Saurabh
>Assignee: Saket Saurabh
>  Labels: TODOC2.2
> Fix For: 2.3.0
>
> Attachments: Design.Document.Improving ACID performance in 
> Hive.01.docx, Design.Document.Improving ACID performance in Hive.02.docx, 
> HIVE-14035.02.patch, HIVE-14035.03.patch, HIVE-14035.04.patch, 
> HIVE-14035.05.patch, HIVE-14035.06.patch, HIVE-14035.07.patch, 
> HIVE-14035.08.patch, HIVE-14035.09.patch, HIVE-14035.10.patch, 
> HIVE-14035.11.patch, HIVE-14035.12.patch, HIVE-14035.13.patch, 
> HIVE-14035.14.patch, HIVE-14035.15.patch, HIVE-14035.16.patch, 
> HIVE-14035.17.patch, HIVE-14035.patch
>
>
> In the current Hive version, delta files created by ACID transactions do not 
> allow predicate pushdown if they contain any update/delete events. This is 
> done to preserve correctness when following a multi-version approach during 
> event collapsing, where an update event overwrites an existing insert event. 
> This JIRA proposes to split an update event into a delete event followed by 
> a new insert event, which enables predicate pushdown to all delta files 
> without breaking correctness. To support backward compatibility for this 
> feature, this JIRA also proposes to add some form of versioning to ACID so 
> that different versions of ACID transactions can coexist.
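
The correctness argument in the description above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the class and method names (SplitUpdateSketch, legacyReadWithPushdown, splitReadWithPushdown) are hypothetical, events are reduced to a row id plus one int column, and none of this is Hive's actual ACID reader code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.IntPredicate;

// Toy model (assumption): events are (rowId, value) pairs, not Hive's real ACID structures.
final class SplitUpdateSketch {

  // Legacy layout: an update event reuses the row id of the insert it overwrites.
  // If the predicate is applied before events are collapsed, a filtered-out update
  // lets the stale insert survive, which is why pushdown had to be disabled.
  static List<Integer> legacyReadWithPushdown(long[] rowIds, int[] values, IntPredicate pred) {
    Map<Long, Integer> latest = new LinkedHashMap<>();
    for (int i = 0; i < rowIds.length; i++) {
      if (pred.test(values[i])) {
        latest.put(rowIds[i], values[i]); // a later event overwrites an earlier one
      }
    }
    return new ArrayList<>(latest.values());
  }

  // Split-update layout: the update became delete(oldRowId) + insert(newRowId, newValue).
  // Inserts can be filtered independently; deletes are then applied by row id.
  static List<Integer> splitReadWithPushdown(long[] insertIds, int[] insertValues,
                                             Set<Long> deletedIds, IntPredicate pred) {
    List<Integer> out = new ArrayList<>();
    for (int i = 0; i < insertIds.length; i++) {
      if (pred.test(insertValues[i]) && !deletedIds.contains(insertIds[i])) {
        out.add(insertValues[i]);
      }
    }
    return out;
  }
}
```

For a row inserted with value 5 and later updated to 50, a query with the predicate value < 10 should return nothing. In this toy model the legacy path returns the stale 5, while the split path returns an empty result.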




[jira] [Updated] (HIVE-17882) Resource plan retrieval looks incorrect

2017-11-13 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17882:
-
Attachment: HIVE-17882.2.patch

Uploading committed patch

> Resource plan retrieval looks incorrect
> ---
>
> Key: HIVE-17882
> URL: https://issues.apache.org/jira/browse/HIVE-17882
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Harish Jaiprakash
> Fix For: 3.0.0
>
> Attachments: HIVE-17882.01.patch, HIVE-17882.2.patch
>
>
> {code}
> 0: jdbc:hive2://localhost:1> show resource plan global;
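> +----------+---------+--------------------+
> | rp_name  | status  | query_parallelism  |
> +----------+---------+--------------------+
> | global   | 1       | NULL               |
> +----------+---------+--------------------+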
> {code}
> looks like status and query_parallelism got swapped.




[jira] [Updated] (HIVE-17882) Resource plan retrieval looks incorrect

2017-11-13 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17882:
-
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Updated golden file and committed patch to master.

> Resource plan retrieval looks incorrect
> ---
>
> Key: HIVE-17882
> URL: https://issues.apache.org/jira/browse/HIVE-17882
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Harish Jaiprakash
> Fix For: 3.0.0
>
> Attachments: HIVE-17882.01.patch
>
>
> {code}
> 0: jdbc:hive2://localhost:1> show resource plan global;
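> +----------+---------+--------------------+
> | rp_name  | status  | query_parallelism  |
> +----------+---------+--------------------+
> | global   | 1       | NULL               |
> +----------+---------+--------------------+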
> {code}
> looks like status and query_parallelism got swapped.




[jira] [Commented] (HIVE-18002) add group support for pool mappings

2017-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250705#comment-16250705
 ] 

Hive QA commented on HIVE-18002:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12897416/HIVE-18002.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 11384 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dbtxnmgr_showlocks] 
(batchId=77)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[ppd_union_view]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=156)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=102)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc]
 (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=111)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=206)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7796/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7796/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7796/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12897416 - PreCommit-HIVE-Build

> add group support for pool mappings
> ---
>
> Key: HIVE-18002
> URL: https://issues.apache.org/jira/browse/HIVE-18002
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18002.01.patch, HIVE-18002.02.patch, 
> HIVE-18002.02.patch, HIVE-18002.patch
>
>





[jira] [Updated] (HIVE-15436) Enhancing metastore APIs to retrieve only materialized views

2017-11-13 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-15436:
---
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Removed the methods and pushed to master, thanks for reviewing [~ashutoshc]!

> Enhancing metastore APIs to retrieve only materialized views
> 
>
> Key: HIVE-15436
> URL: https://issues.apache.org/jira/browse/HIVE-15436
> Project: Hive
>  Issue Type: Sub-task
>  Components: Materialized views
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 3.0.0
>
> Attachments: HIVE-15436.patch
>
>
> Enhance the metastore APIs so that, instead of returning all tables, they 
> can return only:
> - views
> - materialized views




[jira] [Updated] (HIVE-14535) add insert-only ACID tables to Hive

2017-11-13 Thread Lefty Leverenz (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-14535:
--
Labels: TODOC3.0  (was: )

> add insert-only ACID tables to Hive 
> 
>
> Key: HIVE-14535
> URL: https://issues.apache.org/jira/browse/HIVE-14535
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC3.0
>
> Design doc: 
> https://docs.google.com/document/d/1b3t1RywfyRb73-cdvkEzJUyOiekWwkMHdiQ-42zCllY
> Feel free to comment.
> Update: we ended up going with sequence number based implementation
> Update #2: this feature has been partially merged with ACID. The new table 
> type is insert_only ACID. It differs from regular ACID in that, on the one 
> hand, it only supports inserts; on the other hand, it has no restrictions on 
> file format or table type (bucketing), and far fewer restrictions on other 
> operations (export/import, list bucketing, etc.).
> Currently, some features that used to work while the feature was separate 
> are not integrated properly; integrating them is the remaining work in this 
> JIRA.




[jira] [Updated] (HIVE-17361) Support LOAD DATA for transactional tables

2017-11-13 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17361:
--
Attachment: HIVE-17361.09.patch

> Support LOAD DATA for transactional tables
> --
>
> Key: HIVE-17361
> URL: https://issues.apache.org/jira/browse/HIVE-17361
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Wei Zheng
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-17361.07.patch, HIVE-17361.08.patch, 
> HIVE-17361.09.patch, HIVE-17361.1.patch, HIVE-17361.2.patch, 
> HIVE-17361.3.patch, HIVE-17361.4.patch
>
>
> LOAD DATA has not been supported since ACID was introduced. We need to fill 
> this gap between ACID tables and regular Hive tables.
> Current Documentation is under [DML 
> Operations|https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-DMLOperations]
>  and [Loading files into 
> tables|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Loadingfilesintotables]:
> \\
> * Load Data performs very limited validations of the data, in particular it 
> uses the input file name which may not be in 0_0 which can break some 
> read logic.  (Certainly will for Acid).
> * It does not check the schema of the file.  This may be a non issue for Acid 
> which requires ORC which is self describing so Schema Evolution may handle 
> this seamlessly.  (Assuming Schema is not too different).
> * It does check that _InputFormat_S are compatible. 
> * Bucketed (and thus sorted) tables don't support Load Data (but only if 
> hive.strict.checks.bucketing=true (default)).  Will keep this restriction for 
> Acid.
> * Load Data supports OVERWRITE clause
> * What happens to file permissions/ownership: rename vs copy differences
> \\
> The implementation will follow the same idea as in HIVE-14988 and use a 
> base_N/ dir for OVERWRITE clause.
> \\
> How is minor compaction going to handle delta/base with original files?




[jira] [Updated] (HIVE-18055) cache like pattern object using map object in like function

2017-11-13 Thread wan kun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wan kun updated HIVE-18055:
---
Status: Patch Available  (was: In Progress)

> cache like pattern object using map object in like function
> ---
>
> Key: HIVE-18055
> URL: https://issues.apache.org/jira/browse/HIVE-18055
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: wan kun
>Assignee: wan kun
>Priority: Minor
> Fix For: 1.2.3
>
> Attachments: HIVE-18055-branch-1.2.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Currently, only one pattern object is cached in the like function. If the 
> like function is working on one column, the pattern object will be generated 
> continuously for the regular expression matching, which is very inefficient. 
> Should we use an LRU map to cache a batch of pattern objects?
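
One way to picture the LRU map proposed above is the following sketch. It is not the attached patch and not Hive's like-function code: the class LikePatternCache, its likeToRegex helper, and the capacity parameter are all hypothetical names chosen for illustration, and the LIKE-to-regex translation is a simplified assumption (% matches any sequence, _ matches one character, a backslash escapes the next character). The cache itself relies on java.util.LinkedHashMap in access order with removeEldestEntry overridden.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not thread-safe, so each caller would need its own instance.
final class LikePatternCache {
  private final Map<String, Pattern> cache;

  LikePatternCache(final int maxEntries) {
    // accessOrder=true keeps the least recently used entry first in iteration order.
    this.cache = new LinkedHashMap<String, Pattern>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, Pattern> eldest) {
        return size() > maxEntries; // evict once the capacity is exceeded
      }
    };
  }

  // Simplified LIKE-to-regex translation (assumption, see lead-in).
  static String likeToRegex(String like) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < like.length(); i++) {
      char c = like.charAt(i);
      if (c == '\\' && i + 1 < like.length()) {
        sb.append(Pattern.quote(String.valueOf(like.charAt(++i))));
      } else if (c == '%') {
        sb.append(".*");
      } else if (c == '_') {
        sb.append('.');
      } else {
        sb.append(Pattern.quote(String.valueOf(c)));
      }
    }
    return sb.toString();
  }

  // Returns the cached compiled pattern, compiling and storing it on a miss.
  Pattern get(String like) {
    Pattern p = cache.get(like);
    if (p == null) {
      p = Pattern.compile(likeToRegex(like), Pattern.DOTALL);
      cache.put(like, p);
    }
    return p;
  }

  boolean matches(String value, String like) {
    return get(like).matcher(value).matches();
  }

  int size() {
    return cache.size();
  }
}
```

With a capacity of N, up to N distinct patterns stay compiled; when the pattern comes from a column with many distinct values, only the least recently used pattern is recompiled instead of every one. A suitable capacity is not stated in the issue and would need measuring.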




[jira] [Updated] (HIVE-18055) cache like pattern object using map object in like function

2017-11-13 Thread wan kun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wan kun updated HIVE-18055:
---
Attachment: HIVE-18055-branch-1.2.patch

> cache like pattern object using map object in like function
> ---
>
> Key: HIVE-18055
> URL: https://issues.apache.org/jira/browse/HIVE-18055
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: wan kun
>Assignee: wan kun
>Priority: Minor
> Fix For: 1.2.3
>
> Attachments: HIVE-18055-branch-1.2.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Currently, only one pattern object is cached in the like function. If the 
> like function is working on one column, the pattern object will be generated 
> continuously for the regular expression matching, which is very inefficient. 
> Should we use an LRU map to cache a batch of pattern objects?




[jira] [Work started] (HIVE-18055) cache like pattern object using map object in like function

2017-11-13 Thread wan kun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-18055 started by wan kun.
--
> cache like pattern object using map object in like function
> ---
>
> Key: HIVE-18055
> URL: https://issues.apache.org/jira/browse/HIVE-18055
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: wan kun
>Assignee: wan kun
>Priority: Minor
> Fix For: 1.2.3
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Currently, only one pattern object is cached in the like function. If the 
> like function is working on one column, the pattern object will be generated 
> continuously for the regular expression matching, which is very inefficient. 
> Should we use an LRU map to cache a batch of pattern objects?




[jira] [Assigned] (HIVE-18055) cache like pattern object using map object in like function

2017-11-13 Thread wan kun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wan kun reassigned HIVE-18055:
--


> cache like pattern object using map object in like function
> ---
>
> Key: HIVE-18055
> URL: https://issues.apache.org/jira/browse/HIVE-18055
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: wan kun
>Assignee: wan kun
>Priority: Minor
> Fix For: 1.2.3
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Currently, only one pattern object is cached in the like function. If the 
> like function is working on one column, the pattern object will be generated 
> continuously for the regular expression matching, which is very inefficient. 
> Should we use an LRU map to cache a batch of pattern objects?




[jira] [Updated] (HIVE-17904) handle internal Tez AM restart in registry and WM

2017-11-13 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17904:

Attachment: HIVE-17904.05.patch

Rebased.

> handle internal Tez AM restart in registry and WM
> -
>
> Key: HIVE-17904
> URL: https://issues.apache.org/jira/browse/HIVE-17904
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17904.01.patch, HIVE-17904.02.patch, 
> HIVE-17904.03.patch, HIVE-17904.04.patch, HIVE-17904.05.patch, 
> HIVE-17904.patch, HIVE-17904.patch
>
>
> To be done after the plan update patch is committed. The current code 
> doesn't account very well for an internal AM restart; the registry may have 
> races, and an event needs to be added to WM when some AM resets, at least to 
> make sure we discard the update errors that pertain to the old AM. 




[jira] [Commented] (HIVE-14495) Add SHOW MATERIALIZED VIEWS statement

2017-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250641#comment-16250641
 ] 

Hive QA commented on HIVE-14495:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12897375/HIVE-14495.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 11384 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] 
(batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dbtxnmgr_showlocks] 
(batchId=77)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[ppd_union_view]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=156)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=102)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc]
 (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=111)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=206)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=223)
org.apache.hadoop.hive.ql.security.authorization.plugin.TestHiveOperationType.checkHiveOperationTypeMatch
 (batchId=286)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7795/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7795/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7795/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12897375 - PreCommit-HIVE-Build

> Add SHOW MATERIALIZED VIEWS statement
> -
>
> Key: HIVE-14495
> URL: https://issues.apache.org/jira/browse/HIVE-14495
> Project: Hive
>  Issue Type: Sub-task
>  Components: Materialized views
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-14495.patch
>
>
> In the spirit of {{SHOW TABLES}}, we should support the following statement:
> {code:sql}
> SHOW MATERIALIZED VIEWS [IN database_name] ['identifier_with_wildcards'];
> {code}
> In contrast to {{SHOW TABLES}}, this command would only list the materialized 
> views.




[jira] [Updated] (HIVE-17935) Turn on hive.optimize.sort.dynamic.partition by default

2017-11-13 Thread Andrew Sherman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Sherman updated HIVE-17935:
--
Attachment: HIVE-17935.6.patch

> Turn on hive.optimize.sort.dynamic.partition by default
> ---
>
> Key: HIVE-17935
> URL: https://issues.apache.org/jira/browse/HIVE-17935
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
> Attachments: HIVE-17935.1.patch, HIVE-17935.2.patch, 
> HIVE-17935.3.patch, HIVE-17935.4.patch, HIVE-17935.5.patch, HIVE-17935.6.patch
>
>
> The config option hive.optimize.sort.dynamic.partition is an optimization for 
> Hive’s dynamic partitioning feature. It was originally implemented in 
> [HIVE-6455|https://issues.apache.org/jira/browse/HIVE-6455]. With this 
> optimization, the dynamic partition columns and bucketing columns (in case of 
> bucketed tables) are sorted before being fed to the reducers. Since the 
> partitioning and bucketing columns are sorted, each reducer can keep only one 
> record writer open at any time thereby reducing the memory pressure on the 
> reducers. There were some early problems with this optimization and it was 
> disabled by default in HiveConf in 
> [HIVE-8151|https://issues.apache.org/jira/browse/HIVE-8151]. Since then, 
> setting hive.optimize.sort.dynamic.partition=true has been used to solve 
> problems where dynamic partitioning produces (1) too many small files on 
> HDFS, which is bad for the cluster and can increase overhead for future Hive 
> queries over those partitions, and (2) OOM issues in the map tasks because 
> they are trying to simultaneously write to 100 different files. 
> It now seems that the feature is probably mature enough that it can be 
> enabled by default.
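
The memory argument in the description above (sorted partition columns let a reducer keep a single record writer open) can be sketched as follows. This is a toy illustration, not Hive's operator code: the class DynamicPartitionWriters and both methods are hypothetical, and a "writer" is modeled as nothing more than a counter.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model (assumption): counts how many record writers a task must hold open at once.
final class DynamicPartitionWriters {

  // Unsorted input: any partition may reappear later, so a writer stays open
  // for every distinct partition key seen so far.
  static int peakWritersUnsorted(List<String> partitionKeys) {
    Set<String> open = new HashSet<>();
    for (String key : partitionKeys) {
      open.add(key);
    }
    return open.size();
  }

  // Sorted input: once the key changes, the previous partition is finished,
  // so its writer can be closed before the next one is opened.
  static int peakWritersSorted(List<String> partitionKeys) {
    List<String> sorted = new ArrayList<>(partitionKeys);
    Collections.sort(sorted);
    String current = null;
    int open = 0;
    int peak = 0;
    for (String key : sorted) {
      if (!key.equals(current)) {
        if (current != null) {
          open--; // close the writer of the finished partition
        }
        current = key;
        open++;
        peak = Math.max(peak, open);
      }
    }
    return peak;
  }
}
```

For rows spread over three partition keys, the unsorted path peaks at three open writers while the sorted path never holds more than one, at the cost of the extra sort.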




[jira] [Updated] (HIVE-18002) add group support for pool mappings

2017-11-13 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-18002:

Attachment: HIVE-18002.02.patch

Rebased, addressed some feedback

> add group support for pool mappings
> ---
>
> Key: HIVE-18002
> URL: https://issues.apache.org/jira/browse/HIVE-18002
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18002.01.patch, HIVE-18002.02.patch, 
> HIVE-18002.02.patch, HIVE-18002.patch
>
>





[jira] [Commented] (HIVE-18002) add group support for pool mappings

2017-11-13 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250624#comment-16250624
 ] 

Sergey Shelukhin commented on HIVE-18002:
-

Where is it created? Also is it the same user UGI? I need a dummy UGI with 
session user, not the current user UGI that could be "hive"

> add group support for pool mappings
> ---
>
> Key: HIVE-18002
> URL: https://issues.apache.org/jira/browse/HIVE-18002
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18002.01.patch, HIVE-18002.02.patch, 
> HIVE-18002.patch
>
>





[jira] [Updated] (HIVE-17906) use kill query mechanics to kill queries in WM

2017-11-13 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17906:

Attachment: HIVE-17906.05.patch

Rebased on top of recent master changes.

> use kill query mechanics to kill queries in WM
> --
>
> Key: HIVE-17906
> URL: https://issues.apache.org/jira/browse/HIVE-17906
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17906.01.patch, HIVE-17906.02.patch, 
> HIVE-17906.03.patch, HIVE-17906.03.patch, HIVE-17906.04.patch, 
> HIVE-17906.05.patch, HIVE-17906.patch
>
>
> Right now it just closes the session (see HIVE-17841). The sessions would 
> need to be reused after the kill, or closed after the kill if the total QP 
> has decreased



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Commented] (HIVE-18038) org.apache.hadoop.hive.ql.session.OperationLog - Review

2017-11-13 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250591#comment-16250591
 ] 

Ashutosh Chauhan commented on HIVE-18038:
-

Failed test {{TestOperationLoggingAPIWithMr}} looks related.

> org.apache.hadoop.hive.ql.session.OperationLog - Review
> ---
>
> Key: HIVE-18038
> URL: https://issues.apache.org/jira/browse/HIVE-18038
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Priority: Trivial
> Attachments: HIVE-18038.1.patch, HIVE-18038.2.patch, 
> HIVE-18038.3.patch
>
>
> Simplifications, improve readability




[jira] [Commented] (HIVE-15436) Enhancing metastore APIs to retrieve only materialized views

2017-11-13 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250586#comment-16250586
 ] 

Ashutosh Chauhan commented on HIVE-15436:
-

Please remove unneeded public method.
+1 pending that change.

> Enhancing metastore APIs to retrieve only materialized views
> 
>
> Key: HIVE-15436
> URL: https://issues.apache.org/jira/browse/HIVE-15436
> Project: Hive
>  Issue Type: Sub-task
>  Components: Materialized views
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-15436.patch
>
>
> Enhance the metastore APIs so that, instead of returning all tables, they 
> can return only:
> - views
> - materialized views




[jira] [Commented] (HIVE-18046) Metastore: default IS_REWRITE_ENABLED=false instead of NULL

2017-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250544#comment-16250544
 ] 

Hive QA commented on HIVE-18046:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12897366/HIVE-18046.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 11382 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dbtxnmgr_showlocks] 
(batchId=77)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[ppd_union_view]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=156)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=102)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc]
 (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=111)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=206)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=223)
org.apache.hive.beeline.TestSchemaTool.testMetastoreDbPropertiesAfterUpgrade 
(batchId=226)
org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade (batchId=226)
org.apache.hive.beeline.TestSchemaTool.testValidateSchemaTables (batchId=226)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7794/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7794/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7794/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12897366 - PreCommit-HIVE-Build

> Metastore: default IS_REWRITE_ENABLED=false instead of NULL
> ---
>
> Key: HIVE-18046
> URL: https://issues.apache.org/jira/browse/HIVE-18046
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views, Metastore
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
> Attachments: HIVE-18046.patch
>
>
> The materialized view impl breaks old metastore sql write access, by 
> complaining that the new table creation does not set this column up.
> {code}
>   `IS_REWRITE_ENABLED` bit(1) NOT NULL,
> {code}
> {{NOT NULL DEFAULT 0}} would allow old metastore direct sql compatibility 
> (not thrift).
> {code}
> 2017-11-09T07:11:58,331 ERROR [HiveServer2-Background-Pool: Thread-2354] 
> metastore.RetryingHMSHandler: Retrying HMSHandler after 2000 ms (attempt 1 of 
> 10) with error: javax.jdo.JDODataStoreException: Insert of object 
> "org.apache.hadoop.hive.metastore.model.MTable@249dbf1" using statement 
> "INSERT INTO `TBLS` 
> (`TBL_ID`,`CREATE_TIME`,`DB_ID`,`LAST_ACCESS_TIME`,`OWNER`,`RETENTION`,`SD_ID`,`TBL_NAME`,`TBL_TYPE`,`VIEW_EXPANDED_TEXT`,`VIEW_ORIGINAL_TEXT`)
>  VALUES (?,?,?,?,?,?,?,?,?,?,?)" failed : Field 'IS_REWRITE_ENABLED' doesn't 
> have a default value
> at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543)
> at 
> org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:720)
> at 
> org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:740)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.createTable(ObjectStore.java:1038)
> {code}
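
A minimal sketch of why {{NOT NULL DEFAULT 0}} helps, assuming SQLite (from the Python standard library) as a stand-in for the real metastore database. The table below is a hypothetical two-column reduction of TBLS, not the actual schema, and SQLite's error differs from the MySQL message in the log above. The idea is the same: an old client that never mentions the new column can only insert if the column has a default.

```python
import sqlite3


def insert_without_new_column(column_ddl: str) -> str:
    """Create a reduced TBLS-like table (hypothetical), then insert a row the
    way an old client would: without mentioning IS_REWRITE_ENABLED at all."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE TBLS (TBL_ID INTEGER PRIMARY KEY, TBL_NAME TEXT, "
            + column_ddl
            + ")"
        )
        try:
            conn.execute("INSERT INTO TBLS (TBL_ID, TBL_NAME) VALUES (1, 't1')")
        except sqlite3.IntegrityError:
            # NOT NULL without a default: the omitted column cannot be filled in.
            return "rejected"
        (value,) = conn.execute(
            "SELECT IS_REWRITE_ENABLED FROM TBLS WHERE TBL_ID = 1"
        ).fetchone()
        return f"accepted, IS_REWRITE_ENABLED={value}"
    finally:
        conn.close()


strict = insert_without_new_column("IS_REWRITE_ENABLED INTEGER NOT NULL")
lenient = insert_without_new_column("IS_REWRITE_ENABLED INTEGER NOT NULL DEFAULT 0")
print(strict)
print(lenient)
```

With the default in place, the old insert statement succeeds and the column falls back to 0 (false), which is the behavior the issue title asks for.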



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Commented] (HIVE-17942) HiveAlterHandler not using conf from HMS Handler

2017-11-13 Thread Vihang Karajgaonkar (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250536#comment-16250536
 ] 

Vihang Karajgaonkar commented on HIVE-17942:


Patch merged to master branch, but it doesn't apply cleanly on branch-2. Hi 
[~janulatha], can you please attach the branch-2 patch as well?

> HiveAlterHandler not using conf from HMS Handler
> 
>
> Key: HIVE-17942
> URL: https://issues.apache.org/jira/browse/HIVE-17942
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Janaki Lahorani
>Assignee: Janaki Lahorani
> Fix For: 3.0.0
>
> Attachments: HIVE-17942.1.patch, HIVE-17942.2.patch, 
> HIVE-17942.3.patch, HIVE-17942.4.patch, HIVE-17942.5.patch
>
>
> When HiveAlterHandler looks for conf, it is not getting the one from thread 
> local.  So, local changes are not visible.




[jira] [Commented] (HIVE-18002) add group support for pool mappings

2017-11-13 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250535#comment-16250535
 ] 

Prasanth Jayachandran commented on HIVE-18002:
--

UGI gets created during DAG creation as well. Can you reuse that in 
MappingInput as well?
nit: move private static final org.slf4j.Logger LOG to top of class.


> add group support for pool mappings
> ---
>
> Key: HIVE-18002
> URL: https://issues.apache.org/jira/browse/HIVE-18002
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18002.01.patch, HIVE-18002.02.patch, 
> HIVE-18002.patch
>
>





[jira] [Commented] (HIVE-17906) use kill query mechanics to kill queries in WM

2017-11-13 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250527#comment-16250527
 ] 

Prasanth Jayachandran commented on HIVE-17906:
--

+1 on new changes

> use kill query mechanics to kill queries in WM
> --
>
> Key: HIVE-17906
> URL: https://issues.apache.org/jira/browse/HIVE-17906
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17906.01.patch, HIVE-17906.02.patch, 
> HIVE-17906.03.patch, HIVE-17906.03.patch, HIVE-17906.04.patch, 
> HIVE-17906.patch
>
>
> Right now it just closes the session (see HIVE-17841). The sessions would 
> need to be reused after the kill, or closed after the kill if the total QP 
> has decreased




[jira] [Commented] (HIVE-18002) add group support for pool mappings

2017-11-13 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250515#comment-16250515
 ] 

Prasanth Jayachandran commented on HIVE-18002:
--

Can you please post the patch in RB?

> add group support for pool mappings
> ---
>
> Key: HIVE-18002
> URL: https://issues.apache.org/jira/browse/HIVE-18002
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18002.01.patch, HIVE-18002.02.patch, 
> HIVE-18002.patch
>
>





[jira] [Updated] (HIVE-17898) Explain plan output enhancement

2017-11-13 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17898:
---
Attachment: HIVE-17898.2.patch

> Explain plan output enhancement
> ---
>
> Key: HIVE-17898
> URL: https://issues.apache.org/jira/browse/HIVE-17898
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17898.1.patch, HIVE-17898.2.patch
>
>
> We would like to enhance the explain plan output to display additional 
> information e.g.:
> The TableScan operator should have the following additional info:
> * Actual table name (currently only alias name is displayed)
> * Database name
> * Column names being scanned




[jira] [Updated] (HIVE-17898) Explain plan output enhancement

2017-11-13 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17898:
---
Status: Open  (was: Patch Available)

> Explain plan output enhancement
> ---
>
> Key: HIVE-17898
> URL: https://issues.apache.org/jira/browse/HIVE-17898
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17898.1.patch, HIVE-17898.2.patch
>
>
> We would like to enhance the explain plan output to display additional 
> information e.g.:
> The TableScan operator should have the following additional info:
> * Actual table name (currently only alias name is displayed)
> * Database name
> * Column names being scanned




[jira] [Updated] (HIVE-18054) Make Lineage work with concurrent queries on a Session

2017-11-13 Thread Andrew Sherman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Sherman updated HIVE-18054:
--
Status: Patch Available  (was: Open)

>  Make Lineage work with concurrent queries on a Session
> ---
>
> Key: HIVE-18054
> URL: https://issues.apache.org/jira/browse/HIVE-18054
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
> Attachments: HIVE-18054.1.patch
>
>
> A Hive Session can contain multiple concurrent sql Operations.
> Lineage is currently tracked in SessionState and is cleared when a query 
> completes. This results in Lineage for other running queries being lost.
> To fix this, move LineageState from SessionState to QueryState.
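
The problem and the proposed fix can be sketched as follows. The class names LineageState, SessionState and QueryState come from the description above; their fields and methods here are hypothetical simplifications, not the real Hive classes.

```python
class LineageState:
    """Hypothetical stand-in: just a list of lineage edges."""

    def __init__(self):
        self.edges = []

    def clear(self):
        self.edges.clear()


class SessionState:
    """Current layout: one LineageState shared by every query in the session."""

    def __init__(self):
        self.lineage = LineageState()


class QueryState:
    """Proposed layout: each query owns its own LineageState."""

    def __init__(self, query_id):
        self.query_id = query_id
        self.lineage = LineageState()


def shared_lineage_demo():
    session = SessionState()
    session.lineage.edges.append(("q1", "a -> b"))
    session.lineage.edges.append(("q2", "c -> d"))
    # q1 completes and clears the session-wide lineage while q2 is still running.
    session.lineage.clear()
    return [edge for edge in session.lineage.edges if edge[0] == "q2"]


def per_query_demo():
    q1, q2 = QueryState("q1"), QueryState("q2")
    q1.lineage.edges.append("a -> b")
    q2.lineage.edges.append("c -> d")
    # q1 completes and clears only its own lineage.
    q1.lineage.clear()
    return q2.lineage.edges


print(shared_lineage_demo())
print(per_query_demo())
```

In the shared layout, the lineage recorded for q2 is gone once q1 finishes. In the per-query layout, it survives.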




[jira] [Updated] (HIVE-18054) Make Lineage work with concurrent queries on a Session

2017-11-13 Thread Andrew Sherman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Sherman updated HIVE-18054:
--
Attachment: HIVE-18054.1.patch

>  Make Lineage work with concurrent queries on a Session
> ---
>
> Key: HIVE-18054
> URL: https://issues.apache.org/jira/browse/HIVE-18054
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
> Attachments: HIVE-18054.1.patch
>
>
> A Hive Session can contain multiple concurrent sql Operations.
> Lineage is currently tracked in SessionState and is cleared when a query 
> completes. This results in Lineage for other running queries being lost.
> To fix this, move LineageState from SessionState to QueryState.




[jira] [Commented] (HIVE-17935) Turn on hive.optimize.sort.dynamic.partition by default

2017-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250477#comment-16250477
 ] 

Hive QA commented on HIVE-17935:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12897347/HIVE-17935.5.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 20 failed/errored test(s), 11380 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dbtxnmgr_showlocks] 
(batchId=77)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[acid_no_buckets]
 (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[ppd_union_view]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_acid_part_update]
 (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[update_all_partitioned]
 (batchId=160)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=111)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=206)
org.apache.hadoop.hive.ql.TestTxnCommands2.testDynamicPartitionsMerge2 
(batchId=274)
org.apache.hadoop.hive.ql.TestTxnCommands2.testMultiInsert (batchId=274)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testDynamicPartitionsMerge2
 (batchId=284)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMultiInsert
 (batchId=284)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=223)
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.schemaEvolutionAddColDynamicPartitioningUpdate
 (batchId=221)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomCreatedDynamicPartitions
 (batchId=233)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomCreatedDynamicPartitionsUnionAll
 (batchId=233)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedDynamicPartitionsMultiInsert
 (batchId=230)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedDynamicPartitionsUnionAll
 (batchId=230)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7793/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7793/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7793/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 20 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12897347 - PreCommit-HIVE-Build

> Turn on hive.optimize.sort.dynamic.partition by default
> ---
>
> Key: HIVE-17935
> URL: https://issues.apache.org/jira/browse/HIVE-17935
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
> Attachments: HIVE-17935.1.patch, HIVE-17935.2.patch, 
> HIVE-17935.3.patch, HIVE-17935.4.patch, HIVE-17935.5.patch
>
>
> The config option hive.optimize.sort.dynamic.partition is an optimization for 
> Hive’s dynamic partitioning feature. It was originally implemented in 
> [HIVE-6455|https://issues.apache.org/jira/browse/HIVE-6455]. With this 
> optimization, the dynamic partition columns and bucketing columns (in case of 
> bucketed tables) are sorted before being fed to the reducers. Since the 
> partitioning and bucketing columns are sorted, each reducer can keep only one 
> record writer open at any time thereby reducing the memory pressure on the 
> reducers. There were some early problems with this optimization and it was 
> disabled by default in HiveConf in 
> [HIVE-8151|https://issues.apache.org/jira/browse/HIVE-8151]. Since then 
> setting hive.optimize.sort.dynamic.partition=true has been used to solve 
> problems where dynamic partitioning produces (1) too many small files on 
> HDFS, which is bad for the cluster and can increase overhead for future Hive 
> queries over those partitions, and (2) OOM issues in the map tasks because they 
> are trying to simultaneously write to 100 different files. 
> It now seems that the feature is probably mature enough that it can be 
> enabled by default.
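
A rough illustration of the memory argument above, assuming a toy model in which each distinct partition key needs one open record writer. The function name and the sample keys are hypothetical, and no real Hive writer is involved. It only counts how many writers would be open at the same time.

```python
def max_open_writers(partition_keys, close_on_key_change):
    """Return the peak number of simultaneously open writers.

    With sorted input (close_on_key_change=True), a writer can be closed as
    soon as the key changes, because that partition will not appear again.
    """
    open_writers = set()
    peak = 0
    previous = None
    for key in partition_keys:
        if close_on_key_change and previous is not None and key != previous:
            open_writers.discard(previous)
        open_writers.add(key)
        peak = max(peak, len(open_writers))
        previous = key
    return peak


rows = ["2017-11-13", "2017-11-11", "2017-11-12", "2017-11-11", "2017-11-13"]
unsorted_peak = max_open_writers(rows, close_on_key_change=False)
sorted_peak = max_open_writers(sorted(rows), close_on_key_change=True)
print(unsorted_peak, sorted_peak)
```

Without sorting, every partition seen so far must keep its writer open. With sorted keys, the peak stays at one regardless of how many partitions there are.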




[jira] [Updated] (HIVE-18023) Redact the expression in lineage info

2017-11-13 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-18023:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks Yongzhi for reviewing the code.

> Redact the expression in lineage info
> -
>
> Key: HIVE-18023
> URL: https://issues.apache.org/jira/browse/HIVE-18023
> Project: Hive
>  Issue Type: Improvement
>  Components: Logging
>Affects Versions: 2.1.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Trivial
> Fix For: 3.0.0
>
> Attachments: HIVE-18023.1.patch
>
>
> The query redactor is redacting the query itself while the expression shown 
> in lineage info is not, which may still expose sensitive info. The following 
> query
> {{select customers.id, customers.name from customers where 
> customers.addresses['shipping'].zip_code ='1234-5678-1234-5678';}} will have 
> a log entry in lineage. The expression should also be redacted.
> {noformat}
> [HiveServer2-Background-Pool: Thread-43]: 
> {"version":"1.0","user":"hive","timestamp":1510179280,"duration":40747,"jobIds":["job_1510150684172_0006"],"engine":"mr","database":"default","hash":"a2b4721a0935e3770d81649d24ab1cd4","queryText":"select
>  customers.id, customers.name from customers where 
> customers.addresses['shipping'].zip_code 
> ='---'","edges":[{"sources":[2],"targets":[0],"edgeType":"PROJECTION"},{"sources":[3],"targets":[1],"edgeType":"PROJECTION"},{"sources":[],"targets":[0,1],"expression":"(addresses['shipping'].zip_code
>  = 
> '1234-5678-1234-5678')","edgeType":"PREDICATE"}],"vertices":[{"id":0,"vertexType":"COLUMN","vertexId":"customers.id"},{"id":1,"vertexType":"COLUMN","vertexId":"customers.name"},{"id":2,"vertexType":"COLUMN","vertexId":"default.customers.id"},{"id":3,"vertexType":"COLUMN","vertexId":"default.customers.name"}]}
> {noformat}
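
A minimal sketch of the requested behavior, assuming a hypothetical redactor that masks card-like digit groups. The real query redactor is pluggable and is not shown here. The sample entry is a trimmed-down version of the log line above. The point is that the same redaction has to be applied to each edge's expression, not only to queryText.

```python
import json
import re

# Hypothetical redaction rule, for illustration only.
CARD_LIKE = re.compile(r"\d{4}-\d{4}-\d{4}-\d{4}")


def redact(text):
    return CARD_LIKE.sub("XXXX-XXXX-XXXX-XXXX", text)


def redact_lineage(entry):
    """Return a copy of a lineage entry with queryText and every edge
    expression passed through the redactor."""
    redacted = dict(entry)
    redacted["queryText"] = redact(entry["queryText"])
    redacted["edges"] = [
        {**edge, "expression": redact(edge["expression"])}
        if "expression" in edge
        else dict(edge)
        for edge in entry["edges"]
    ]
    return redacted


entry = {
    "queryText": "select customers.id from customers where "
                 "customers.addresses['shipping'].zip_code ='1234-5678-1234-5678'",
    "edges": [
        {"sources": [2], "targets": [0], "edgeType": "PROJECTION"},
        {"sources": [], "targets": [0],
         "expression": "(addresses['shipping'].zip_code = '1234-5678-1234-5678')",
         "edgeType": "PREDICATE"},
    ],
}
result = redact_lineage(entry)
print(json.dumps(result))
```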




[jira] [Commented] (HIVE-17964) HoS: some spark configs doesn't require re-creating a session

2017-11-13 Thread Xuefu Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250420#comment-16250420
 ] 

Xuefu Zhang commented on HIVE-17964:


+1

> HoS: some spark configs doesn't require re-creating a session
> -
>
> Key: HIVE-17964
> URL: https://issues.apache.org/jira/browse/HIVE-17964
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Minor
> Attachments: HIVE-17964.1.patch
>
>
> I guess the {{hive.spark.}} configs were initially intended for the RSC. 
> Therefore when they're changed, we'll re-create the session for them to take 
> effect. There're some configs not related to RSC that also start with 
> {{hive.spark.}}. We'd better rename them so that we don't unnecessarily 
> re-create sessions, which is usually time consuming.
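
The prefix check described above can be sketched like this. It is an assumption about the shape of the logic, not the actual Hive code, and both config keys are made-up examples.

```python
# Prefix taken from the description; everything else is hypothetical.
RSC_PREFIX = "hive.spark."


def needs_new_session(changed_keys):
    """A session is re-created if any changed key carries the RSC prefix."""
    return any(key.startswith(RSC_PREFIX) for key in changed_keys)


# A setting unrelated to RSC that happens to share the prefix (made-up name).
before_rename = needs_new_session(["hive.spark.example.planner.setting"])
# The same setting after moving it out of the prefix (made-up name).
after_rename = needs_new_session(["hive.example.planner.setting"])
print(before_rename, after_rename)
```

Renaming the unrelated setting is enough to avoid the unnecessary session re-creation, without changing the check itself.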




[jira] [Comment Edited] (HIVE-17964) HoS: some spark configs doesn't require re-creating a session

2017-11-13 Thread Xuefu Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250420#comment-16250420
 ] 

Xuefu Zhang edited comment on HIVE-17964 at 11/13/17 10:43 PM:
---

+1 pending on tests


was (Author: xuefuz):
+1

> HoS: some spark configs doesn't require re-creating a session
> -
>
> Key: HIVE-17964
> URL: https://issues.apache.org/jira/browse/HIVE-17964
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Minor
> Attachments: HIVE-17964.1.patch
>
>
> I guess the {{hive.spark.}} configs were initially intended for the RSC. 
> Therefore when they're changed, we'll re-create the session for them to take 
> effect. There're some configs not related to RSC that also start with 
> {{hive.spark.}}. We'd better rename them so that we don't unnecessarily 
> re-create sessions, which is usually time consuming.




[jira] [Updated] (HIVE-18009) Multiple lateral view query is slow on hive on spark

2017-11-13 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-18009:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks Yongzhi and Xuefu for reviewing.

> Multiple lateral view query is slow on hive on spark
> 
>
> Key: HIVE-18009
> URL: https://issues.apache.org/jira/browse/HIVE-18009
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Fix For: 3.0.0
>
> Attachments: HIVE-18009.1.patch, HIVE-18009.2.patch, 
> HIVE-18009.3.patch
>
>
> When running a query with multiple lateral views, HoS is busy with 
> compilation. GenSparkUtils has an inefficient implementation of 
> getChildOperator when there is a diamond hierarchy in the operator tree (lateral 
> view in this case), since a node may be visited multiple times.
> {noformat}
> at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:442)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
>
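
To illustrate why a diamond hierarchy makes a plain recursive walk expensive, here is a small sketch that counts node visits in a chain of diamonds, with and without remembering visited nodes. It is not the code from the patch. The graph builder and function names are hypothetical, and the visited-set approach is just one common way to avoid the repeated work.

```python
def build_diamond_chain(levels):
    """Build a child map where each level is a diamond:
    n_i -> (l_i, r_i) -> n_{i+1}."""
    children = {}
    for i in range(levels):
        children[f"n{i}"] = [f"l{i}", f"r{i}"]
        children[f"l{i}"] = [f"n{i + 1}"]
        children[f"r{i}"] = [f"n{i + 1}"]
    children[f"n{levels}"] = []
    return children


def count_visits_naive(children, node):
    """Plain recursion: a shared descendant is walked once per path to it."""
    return 1 + sum(count_visits_naive(children, c) for c in children[node])


def count_visits_memoized(children, node, seen=None):
    """Recursion with a visited set: each node is walked at most once."""
    seen = set() if seen is None else seen
    if node in seen:
        return 0
    seen.add(node)
    return 1 + sum(count_visits_memoized(children, c, seen) for c in children[node])


graph = build_diamond_chain(10)
naive = count_visits_naive(graph, "n0")
memoized = count_visits_memoized(graph, "n0")
print(naive, memoized)
```

Each extra diamond roughly doubles the work of the naive walk, while the memoized walk grows linearly with the number of nodes.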

[jira] [Commented] (HIVE-16855) org.apache.hadoop.hive.ql.exec.mr.HashTableLoader Improvements

2017-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250349#comment-16250349
 ] 

Hive QA commented on HIVE-16855:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12897352/HIVE-16855.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 11361 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[avro_partitioned_native] 
(batchId=6)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dbtxnmgr_showlocks] 
(batchId=77)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=156)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver
 (batchId=176)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=102)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc]
 (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=111)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=206)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testApplyPlanQpChanges 
(batchId=281)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7792/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7792/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7792/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12897352 - PreCommit-HIVE-Build

> org.apache.hadoop.hive.ql.exec.mr.HashTableLoader Improvements
> --
>
> Key: HIVE-16855
> URL: https://issues.apache.org/jira/browse/HIVE-16855
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.1.1, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16855.1.patch, HIVE-16855.2.patch
>
>
> # Improve (Simplify) Logging
> # Remove custom buffer size for {{BufferedInputStream}} and instead rely on 
> JVM default which is often larger these days (8192)
> # Simplify looping logic




[jira] [Updated] (HIVE-17417) LazySimple Timestamp is very expensive

2017-11-13 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17417:
-
Fix Version/s: 2.4.0

> LazySimple Timestamp is very expensive
> --
>
> Key: HIVE-17417
> URL: https://issues.apache.org/jira/browse/HIVE-17417
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE-17417.1.patch, HIVE-17417.2.patch, 
> HIVE-17417.3.patch, HIVE-17417.4.patch, HIVE-17417.5.patch, 
> HIVE-17417.6.patch, date-serialize.png, timestamp-serialize.png, 
> ts-jmh-perf.png
>
>
> In a specific case where a schema contains array with timestamp and 
> date fields (array size >1), any access to this column is very 
> expensive in terms of CPU, as most of the time is spent in serialization of 
> timestamp and date. Refer to the attached profiles: >70% of the time is spent in 
> serialization + tostring conversions. 




[jira] [Updated] (HIVE-17809) Implement per pool trigger validation and move sessions across pools

2017-11-13 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17809:
-
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Test failures are unrelated to this patch. Committed to master. Thanks for the 
reviews!

> Implement per pool trigger validation and move sessions across pools
> 
>
> Key: HIVE-17809
> URL: https://issues.apache.org/jira/browse/HIVE-17809
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 3.0.0
>
> Attachments: HIVE-17809.1.patch, HIVE-17809.2.patch, 
> HIVE-17809.3.patch, HIVE-17809.4.patch, HIVE-17809.5.patch, HIVE-17809.6.patch
>
>
> HIVE-17508 trigger validation is applied for all pools at once. This is a 
> follow-up to implement trigger validation at the per-pool level. 
> This should also implement resolution for multiple applicable actions, as per 
> the RB discussion.




[jira] [Commented] (HIVE-17809) Implement per pool trigger validation and move sessions across pools

2017-11-13 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250295#comment-16250295
 ] 

Sergey Shelukhin commented on HIVE-17809:
-

+1

> Implement per pool trigger validation and move sessions across pools
> 
>
> Key: HIVE-17809
> URL: https://issues.apache.org/jira/browse/HIVE-17809
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17809.1.patch, HIVE-17809.2.patch, 
> HIVE-17809.3.patch, HIVE-17809.4.patch, HIVE-17809.5.patch, HIVE-17809.6.patch
>
>
> HIVE-17508 trigger validation is applied for all pools at once. This is a 
> follow-up to implement trigger validation at the per-pool level. 
> This should also implement resolution for multiple applicable actions, as per 
> the RB discussion.




[jira] [Assigned] (HIVE-18054) Make Lineage work with concurrent queries on a Session

2017-11-13 Thread Andrew Sherman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Sherman reassigned HIVE-18054:
-


>  Make Lineage work with concurrent queries on a Session
> ---
>
> Key: HIVE-18054
> URL: https://issues.apache.org/jira/browse/HIVE-18054
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
>
> A Hive Session can contain multiple concurrent sql Operations.
> Lineage is currently tracked in SessionState and is cleared when a query 
> completes. This results in Lineage for other running queries being lost.
> To fix this, move LineageState from SessionState to QueryState.




[jira] [Updated] (HIVE-16406) Remove unwanted interning when creating PartitionDesc

2017-11-13 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-16406:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Rajesh!

> Remove unwanted interning when creating PartitionDesc
> -
>
> Key: HIVE-16406
> URL: https://issues.apache.org/jira/browse/HIVE-16406
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-16406.1.patch, HIVE-16406.2.patch, 
> HIVE-16406.3.patch, HIVE-16406.profiler.png
>
>
> {{PartitionDesc::getTableDesc}} interns all table description properties by 
> default. But the table description properties are already interned and need 
> not be interned again. 




[jira] [Commented] (HIVE-17714) move custom SerDe schema considerations into metastore from QL

2017-11-13 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250271#comment-16250271
 ] 

Sergey Shelukhin commented on HIVE-17714:
-

But the point is that the things are not in sync because they can be changed 
without Hive being aware. The only calls that will be slowed down will be 
getTable/etc., that require schema, and only for tables using custom serdes. 
Esp. if the serde classes themselves are cached, the slowdown would be trivial. 

> move custom SerDe schema considerations into metastore from QL
> --
>
> Key: HIVE-17714
> URL: https://issues.apache.org/jira/browse/HIVE-17714
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Alan Gates
>
> Columns in metastore for tables that use external schema don't have the type 
> information (since HIVE-11985) and may be entirely inconsistent (since 
> forever, due to issues like HIVE-17713; or for SerDes that allow a URL for 
> the schema, due to a change in the underlying file).
> Currently, if you trace the usage of ConfVars.SERDESUSINGMETASTOREFORSCHEMA, 
> and to MetaStoreUtils.getFieldsFromDeserializer, you'd see that the code in 
> QL handles this in Hive. So, for the most part metastore just returns 
> whatever is stored for columns in the database.
> One exception appears to be get_fields_with_environment_context, which is 
> interesting... so getTable will return incorrect columns (potentially), but 
> get_fields/get_schema will return correct ones from SerDe as far as I can 
> tell.
> As part of separating the metastore, we should make sure all the APIs return 
> the correct schema for the columns; it's not a good idea to have everyone 
> reimplement getFieldsFromDeserializer.
> Note: this should also remove a flag introduced in HIVE-17731




[jira] [Commented] (HIVE-17714) move custom SerDe schema considerations into metastore from QL

2017-11-13 Thread Alan Gates (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250273#comment-16250273
 ] 

Alan Gates commented on HIVE-17714:
---

bq. I will try bringing in TypeInfo and ObjectInspector too. What are the 
specific advantages of doing that? 
I think you'll be forced to by the interdependencies of the interfaces.  If you 
are not, then fine, we don't have to move them.

bq. Also, I didn't quite understand what you meant by "avoids the need for ORC 
and any other storage format to pick it up". Can you please elaborate?
ORC today depends on the storage-api.  It works hard to keep down the number of 
its dependencies in order to minimize its jar size.  So I suspect you'll get 
pushback from the ORC community on adding Serializer et al to the storage-api.  
By making serde interfaces a separate module in storage-api we can address this 
concern from ORC.

bq. This assumes that SerDe implementations do not bring along other 
dependencies like hive-common etc. I am not sure yet but I think it is very 
likely that these SerDes will have more dependencies, so it may not be just 
adding hive-serde.jar to the standalone-metastore classpath. I already see 
hive-serde depends on hive-common, hive-service-rpc and hive-shims so not sure 
if we will be able to create a standalone serde jar for metastore.
Fair point, though even if we could get them to only pull in common, shims, and 
serdes that would be a big improvement over needing the exec jar.



> move custom SerDe schema considerations into metastore from QL
> --
>
> Key: HIVE-17714
> URL: https://issues.apache.org/jira/browse/HIVE-17714
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Alan Gates
>
> Columns in metastore for tables that use external schema don't have the type 
> information (since HIVE-11985) and may be entirely inconsistent (since 
> forever, due to issues like HIVE-17713; or for SerDes that allow a URL for 
> the schema, due to a change in the underlying file).
> Currently, if you trace the usage of ConfVars.SERDESUSINGMETASTOREFORSCHEMA, 
> and to MetaStoreUtils.getFieldsFromDeserializer, you'd see that the code in 
> QL handles this in Hive. So, for the most part metastore just returns 
> whatever is stored for columns in the database.
> One exception appears to be get_fields_with_environment_context, which is 
> interesting... so getTable will return incorrect columns (potentially), but 
> get_fields/get_schema will return correct ones from SerDe as far as I can 
> tell.
> As part of separating the metastore, we should make sure all the APIs return 
> the correct schema for the columns; it's not a good idea to have everyone 
> reimplement getFieldsFromDeserializer.
> Note: this should also remove a flag introduced in HIVE-17731




[jira] [Commented] (HIVE-17714) move custom SerDe schema considerations into metastore from QL

2017-11-13 Thread Alan Gates (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250261#comment-16250261
 ] 

Alan Gates commented on HIVE-17714:
---

bq. We can remove the logic that avoids storing schema in metastore entirely, 
and always store the schema, like before.
No, -1.  For the reasons I gave above.  I'm fine with working on ways at write 
and alter time to make sure things are in sync.  I am not ok with complicating 
the read path.

> move custom SerDe schema considerations into metastore from QL
> --
>
> Key: HIVE-17714
> URL: https://issues.apache.org/jira/browse/HIVE-17714
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Alan Gates
>
> Columns in metastore for tables that use external schema don't have the type 
> information (since HIVE-11985) and may be entirely inconsistent (since 
> forever, due to issues like HIVE-17713; or for SerDes that allow a URL for 
> the schema, due to a change in the underlying file).
> Currently, if you trace the usage of ConfVars.SERDESUSINGMETASTOREFORSCHEMA, 
> and to MetaStoreUtils.getFieldsFromDeserializer, you'd see that the code in 
> QL handles this in Hive. So, for the most part metastore just returns 
> whatever is stored for columns in the database.
> One exception appears to be get_fields_with_environment_context, which is 
> interesting... so getTable will return incorrect columns (potentially), but 
> get_fields/get_schema will return correct ones from SerDe as far as I can 
> tell.
> As part of separating the metastore, we should make sure all the APIs return 
> the correct schema for the columns; it's not a good idea to have everyone 
> reimplement getFieldsFromDeserializer.
> Note: this should also remove a flag introduced in HIVE-17731




[jira] [Commented] (HIVE-15436) Enhancing metastore APIs to retrieve only materialized views

2017-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250241#comment-16250241
 ] 

Hive QA commented on HIVE-15436:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12897357/HIVE-15436.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 11374 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] 
(batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dbtxnmgr_showlocks] 
(batchId=77)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=156)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=102)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc]
 (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=111)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=206)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7791/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7791/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7791/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12897357 - PreCommit-HIVE-Build

> Enhancing metastore APIs to retrieve only materialized views
> 
>
> Key: HIVE-15436
> URL: https://issues.apache.org/jira/browse/HIVE-15436
> Project: Hive
>  Issue Type: Sub-task
>  Components: Materialized views
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-15436.patch
>
>
> Enhancing metastore APIs such that, instead of returning all tables, it can 
> return only:
> - views
> - materialized views




[jira] [Updated] (HIVE-17904) handle internal Tez AM restart in registry and WM

2017-11-13 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17904:

Attachment: HIVE-17904.04.patch

Fixing a simple NPE...

> handle internal Tez AM restart in registry and WM
> -
>
> Key: HIVE-17904
> URL: https://issues.apache.org/jira/browse/HIVE-17904
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17904.01.patch, HIVE-17904.02.patch, 
> HIVE-17904.03.patch, HIVE-17904.04.patch, HIVE-17904.patch, HIVE-17904.patch
>
>
> This should be done after the plan update patch is committed. The current 
> code doesn't account very well for an internal AM restart; the registry may 
> have races, and an event needs to be added to WM when an AM resets, at least 
> to make sure we discard the update errors that pertain to the old AM. 




[jira] [Commented] (HIVE-18002) add group support for pool mappings

2017-11-13 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250197#comment-16250197
 ] 

Sergey Shelukhin commented on HIVE-18002:
-

Fixed the NPE... [~prasanth_j] can you take a look?

> add group support for pool mappings
> ---
>
> Key: HIVE-18002
> URL: https://issues.apache.org/jira/browse/HIVE-18002
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18002.01.patch, HIVE-18002.02.patch, 
> HIVE-18002.patch
>
>





[jira] [Updated] (HIVE-18002) add group support for pool mappings

2017-11-13 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-18002:

Attachment: HIVE-18002.02.patch

> add group support for pool mappings
> ---
>
> Key: HIVE-18002
> URL: https://issues.apache.org/jira/browse/HIVE-18002
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18002.01.patch, HIVE-18002.02.patch, 
> HIVE-18002.patch
>
>





[jira] [Commented] (HIVE-18050) LlapServiceDriver shoud split HIVE_AUX_JARS_PATH by ':' instead of ','

2017-11-13 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250165#comment-16250165
 ] 

Sergey Shelukhin commented on HIVE-18050:
-

This method is called for multiple sources; the other one is a config value 
which I think is still comma-separated... perhaps the character should be an 
argument to the call.
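
A minimal sketch of the suggestion above, assuming the separator is simply passed in by each caller. The class name, the three-argument signature, and the sample paths are hypothetical and are not taken from the LlapServiceDriver source or from any attached patch.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Pattern;

public class AuxJarSplitSketch {

    /**
     * Adds each non-empty entry of auxJars to auxJarSet, splitting on the
     * separator chosen by the caller instead of a hard-coded ",".
     * Hypothetical signature, for illustration only.
     */
    static void addAuxJarsToSet(Set<String> auxJarSet, String auxJars, String separator) {
        if (auxJars == null || auxJars.isEmpty()) {
            return;
        }
        // String.split takes a regex, so quote the separator to keep it literal.
        for (String jarPath : auxJars.split(Pattern.quote(separator))) {
            if (!jarPath.isEmpty()) {
                auxJarSet.add(jarPath);
            }
        }
    }

    public static void main(String[] args) {
        Set<String> jars = new LinkedHashSet<>();
        // Value as left in HIVE_AUX_JARS_PATH by the hive script (':' separated).
        addAuxJarsToSet(jars, "/opt/a.jar:/opt/b.jar", ":");
        // Value as it might appear in a comma-separated config setting.
        addAuxJarsToSet(jars, "/opt/c.jar,,/opt/d.jar", ",");
        System.out.println(jars);
    }
}
```

With this shape, the call that reads the environment variable would pass ":" and the call that reads the config value would keep passing ",", so neither source has to change its format.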

> LlapServiceDriver shoud split HIVE_AUX_JARS_PATH by ':' instead of ','
> --
>
> Key: HIVE-18050
> URL: https://issues.apache.org/jira/browse/HIVE-18050
> Project: Hive
>  Issue Type: Bug
>  Components: CLI, Clients
>Affects Versions: 2.3.0
>Reporter: Aegeaner
>Assignee: Aegeaner
>  Labels: pull-request-available
>
> LlapServiceDriver should split HIVE_AUX_JARS_PATH by ':' instead of ',', 
> since in the hive script the commas in the environment variable have already been replaced:
> {code:java}
> elif [ "${HIVE_AUX_JARS_PATH}" != "" ]; then 
>   HIVE_AUX_JARS_PATH=`echo $HIVE_AUX_JARS_PATH | sed 's/,/:/g'`
>   if $cygwin; then
>     HIVE_AUX_JARS_PATH=`cygpath -p -w "$HIVE_AUX_JARS_PATH"`
>     HIVE_AUX_JARS_PATH=`echo $HIVE_AUX_JARS_PATH | sed 's/;/,/g'`
>   fi
>   AUX_CLASSPATH=${AUX_CLASSPATH}:${HIVE_AUX_JARS_PATH}
>   AUX_PARAM="file://$(echo ${HIVE_AUX_JARS_PATH} | sed 's/:/,file:\/\//g')"
> fi
> {code}
> But in the LLAP Service Driver, it's processed as:
> {code:java}
> private void addAuxJarsToSet(HashSet<String> auxJarSet, String auxJars) {
>   if (auxJars != null && !auxJars.isEmpty()) {
>     // TODO: transitive dependencies warning?
>     String[] jarPaths = auxJars.split(",");
>     for (String jarPath : jarPaths) {
>       if (!jarPath.isEmpty()) {
>         auxJarSet.add(jarPath);
>       }
>     }
>   }
> }
> {code}




[jira] [Commented] (HIVE-17856) MM tables - IOW is not ACID compliant

2017-11-13 Thread Steve Yeom (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250160#comment-16250160
 ] 

Steve Yeom commented on HIVE-17856:
---

Hey, Sergey. Great! Thanks for the info! 
Steve. 

> MM tables - IOW is not ACID compliant
> -
>
> Key: HIVE-17856
> URL: https://issues.apache.org/jira/browse/HIVE-17856
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Steve Yeom
>  Labels: mm-gap-1
> Attachments: HIVE-17856.1.patch, HIVE-17856.2.patch, 
> HIVE-17856.3.patch, HIVE-17856.4.patch, HIVE-17856.5.patch, 
> HIVE-17856.6.patch, HIVE-17856.7.patch, HIVE-17856.8.patch
>
>
> The following tests were removed from mm_all during "integration"... I should 
> have never allowed such a manner of integration.
> MM logic should have been kept intact until ACID logic could catch up. Alas, 
> here we are.
> {noformat}
> drop table iow0_mm;
> create table iow0_mm(key int) tblproperties("transactional"="true", 
> "transactional_properties"="insert_only");
> insert overwrite table iow0_mm select key from intermediate;
> insert into table iow0_mm select key + 1 from intermediate;
> select * from iow0_mm order by key;
> insert overwrite table iow0_mm select key + 2 from intermediate;
> select * from iow0_mm order by key;
> drop table iow0_mm;
> drop table iow1_mm; 
> create table iow1_mm(key int) partitioned by (key2 int)  
> tblproperties("transactional"="true", 
> "transactional_properties"="insert_only");
> insert overwrite table iow1_mm partition (key2)
> select key as k1, key from intermediate union all select key as k1, key from 
> intermediate;
> insert into table iow1_mm partition (key2)
> select key + 1 as k1, key from intermediate union all select key as k1, key 
> from intermediate;
> select * from iow1_mm order by key, key2;
> insert overwrite table iow1_mm partition (key2)
> select key + 3 as k1, key from intermediate union all select key + 4 as k1, 
> key from intermediate;
> select * from iow1_mm order by key, key2;
> insert overwrite table iow1_mm partition (key2)
> select key + 3 as k1, key + 3 from intermediate union all select key + 2 as 
> k1, key + 2 from intermediate;
> select * from iow1_mm order by key, key2;
> drop table iow1_mm;
> {noformat}
> {noformat}
> drop table simple_mm;
> create table simple_mm(key int) stored as orc tblproperties 
> ("transactional"="true", "transactional_properties"="insert_only");
> insert into table simple_mm select key from intermediate;
> -insert overwrite table simple_mm select key from intermediate;
> {noformat}




[jira] [Commented] (HIVE-17856) MM tables - IOW is not ACID compliant

2017-11-13 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250157#comment-16250157
 ] 

Sergey Shelukhin commented on HIVE-17856:
-

[~steveyeom2017] setting the Utilities.FILE_OP_LOGGER level to trace and running 
with just the problematic statement makes testing somewhat better for q files; 
it allows one to see what the MM code is actually doing.

> MM tables - IOW is not ACID compliant
> -
>
> Key: HIVE-17856
> URL: https://issues.apache.org/jira/browse/HIVE-17856
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Steve Yeom
>  Labels: mm-gap-1
> Attachments: HIVE-17856.1.patch, HIVE-17856.2.patch, 
> HIVE-17856.3.patch, HIVE-17856.4.patch, HIVE-17856.5.patch, 
> HIVE-17856.6.patch, HIVE-17856.7.patch, HIVE-17856.8.patch
>
>
> The following tests were removed from mm_all during "integration"... I should 
> have never allowed such a manner of integration.
> MM logic should have been kept intact until ACID logic could catch up. Alas, 
> here we are.
> {noformat}
> drop table iow0_mm;
> create table iow0_mm(key int) tblproperties("transactional"="true", 
> "transactional_properties"="insert_only");
> insert overwrite table iow0_mm select key from intermediate;
> insert into table iow0_mm select key + 1 from intermediate;
> select * from iow0_mm order by key;
> insert overwrite table iow0_mm select key + 2 from intermediate;
> select * from iow0_mm order by key;
> drop table iow0_mm;
> drop table iow1_mm; 
> create table iow1_mm(key int) partitioned by (key2 int)  
> tblproperties("transactional"="true", 
> "transactional_properties"="insert_only");
> insert overwrite table iow1_mm partition (key2)
> select key as k1, key from intermediate union all select key as k1, key from 
> intermediate;
> insert into table iow1_mm partition (key2)
> select key + 1 as k1, key from intermediate union all select key as k1, key 
> from intermediate;
> select * from iow1_mm order by key, key2;
> insert overwrite table iow1_mm partition (key2)
> select key + 3 as k1, key from intermediate union all select key + 4 as k1, 
> key from intermediate;
> select * from iow1_mm order by key, key2;
> insert overwrite table iow1_mm partition (key2)
> select key + 3 as k1, key + 3 from intermediate union all select key + 2 as 
> k1, key + 2 from intermediate;
> select * from iow1_mm order by key, key2;
> drop table iow1_mm;
> {noformat}
> {noformat}
> drop table simple_mm;
> create table simple_mm(key int) stored as orc tblproperties 
> ("transactional"="true", "transactional_properties"="insert_only");
> insert into table simple_mm select key from intermediate;
> -insert overwrite table simple_mm select key from intermediate;
> {noformat}




[jira] [Comment Edited] (HIVE-17714) move custom SerDe schema considerations into metastore from QL

2017-11-13 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250153#comment-16250153
 ] 

Sergey Shelukhin edited comment on HIVE-17714 at 11/13/17 8:10 PM:
---

Hmm... I was writing the below, when I realized something we might be missing. 
So if this is resolved, the below applies, otherwise none of the above or below 
suggestions work as far as I can tell.
In order to store the derived schema in metastore, wouldn't we need the serde 
jar to be present in the first place? To ask it for the schema. Otherwise if we 
allow users to specify both columns and external schema, we are outsourcing 
even the initial correctness, which seems wrong.
I think it's reasonable to expect that if a SerDe is used, it should be 
available to the user (and metastore). I don't think having extra jars is a 
problem... the user will anyway have to have all the jars to actually query the 
table with the SerDe, right?

 The below (without jars).

My main concern is about ensuring that the schema stored in metastore is synced 
with the actual schema by the serde. These can get out of sync from both sides; 
Hive columns can be added and altered despite the serde being present that is 
responsible for the schema (I filed a jira somewhere to block the modification 
like this) - these modifications will be visible to the users (because of the 
metastore APIs); for most serde-s however they won't reflect on the schema that 
Hive will actually use, so that is confusing.
Some serdes also support schema in external files that we have no control over, 
and other such mechanisms could exist.
Verifying schema at use time solves the problem for Hive, however not for other 
users of the metastore, which is kind of the point - Hive already ignores 
metastore columns for these tables, going instead to the SerDe, so the mismatch 
is not a problem for it. 
And adding such checks in metastore would mean needing access to jars, at which 
point we might as well return the correct schema.
How about this... 
1) We can remove the logic that avoids storing schema in metastore entirely, 
and always store the schema, like before.
2) Metastore will try to get SerDe class on reads, and if available, will 
return the schema from SerDe, or do a check as suggested above.
3) We could add a compat flag (like the one added for MM tables that fails 
getTable/etc calls for them unless the client explicitly claims to support MM 
tables, or disables compat checks)  that will break everyone trying to access 
such tables when the jars are absent (so the client is required to be aware of 
the potential discrepancy) unless they set a config flag to disable checks (so 
they know they might hit some rare issues), or actually implement the 
equivalent of get-from-deserializer.
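
To make the three-step proposal above concrete, here is a rough sketch of the read-path decision it describes. Everything in it is hypothetical: columns are modelled as plain strings, the SerDe lookup is modelled as an Optional that is empty when the SerDe class cannot be loaded, and a single boolean stands in for the compat flag. None of these names come from the metastore code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class SchemaReadPathSketch {

    /**
     * Picks the columns a metastore read call would return under the proposal.
     *
     * @param storedColumns       columns always stored in the metastore (step 1)
     * @param serdeColumns        schema reported by the SerDe, empty if its class is unavailable (step 2)
     * @param compatChecksEnabled whether to fail instead of returning possibly stale columns (step 3)
     */
    static List<String> resolveColumns(List<String> storedColumns,
                                       Optional<List<String>> serdeColumns,
                                       boolean compatChecksEnabled) {
        if (serdeColumns.isPresent()) {
            // SerDe class was found: its schema is treated as authoritative.
            return serdeColumns.get();
        }
        if (compatChecksEnabled) {
            // Jars are absent and the client has not opted out of the checks.
            throw new IllegalStateException(
                "SerDe class unavailable; stored columns may not match the actual schema");
        }
        // Client disabled the checks and accepts a possible discrepancy.
        return storedColumns;
    }

    public static void main(String[] args) {
        List<String> stored = Arrays.asList("key int");
        List<String> fromSerde = Arrays.asList("key int", "value string");
        System.out.println(resolveColumns(stored, Optional.of(fromSerde), true));
        System.out.println(resolveColumns(stored, Optional.empty(), false));
    }
}
```

The sketch only shows the branching; how the metastore would actually load the SerDe class, and whether the read path should do this at all, is the open question in this thread.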




[jira] [Commented] (HIVE-17714) move custom SerDe schema considerations into metastore from QL

2017-11-13 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250153#comment-16250153
 ] 

Sergey Shelukhin commented on HIVE-17714:
-

Hmm... I was writing the below, when I realized something we might be missing. 
So if this is resolved, the below applies, otherwise none of the above or below 
suggestions work as far as I can tell.
In order to store the derived schema in metastore, wouldn't we need the serde 
jar to be present in the first place? To ask it for the schema. Otherwise if we 
allow users to specify both columns and external schema, we are outsourcing 
even the initial correctness, which seems wrong.
I think it's reasonable to expect that if a SerDe is used, it should be 
available to the user (and metastore). I don't think having extra jars is a 
problem... the user will anyway have to have all the jars to actually query the 
table with the SerDe, right?

 The below (without jars).

My main concern is about ensuring that the schema stored in metastore is synced 
with the actual schema by the serde. These can get out of sync from both sides; 
Hive columns can be added and altered despite the serde being present that is 
responsible for the schema (I filed a jira somewhere to block the modification 
like this) - these modifications will be visible to the users (because of the 
metastore APIs); for most serde-s however they won't reflect on the schema that 
Hive will actually use, so that is confusing.
Some serdes also support schema in external files that we have no control over, 
and other such mechanisms could exist.
Verifying schema at use time solves the problem for Hive, however not for other 
users of the metastore, which is kind of the point - Hive already ignores 
metastore columns for these tables, going instead to the SerDe, so the mismatch 
is not a problem for it. 
And adding such checks in metastore would mean needing access to jars, at which 
point we might as well return the correct schema.
How about this... 
1) We can remove the logic that avoids storing schema in metastore entirely, 
and always store the schema, like before.
2) Metastore will try to get SerDe class on reads, and if available, will 
return the schema from SerDe, or do a compat check as suggested above.
3) We could add a compat flag (like the one added for MM tables that fails 
getTable/etc calls for them unless the client explicitly claims to support MM 
tables, or disables compat checks)  that will break everyone trying to access 
such tables when the jars are absent (so the client is required to be aware of 
the potential discrepancy) unless they set a config flag to disable checks (so 
they know they might hit some rare issues), or actually implement the 
equivalent of get-from-deserializer.


> move custom SerDe schema considerations into metastore from QL
> --
>
> Key: HIVE-17714
> URL: https://issues.apache.org/jira/browse/HIVE-17714
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Alan Gates
>
> Columns in metastore for tables that use external schema don't have the type 
> information (since HIVE-11985) and may be entirely inconsistent (since 
> forever, due to issues like HIVE-17713; or for SerDes that allow a URL for 
> the schema, due to a change in the underlying file).
> Currently, if you trace the usage of ConfVars.SERDESUSINGMETASTOREFORSCHEMA, 
> and to MetaStoreUtils.getFieldsFromDeserializer, you'd see that the code in 
> QL handles this in Hive. So, for the most part metastore just returns 
> whatever is stored for columns in the database.
> One exception appears to be get_fields_with_environment_context, which is 
> interesting... so getTable will return incorrect columns (potentially), but 
> get_fields/get_schema will return correct ones from SerDe as far as I can 
> tell.
> As part of separating the metastore, we should make sure all the APIs return 
> the correct schema for the columns; it's not a good idea to have everyone 
> reimplement getFieldsFromDeserializer.
> Note: this should also remove a flag introduced in HIVE-17731




[jira] [Commented] (HIVE-17714) move custom SerDe schema considerations into metastore from QL

2017-11-13 Thread Vihang Karajgaonkar (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250147#comment-16250147
 ] 

Vihang Karajgaonkar commented on HIVE-17714:


Thanks [~alangates] for the response. I have some questions regarding your 
suggestions:

bq. [and I suspect TypeInfo and ObjectInspector will have to come too] to a new 
module in storage-api. This avoids the need for ORC and any other storage 
format to pick it up. 
I will try bringing in TypeInfo and ObjectInspector too. What are the specific 
advantages of doing that? Also, I didn't quite understand what you meant by 
"avoids the need for ORC and any other storage format to pick it up". Can you 
please elaborate?

bq. This will result in a single module that the metastore (and anyone else who 
wants to use Hive serdes) can use without having to pick up all of Hive.
This assumes that SerDe implementations do not bring along other dependencies 
like hive-common etc. I am not sure yet but I think it is very likely that 
these SerDes will have more dependencies, so it may not be just adding 
hive-serde.jar to the standalone-metastore classpath. I already see hive-serde 
depends on hive-common, hive-service-rpc and hive-shims so not sure if we will 
be able to create a standalone serde jar for metastore.

> move custom SerDe schema considerations into metastore from QL
> --
>
> Key: HIVE-17714
> URL: https://issues.apache.org/jira/browse/HIVE-17714
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Alan Gates
>
> Columns in metastore for tables that use external schema don't have the type 
> information (since HIVE-11985) and may be entirely inconsistent (since 
> forever, due to issues like HIVE-17713; or for SerDes that allow a URL for 
> the schema, due to a change in the underlying file).
> Currently, if you trace the usage of ConfVars.SERDESUSINGMETASTOREFORSCHEMA, 
> and to MetaStoreUtils.getFieldsFromDeserializer, you'd see that the code in 
> QL handles this in Hive. So, for the most part metastore just returns 
> whatever is stored for columns in the database.
> One exception appears to be get_fields_with_environment_context, which is 
> interesting... so getTable will return incorrect columns (potentially), but 
> get_fields/get_schema will return correct ones from SerDe as far as I can 
> tell.
> As part of separating the metastore, we should make sure all the APIs return 
> the correct schema for the columns; it's not a good idea to have everyone 
> reimplement getFieldsFromDeserializer.
> Note: this should also remove a flag introduced in HIVE-17731




[jira] [Commented] (HIVE-18038) org.apache.hadoop.hive.ql.session.OperationLog - Review

2017-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250145#comment-16250145
 ] 

Hive QA commented on HIVE-18038:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12897351/HIVE-18038.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 11374 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dbtxnmgr_showlocks] 
(batchId=77)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=156)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc]
 (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=111)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=206)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=223)
org.apache.hive.service.cli.operation.TestOperationLoggingAPIWithMr.testFetchResultsOfLogWithOrientation
 (batchId=227)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7790/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7790/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7790/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12897351 - PreCommit-HIVE-Build

> org.apache.hadoop.hive.ql.session.OperationLog - Review
> ---
>
> Key: HIVE-18038
> URL: https://issues.apache.org/jira/browse/HIVE-18038
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Priority: Trivial
> Attachments: HIVE-18038.1.patch, HIVE-18038.2.patch, 
> HIVE-18038.3.patch
>
>
> Simplifications, improve readability




[jira] [Assigned] (HIVE-15018) ALTER rewriting flag in materialized view

2017-11-13 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned HIVE-15018:
--

Assignee: Jesus Camacho Rodriguez

> ALTER rewriting flag in materialized view 
> --
>
> Key: HIVE-15018
> URL: https://issues.apache.org/jira/browse/HIVE-15018
> Project: Hive
>  Issue Type: Sub-task
>  Components: Materialized views
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> We should extend the ALTER statement in case we want to change the rewriting 
> behavior of the materialized view after we have created it.
> {code:sql}
> ALTER MATERIALIZED VIEW [db_name.]materialized_view_name DISABLE REWRITE;
> {code}
> {code:sql}
> ALTER MATERIALIZED VIEW [db_name.]materialized_view_name ENABLE REWRITE;
> {code}




[jira] [Commented] (HIVE-18046) Metastore: default IS_REWRITE_ENABLED=false instead of NULL

2017-11-13 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250075#comment-16250075
 ] 

Sergey Shelukhin commented on HIVE-18046:
-

+1 pending tests

> Metastore: default IS_REWRITE_ENABLED=false instead of NULL
> ---
>
> Key: HIVE-18046
> URL: https://issues.apache.org/jira/browse/HIVE-18046
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views, Metastore
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
> Attachments: HIVE-18046.patch
>
>
> The materialized view impl breaks old metastore sql write access, by 
> complaining that the new table creation does not set this column up.
> {code}
>   `IS_REWRITE_ENABLED` bit(1) NOT NULL,
> {code}
> {{NOT NULL DEFAULT 0}} would allow old metastore direct sql compatibility 
> (not thrift).
> {code}
> 2017-11-09T07:11:58,331 ERROR [HiveServer2-Background-Pool: Thread-2354] 
> metastore.RetryingHMSHandler: Retrying HMSHandler after 2000 ms (attempt 1 of 
> 10) with error: javax.jdo.JDODataStoreException: Insert of object 
> "org.apache.hadoop.hive.metastore.model.MTable@249dbf1" using statement 
> "INSERT INTO `TBLS` 
> (`TBL_ID`,`CREATE_TIME`,`DB_ID`,`LAST_ACCESS_TIME`,`OWNER`,`RETENTION`,`SD_ID`,`TBL_NAME`,`TBL_TYPE`,`VIEW_EXPANDED_TEXT`,`VIEW_ORIGINAL_TEXT`)
>  VALUES (?,?,?,?,?,?,?,?,?,?,?)" failed : Field 'IS_REWRITE_ENABLED' doesn't 
> have a default value
> at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543)
> at 
> org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:720)
> at 
> org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:740)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.createTable(ObjectStore.java:1038)
> {code}
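
For illustration, the effect of the missing default can be reproduced outside Hive. This is a minimal sketch, not the metastore schema or the attached patch: it assumes Python's built-in sqlite3 module in place of MySQL, a cut-down stand-in for the TBLS table, and a hypothetical helper named insert_without_flag. An INSERT that omits IS_REWRITE_ENABLED, as an older client would issue, is rejected when the column is NOT NULL with no default and is accepted once DEFAULT 0 is declared.

```python
import sqlite3


def insert_without_flag(column_ddl):
    """Create a TBLS-like table (illustrative subset of columns only) and
    insert a row that does not mention IS_REWRITE_ENABLED.

    Returns the stored flag value on success, or a 'failed: ...' string
    when the database rejects the insert.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE TBLS ("
            "TBL_ID INTEGER PRIMARY KEY, "
            "TBL_NAME TEXT, "
            f"IS_REWRITE_ENABLED {column_ddl})"
        )
        # Old-style insert: the column list predates IS_REWRITE_ENABLED.
        conn.execute(
            "INSERT INTO TBLS (TBL_ID, TBL_NAME) VALUES (?, ?)", (1, "t1")
        )
        row = conn.execute(
            "SELECT IS_REWRITE_ENABLED FROM TBLS WHERE TBL_ID = 1"
        ).fetchone()
        return row[0]
    except sqlite3.IntegrityError as exc:
        return f"failed: {exc}"
    finally:
        conn.close()


# Column declared as in the reported schema: no default value.
strict = insert_without_flag("INTEGER NOT NULL")
# Column declared as suggested in the description: default of 0.
relaxed = insert_without_flag("INTEGER NOT NULL DEFAULT 0")

print(strict)
print(relaxed)
```

The exact error text differs between SQLite and MySQL; only the accept/reject behavior is meant to carry over.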




[jira] [Assigned] (HIVE-14499) Add HMS metrics for materialized views

2017-11-13 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned HIVE-14499:
--

Assignee: (was: Jesus Camacho Rodriguez)

> Add HMS metrics for materialized views
> --
>
> Key: HIVE-14499
> URL: https://issues.apache.org/jira/browse/HIVE-14499
> Project: Hive
>  Issue Type: Sub-task
>  Components: Materialized views
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>
> As in HIVE-10761/HIVE-12499.
> We should be able to show some metrics related to materialized views, such as 
> the number of materialized views, size of the materialized views, number of 
> accesses, etc.




[jira] [Assigned] (HIVE-14499) Add HMS metrics for materialized views

2017-11-13 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned HIVE-14499:
--

Assignee: Jesus Camacho Rodriguez

> Add HMS metrics for materialized views
> --
>
> Key: HIVE-14499
> URL: https://issues.apache.org/jira/browse/HIVE-14499
> Project: Hive
>  Issue Type: Sub-task
>  Components: Materialized views
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> As in HIVE-10761/HIVE-12499.
> We should be able to show some metrics related to materialized views, such as 
> the number of materialized views, size of the materialized views, number of 
> accesses, etc.




[jira] [Assigned] (HIVE-18053) Support different table types for MVs

2017-11-13 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned HIVE-18053:
--


> Support different table types for MVs
> -
>
> Key: HIVE-18053
> URL: https://issues.apache.org/jira/browse/HIVE-18053
> Project: Hive
>  Issue Type: Sub-task
>  Components: Materialized views
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
>
> MVs backed by MM tables, managed tables, external tables. This might work 
> already, but we need to add tests.




[jira] [Assigned] (HIVE-14495) Add SHOW MATERIALIZED VIEWS statement

2017-11-13 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned HIVE-14495:
--

Assignee: Jesus Camacho Rodriguez

> Add SHOW MATERIALIZED VIEWS statement
> -
>
> Key: HIVE-14495
> URL: https://issues.apache.org/jira/browse/HIVE-14495
> Project: Hive
>  Issue Type: Sub-task
>  Components: Materialized views
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-14495.patch
>
>
> In the spirit of {{SHOW TABLES}}, we should support the following statement:
> {code:sql}
> SHOW MATERIALIZED VIEWS [IN database_name] ['identifier_with_wildcards'];
> {code}
> In contrast to {{SHOW TABLES}}, this command would only list the materialized 
> views.
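
The intended filtering can be sketched as follows. This is a rough model only, assuming the new statement follows the {{SHOW TABLES}} wildcard convention ('*' for any characters, '|' to separate alternatives); the catalog dictionary, the table-type strings and both function names are hypothetical and are not taken from the patch.

```python
import re


def matches_show_pattern(name, pattern):
    """Return True if name matches a SHOW TABLES-style pattern.

    Assumed convention: '*' matches any run of characters and '|'
    separates alternatives; matching is case-insensitive.
    """
    for alternative in pattern.split("|"):
        # Escape literal pieces and join them with '.*' for each '*'.
        regex = ".*".join(re.escape(part) for part in alternative.split("*"))
        if re.fullmatch(regex, name, flags=re.IGNORECASE):
            return True
    return False


def show_materialized_views(catalog, pattern=None):
    """List only the materialized views in catalog, optionally filtered.

    catalog maps object name -> table type (illustrative strings).
    """
    views = sorted(
        name for name, kind in catalog.items() if kind == "MATERIALIZED_VIEW"
    )
    if pattern is None:
        return views
    return [name for name in views if matches_show_pattern(name, pattern)]


catalog = {
    "src": "MANAGED_TABLE",
    "mv_sales": "MATERIALIZED_VIEW",
    "mv_users": "MATERIALIZED_VIEW",
    "daily_agg": "MATERIALIZED_VIEW",
}

print(show_materialized_views(catalog))
print(show_materialized_views(catalog, "mv_*"))
print(show_materialized_views(catalog, "mv_sales|daily*"))
```

Unlike a {{SHOW TABLES}} listing, the plain table {{src}} never appears in the result, with or without a pattern.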




[jira] [Updated] (HIVE-14495) Add SHOW MATERIALIZED VIEWS statement

2017-11-13 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-14495:
---
Attachment: HIVE-14495.patch

> Add SHOW MATERIALIZED VIEWS statement
> -
>
> Key: HIVE-14495
> URL: https://issues.apache.org/jira/browse/HIVE-14495
> Project: Hive
>  Issue Type: Sub-task
>  Components: Materialized views
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-14495.patch
>
>
> In the spirit of {{SHOW TABLES}}, we should support the following statement:
> {code:sql}
> SHOW MATERIALIZED VIEWS [IN database_name] ['identifier_with_wildcards'];
> {code}
> In contrast to {{SHOW TABLES}}, this command would only list the materialized 
> views.




[jira] [Updated] (HIVE-14495) Add SHOW MATERIALIZED VIEWS statement

2017-11-13 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-14495:
---
Status: Patch Available  (was: In Progress)

> Add SHOW MATERIALIZED VIEWS statement
> -
>
> Key: HIVE-14495
> URL: https://issues.apache.org/jira/browse/HIVE-14495
> Project: Hive
>  Issue Type: Sub-task
>  Components: Materialized views
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-14495.patch
>
>
> In the spirit of {{SHOW TABLES}}, we should support the following statement:
> {code:sql}
> SHOW MATERIALIZED VIEWS [IN database_name] ['identifier_with_wildcards'];
> {code}
> In contrast to {{SHOW TABLES}}, this command would only list the materialized 
> views.




[jira] [Work started] (HIVE-14495) Add SHOW MATERIALIZED VIEWS statement

2017-11-13 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-14495 started by Jesus Camacho Rodriguez.
--
> Add SHOW MATERIALIZED VIEWS statement
> -
>
> Key: HIVE-14495
> URL: https://issues.apache.org/jira/browse/HIVE-14495
> Project: Hive
>  Issue Type: Sub-task
>  Components: Materialized views
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-14495.patch
>
>
> In the spirit of {{SHOW TABLES}}, we should support the following statement:
> {code:sql}
> SHOW MATERIALIZED VIEWS [IN database_name] ['identifier_with_wildcards'];
> {code}
> In contrast to {{SHOW TABLES}}, this command would only list the materialized 
> views.




[jira] [Commented] (HIVE-18052) Run p-tests on mm tables

2017-11-13 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16249997#comment-16249997
 ] 

Hive QA commented on HIVE-18052:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12897353/HIVE-18052.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 792 failed/errored test(s), 6865 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_custom_key2] (batchId=235)
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_custom_key] (batchId=235)
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] (batchId=235)
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_joins] (batchId=235)
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_predicate_pushdown] (batchId=235)
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries] (batchId=235)
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_single_sourced_multi_insert] (batchId=235)
org.apache.hadoop.hive.cli.TestBeeLineDriver.org.apache.hadoop.hive.cli.TestBeeLineDriver (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[buckets] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[create_like] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[ctas_blobstore_to_blobstore] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[ctas_blobstore_to_hdfs] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[ctas_hdfs_to_blobstore] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[explain] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[having] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_addpartition_blobstore_to_blobstore] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_addpartition_blobstore_to_local] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_addpartition_blobstore_to_warehouse] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_addpartition_local_to_blobstore] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_blobstore_to_blobstore] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_blobstore_to_blobstore_nonpart] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_blobstore_to_local] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_blobstore_to_warehouse] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_blobstore_to_warehouse_nonpart] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_local_to_blobstore] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_blobstore_to_blobstore] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_empty_into_blobstore] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_dynamic_partitions] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_table] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_directory] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_table] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[join2] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[join] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[load_data] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[map_join] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[map_join_on_filter] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[multiple_db] (batchId=246)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[nested_outer_join] (batchId=246)

[jira] [Commented] (HIVE-17714) move custom SerDe schema considerations into metastore from QL

2017-11-13 Thread Alan Gates (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16249975#comment-16249975
 ] 

Alan Gates commented on HIVE-17714:
---

[~sershe] are you suggesting that all calls to the metastore should rely on 
parsing the schema from the SerDe rather than looking up the column list in the 
metadata?  I would not be in favor of that.  That's going to slow down the 
metastore access times and make the code much more complicated.  If you are 
concerned about correctness, it is better to call the SerDe during data write 
time and confirm that the columns written match with the columns specified in 
the metadata (idea credit to [~owen.omalley]).
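
A write-time check of that kind could look roughly like the sketch below. It is only an illustration of the idea, not Hive code: the function name schema_mismatches is hypothetical, and columns are modeled as plain (name, type) pairs rather than FieldSchema or ObjectInspector objects.

```python
def schema_mismatches(metadata_cols, serde_cols):
    """Compare the column list stored in the metadata with the one the
    SerDe reports at write time.

    Both arguments are ordered lists of (name, type) pairs. Returns a
    list of human-readable problems; an empty list means they agree.
    """
    problems = []
    if len(metadata_cols) != len(serde_cols):
        problems.append(
            f"column count differs: metadata={len(metadata_cols)}, "
            f"serde={len(serde_cols)}"
        )
    # Compare position by position over the columns both sides have.
    for pos, (meta, serde) in enumerate(zip(metadata_cols, serde_cols)):
        if meta[0].lower() != serde[0].lower():
            problems.append(f"column {pos}: name {meta[0]!r} != {serde[0]!r}")
        elif meta[1].lower() != serde[1].lower():
            problems.append(
                f"column {pos} ({meta[0]}): type {meta[1]!r} != {serde[1]!r}"
            )
    return problems


stored = [("key", "int"), ("value", "string")]
from_serde = [("key", "bigint")]

for problem in schema_mismatches(stored, from_serde):
    print(problem)
```

The point of doing this on the write path is that reads keep using the stored column list and pay no SerDe cost.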

[~vihangk1]  I propose a couple of modifications to your proposal:

Item 2, we move Serializer, Deserializer, AbstractSerDe [and I suspect TypeInfo 
and ObjectInspector will have to come too] to a *new* module in storage-api.  
This avoids the need for ORC and any other storage format to pick it up.  I 
agree that serde implementations should not become part of the storage-api 
because they are still undergoing lots of development, and that will make the 
release cycle harder in Hive.  The Serializer et al. APIs are not changing much and 
thus moving them to the storage-api will have a minimal cost for Hive.

I also propose we add a new item 5:  Inside Hive, we work to move all of the 
SerDe implementations from exec to serde module.  We do not change what 
packages the classes are in, just move them into the existing serde module.  
This will result in a single module that the metastore (and anyone else who 
wants to use Hive serdes) can use without having to pick up all of Hive.  The 
standalone metastore still shouldn't directly depend on this serde module (that 
would make a mess of our release process) but users could easily pull it in at 
runtime.  

> move custom SerDe schema considerations into metastore from QL
> --
>
> Key: HIVE-17714
> URL: https://issues.apache.org/jira/browse/HIVE-17714
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Alan Gates
>
> Columns in metastore for tables that use external schema don't have the type 
> information (since HIVE-11985) and may be entirely inconsistent (since 
> forever, due to issues like HIVE-17713; or for SerDes that allow a URL for 
> the schema, due to a change in the underlying file).
> Currently, if you trace the usage of ConfVars.SERDESUSINGMETASTOREFORSCHEMA, 
> and to MetaStoreUtils.getFieldsFromDeserializer, you'd see that the code in 
> QL handles this in Hive. So, for the most part metastore just returns 
> whatever is stored for columns in the database.
> One exception appears to be get_fields_with_environment_context, which is 
> interesting... so getTable will return incorrect columns (potentially), but 
> get_fields/get_schema will return correct ones from SerDe as far as I can 
> tell.
> As part of separating the metastore, we should make sure all the APIs return 
> the correct schema for the columns; it's not a good idea to have everyone 
> reimplement getFieldsFromDeserializer.
> Note: this should also remove a flag introduced in HIVE-17731




[jira] [Updated] (HIVE-18046) Metastore: default IS_REWRITE_ENABLED=false instead of NULL

2017-11-13 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-18046:
---
Attachment: HIVE-18046.patch

> Metastore: default IS_REWRITE_ENABLED=false instead of NULL
> ---
>
> Key: HIVE-18046
> URL: https://issues.apache.org/jira/browse/HIVE-18046
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views, Metastore
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
> Attachments: HIVE-18046.patch
>
>
> The materialized view impl breaks old metastore sql write access, by 
> complaining that the new table creation does not set this column up.
> {code}
>   `IS_REWRITE_ENABLED` bit(1) NOT NULL,
> {code}
> {{NOT NULL DEFAULT 0}} would allow old metastore direct sql compatibility 
> (not thrift).
> {code}
> 2017-11-09T07:11:58,331 ERROR [HiveServer2-Background-Pool: Thread-2354] 
> metastore.RetryingHMSHandler: Retrying HMSHandler after 2000 ms (attempt 1 of 
> 10) with error: javax.jdo.JDODataStoreException: Insert of object 
> "org.apache.hadoop.hive.metastore.model.MTable@249dbf1" using statement 
> "INSERT INTO `TBLS` 
> (`TBL_ID`,`CREATE_TIME`,`DB_ID`,`LAST_ACCESS_TIME`,`OWNER`,`RETENTION`,`SD_ID`,`TBL_NAME`,`TBL_TYPE`,`VIEW_EXPANDED_TEXT`,`VIEW_ORIGINAL_TEXT`)
>  VALUES (?,?,?,?,?,?,?,?,?,?,?)" failed : Field 'IS_REWRITE_ENABLED' doesn't 
> have a default value
> at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543)
> at 
> org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:720)
> at 
> org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:740)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.createTable(ObjectStore.java:1038)
> {code}




[jira] [Updated] (HIVE-18046) Metastore: default IS_REWRITE_ENABLED=false instead of NULL

2017-11-13 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-18046:
---
Status: Patch Available  (was: In Progress)

> Metastore: default IS_REWRITE_ENABLED=false instead of NULL
> ---
>
> Key: HIVE-18046
> URL: https://issues.apache.org/jira/browse/HIVE-18046
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views, Metastore
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
> Attachments: HIVE-18046.patch
>
>
> The materialized view impl breaks old metastore sql write access, by 
> complaining that the new table creation does not set this column up.
> {code}
>   `IS_REWRITE_ENABLED` bit(1) NOT NULL,
> {code}
> {{NOT NULL DEFAULT 0}} would allow old metastore direct sql compatibility 
> (not thrift).
> {code}
> 2017-11-09T07:11:58,331 ERROR [HiveServer2-Background-Pool: Thread-2354] 
> metastore.RetryingHMSHandler: Retrying HMSHandler after 2000 ms (attempt 1 of 
> 10) with error: javax.jdo.JDODataStoreException: Insert of object 
> "org.apache.hadoop.hive.metastore.model.MTable@249dbf1" using statement 
> "INSERT INTO `TBLS` 
> (`TBL_ID`,`CREATE_TIME`,`DB_ID`,`LAST_ACCESS_TIME`,`OWNER`,`RETENTION`,`SD_ID`,`TBL_NAME`,`TBL_TYPE`,`VIEW_EXPANDED_TEXT`,`VIEW_ORIGINAL_TEXT`)
>  VALUES (?,?,?,?,?,?,?,?,?,?,?)" failed : Field 'IS_REWRITE_ENABLED' doesn't 
> have a default value
> at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543)
> at 
> org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:720)
> at 
> org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:740)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.createTable(ObjectStore.java:1038)
> {code}




[jira] [Work started] (HIVE-18046) Metastore: default IS_REWRITE_ENABLED=false instead of NULL

2017-11-13 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-18046 started by Jesus Camacho Rodriguez.
--
> Metastore: default IS_REWRITE_ENABLED=false instead of NULL
> ---
>
> Key: HIVE-18046
> URL: https://issues.apache.org/jira/browse/HIVE-18046
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views, Metastore
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
>
> The materialized view impl breaks old metastore sql write access, by 
> complaining that the new table creation does not set this column up.
> {code}
>   `IS_REWRITE_ENABLED` bit(1) NOT NULL,
> {code}
> {{NOT NULL DEFAULT 0}} would allow old metastore direct sql compatibility 
> (not thrift).
> {code}
> 2017-11-09T07:11:58,331 ERROR [HiveServer2-Background-Pool: Thread-2354] 
> metastore.RetryingHMSHandler: Retrying HMSHandler after 2000 ms (attempt 1 of 
> 10) with error: javax.jdo.JDODataStoreException: Insert of object 
> "org.apache.hadoop.hive.metastore.model.MTable@249dbf1" using statement 
> "INSERT INTO `TBLS` 
> (`TBL_ID`,`CREATE_TIME`,`DB_ID`,`LAST_ACCESS_TIME`,`OWNER`,`RETENTION`,`SD_ID`,`TBL_NAME`,`TBL_TYPE`,`VIEW_EXPANDED_TEXT`,`VIEW_ORIGINAL_TEXT`)
>  VALUES (?,?,?,?,?,?,?,?,?,?,?)" failed : Field 'IS_REWRITE_ENABLED' doesn't 
> have a default value
> at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543)
> at 
> org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:720)
> at 
> org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:740)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.createTable(ObjectStore.java:1038)
> {code}




[jira] [Comment Edited] (HIVE-18051) qfiles: dataset support

2017-11-13 Thread Vihang Karajgaonkar (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16249928#comment-16249928
 ] 

Vihang Karajgaonkar edited comment on HIVE-18051 at 11/13/17 6:19 PM:
--

Thanks [~kgyrtkirk] for reporting this. I think it's a great idea; the current 
setup of the qtest is pretty heavy and most of the tests don't really need all 
the set up tables. Having a lightweight setup where needed would help us 
reduce the execution time. We should be careful not to batch the heavy-setup 
qtests with the lighter ones, since that might overshadow the benefits.

[~pvary] has spent some time with the beeline driver and he may have some 
input on the stability of using beeline to run qtests.


was (Author: vihangk1):
Thanks [~kgyrtkirk] for reporting this. I think it's a great idea; the current 
setup of the qtest is pretty heavy and most of the tests don't really need all 
the set up tables. Having a lightweight setup where needed would help us 
reduce the execution time. We should be careful not to batch the heavy-setup 
qtests with the lighter ones, since that might overshadow the benefits.

> qfiles: dataset support
> ---
>
> Key: HIVE-18051
> URL: https://issues.apache.org/jira/browse/HIVE-18051
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Zoltan Haindrich
>
> It would be great to have some kind of test dataset support. Currently there 
> is the {{q_test_init.sql}}, which is quite large, and I often override it 
> with an invalid string because I write independent qtests most of the time, 
> and loading {{src}} and the other tables is just a waste of time for me; 
> not to mention that loading those tables may also trigger breakpoints, 
> which is a bit annoying.
> Most of the tests use "only" the {{src}} table and possibly 2 others; 
> however, the main init script contains a bunch of tables. Meanwhile, there are 
> quite a few other tests that could also benefit from a more general 
> feature; for example, the creation of {{bucket_small}} is present in 20 q 
> files.
> The proposal is to enable qfiles to be annotated with metadata such as 
> datasets:
> {code}
> --! qt:dataset:src,bucket_small
> {code}
> Proposal for storing a dataset:
> * the loader script would be at: {{data/datasets/__NAME__/load.hive.sql}}
> * the table data could be stored under that location
> A draft about this and other qfile-related ideas:
> https://docs.google.com/document/d/1KtcIx8ggL9LxDintFuJo8NQuvNWkmtvv_ekbWrTLNGc/edit?usp=sharing
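
To make the proposal concrete, here is a minimal sketch of how a test driver might read the annotation and locate the loader scripts. It is not the qtest implementation: the function names and the regular expression are hypothetical, and only the directive syntax and the {{data/datasets/__NAME__/load.hive.sql}} layout come from the description above.

```python
import re

# Matches a directive line such as: --! qt:dataset:src,bucket_small
DATASET_DIRECTIVE = re.compile(r"^--!\s*qt:dataset:(?P<names>[\w,]+)\s*$")


def datasets_for_qfile(qfile_text):
    """Collect dataset names requested by a qfile, in first-seen order."""
    names = []
    for line in qfile_text.splitlines():
        match = DATASET_DIRECTIVE.match(line.strip())
        if not match:
            continue
        for name in match.group("names").split(","):
            # Skip empty pieces (e.g. a trailing comma) and duplicates.
            if name and name not in names:
                names.append(name)
    return names


def loader_script(name):
    """Path of the loader script for a dataset, per the proposed layout."""
    return f"data/datasets/{name}/load.hive.sql"


example_qfile = (
    "--! qt:dataset:src,bucket_small\n"
    "select count(*) from src;\n"
)

scripts = [loader_script(name) for name in datasets_for_qfile(example_qfile)]
for script in scripts:
    print(script)
```

A qfile with no directive yields an empty list, so such a test would start without loading {{src}} or any other table.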



