[jira] [Commented] (HIVE-16731) Vectorization: Make "CASE WHEN (day_name='Sunday') THEN column1 ELSE null end" that involves a column name or expression THEN or ELSE vectorize

2017-06-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16051409#comment-16051409
 ] 

Hive QA commented on HIVE-16731:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12873212/HIVE-16731.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 10831 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query16] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query94] 
(batchId=232)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootstrapFunctionReplication
 (batchId=216)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionIncrementalReplication
 (batchId=216)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionWithFunctionBinaryJarsOnHDFS
 (batchId=216)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5657/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5657/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5657/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12873212 - PreCommit-HIVE-Build

> Vectorization: Make "CASE WHEN (day_name='Sunday') THEN column1 ELSE null 
> end" that involves a column name or expression THEN or ELSE vectorize
> ---
>
> Key: HIVE-16731
> URL: https://issues.apache.org/jira/browse/HIVE-16731
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Teddy Choi
>Priority: Critical
> Attachments: HIVE-16731.1.patch, HIVE-16731.2.patch, 
> HIVE-16731.3.patch
>
>
> Currently, CASE WHEN statements like that become VectorUDFAdaptor expressions.
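
As a rough illustration of what evaluating such an expression over a whole batch could look like, here is a minimal sketch. It is not Hive's implementation and does not use Hive's vectorization classes; the class name, the `LongBatchColumn` type, and the method name are all hypothetical. It only shows the idea of producing a value array plus a null-flag array for the batch instead of evaluating the CASE row by row through an adaptor.

```java
// Hypothetical sketch only: plain arrays stand in for a batch of column values.
class CaseWhenBatchSketch {

    // Stand-in for a batch output column: values plus per-row null flags.
    static final class LongBatchColumn {
        final long[] vector;
        final boolean[] isNull;

        LongBatchColumn(int size) {
            this.vector = new long[size];
            this.isNull = new boolean[size];
        }
    }

    // Evaluates CASE WHEN (dayName = match) THEN column1 ELSE null END for every row in the batch.
    static LongBatchColumn caseWhenElseNull(String[] dayName, String match, long[] column1) {
        int n = dayName.length;
        LongBatchColumn out = new LongBatchColumn(n);
        for (int i = 0; i < n; i++) {
            if (match.equals(dayName[i])) {
                out.vector[i] = column1[i];   // THEN branch: copy the column value
            } else {
                out.isNull[i] = true;         // ELSE branch: mark the row as null
            }
        }
        return out;
    }
}
```

The tight loop over arrays is the point of the sketch: the per-row branch writes either a value or a null flag, with no per-row object allocation.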



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16886) HMS log notifications may have duplicated event IDs if multiple HMS are running concurrently

2017-06-15 Thread Alexander Kolbasov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16051394#comment-16051394
 ] 

Alexander Kolbasov commented on HIVE-16886:
---

For DataNucleus this is orthogonal to transaction isolation level and can be 
enabled for specific transactions. 

> HMS log notifications may have duplicated event IDs if multiple HMS are 
> running concurrently
> 
>
> Key: HIVE-16886
> URL: https://issues.apache.org/jira/browse/HIVE-16886
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Metastore
>Reporter: Sergio Peña
>
> When running multiple Hive Metastore servers and DB notifications are 
> enabled, I could see that notifications can be persisted with a duplicated 
> event ID. 
> This does not happen when running multiple threads in a single HMS node due 
> to the locking acquired on the DbNotificationsLog class, but multiple HMS 
> could cause conflicts.
> The issue is in the ObjectStore#addNotificationEvent() method. The event ID 
> fetched from the datastore is used for the new notification, incremented in 
> the server itself, then persisted or updated back to the datastore. If 2 
> servers read the same ID, then these 2 servers write a new notification with 
> the same ID.
> The event ID is not unique nor a primary key.
> Here's a test case using the TestObjectStore class that confirms this issue:
> {noformat}
> @Test
>   public void testConcurrentAddNotifications() throws ExecutionException, 
> InterruptedException {
> final int NUM_THREADS = 2;
> CountDownLatch countIn = new CountDownLatch(NUM_THREADS);
> CountDownLatch countOut = new CountDownLatch(1);
> HiveConf conf = new HiveConf();
> conf.setVar(HiveConf.ConfVars.METASTORE_EXPRESSION_PROXY_CLASS, 
> MockPartitionExpressionProxy.class.getName());
> ExecutorService executorService = 
> Executors.newFixedThreadPool(NUM_THREADS);
> FutureTask tasks[] = new FutureTask[NUM_THREADS];
> for (int i=0; i<NUM_THREADS; ++i) {
>   final int n = i;
>   tasks[i] = new FutureTask(new Callable() {
> @Override
> public Void call() throws Exception {
>   ObjectStore store = new ObjectStore();
>   store.setConf(conf);
>   NotificationEvent dbEvent =
>   new NotificationEvent(0, 0, 
> EventMessage.EventType.CREATE_DATABASE.toString(), "CREATE DATABASE DB" + n);
>   System.out.println("ADDING NOTIFICATION");
>   countIn.countDown();
>   countOut.await();
>   store.addNotificationEvent(dbEvent);
>   System.out.println("FINISH NOTIFICATION");
>   return null;
> }
>   });
>   executorService.execute(tasks[i]);
> }
> countIn.await();
> countOut.countDown();
> for (int i = 0; i < NUM_THREADS; ++i) {
>   tasks[i].get();
> }
> NotificationEventResponse eventResponse = 
> objectStore.getNextNotification(new NotificationEventRequest());
> Assert.assertEquals(2, eventResponse.getEventsSize());
> Assert.assertEquals(1, eventResponse.getEvents().get(0).getEventId());
> // This fails because the next notification has an event ID = 1
> Assert.assertEquals(2, eventResponse.getEvents().get(1).getEventId());
>   }
> {noformat}
> The last assertion fails because the second event has ID 1 instead of the expected 2. 
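
The read, increment, write-back sequence described above, and the conditional-update idea behind the optimistic approach discussed in this thread, can be sketched as follows. This is a simplified in-memory model, not ObjectStore or DataNucleus code: `SequenceStore`, `compareAndWrite`, and the other names are hypothetical, and the two "servers" are simulated as a fixed interleaving so that the outcome is deterministic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: models two servers allocating event IDs from one shared stored value.
class EventIdSketch {

    // In-memory stand-in for the stored "next event ID" value.
    static final class SequenceStore {
        private final AtomicLong nextId = new AtomicLong(1);

        long read() {
            return nextId.get();
        }

        void write(long value) {
            nextId.set(value);
        }

        // Conditional update: succeeds only if the stored value is still what the caller read.
        boolean compareAndWrite(long expected, long updated) {
            return nextId.compareAndSet(expected, updated);
        }
    }

    // Both servers read before either writes, then write back unconditionally.
    static List<Long> unsafeInterleaving(SequenceStore store) {
        long a = store.read();
        long b = store.read();
        store.write(a + 1);
        store.write(b + 1);
        List<Long> ids = new ArrayList<>();
        ids.add(a);
        ids.add(b);
        return ids;
    }

    // Allocates an ID by retrying until the conditional write succeeds.
    static long allocate(SequenceStore store) {
        while (true) {
            long current = store.read();
            if (store.compareAndWrite(current, current + 1)) {
                return current;
            }
        }
    }

    // Same interleaving, but a stale conditional write is rejected and the server retries.
    static List<Long> optimisticInterleaving(SequenceStore store) {
        long a = store.read();
        long b = store.read();
        List<Long> ids = new ArrayList<>();
        if (!store.compareAndWrite(a, a + 1)) {
            a = allocate(store);
        }
        ids.add(a);
        if (!store.compareAndWrite(b, b + 1)) {
            b = allocate(store);   // server B read a stale value, so it retries
        }
        ids.add(b);
        return ids;
    }
}
```

With the unconditional write both servers end up using the same ID; with the conditional write the second server detects the stale read and takes the next ID instead.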





[jira] [Commented] (HIVE-16886) HMS log notifications may have duplicated event IDs if multiple HMS are running concurrently

2017-06-15 Thread anishek (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16051390#comment-16051390
 ] 

anishek commented on HIVE-16886:


I am not sure which transaction isolation level, across the SQL databases 
that the metastore supports, translates to optimistic transactions. I have 
used this with Hibernate versioning and it worked very well. I saw this in 
DataNucleus as well, so I think it will require some work to identify how well 
it fits our use case. 
 



[jira] [Updated] (HIVE-16785) Ensure replication actions are idempotent if any series of events are applied again.

2017-06-15 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16785:

Status: Patch Available  (was: Open)

> Ensure replication actions are idempotent if any series of events are applied 
> again.
> 
>
> Key: HIVE-16785
> URL: https://issues.apache.org/jira/browse/HIVE-16785
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-16785.01.patch, HIVE-16785.02.patch
>
>
> Some of the events (ALTER, RENAME, TRUNCATE) are not idempotent and hence 
> lead to failure of REPL LOAD if applied twice or applied to an object that 
> is newer than the current event. For example, a TRUNCATE applied to a table 
> that has already been dropped fails instead of being a no-op.
> Also, we need to consider the scenario where the object is missing while 
> applying an event. For example, if a RENAME_TABLE event is applied on a 
> target where the old table is missing, we should validate whether the table 
> should be recreated or the event should be treated as a no-op. This can be 
> done by verifying the DB-level last repl ID against the current event ID.
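
The check described in the last sentence can be sketched roughly as below. This is only an illustration under stated assumptions, not the patch attached to this issue: the class, the per-database map, and the `apply` method are hypothetical, and the actual event handling is reduced to a counter.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: skip events that are not newer than the last one applied to the database.
class IdempotentReplaySketch {

    // Assumed bookkeeping: last replicated event ID recorded per target database.
    private final Map<String, Long> lastReplIdByDb = new HashMap<>();
    private int appliedCount = 0;

    // Returns true if the event was applied, false if it was treated as a no-op.
    boolean apply(String dbName, long eventId) {
        long lastReplId = lastReplIdByDb.getOrDefault(dbName, 0L);
        if (eventId <= lastReplId) {
            return false;   // already covered by an earlier load: do nothing
        }
        appliedCount++;     // stand-in for performing the ALTER / RENAME / TRUNCATE
        lastReplIdByDb.put(dbName, eventId);
        return true;
    }

    int appliedCount() {
        return appliedCount;
    }
}
```

Replaying the same series of events a second time then changes nothing, because every replayed event ID is less than or equal to the recorded one.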





[jira] [Commented] (HIVE-16731) Vectorization: Make "CASE WHEN (day_name='Sunday') THEN column1 ELSE null end" that involves a column name or expression THEN or ELSE vectorize

2017-06-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16051370#comment-16051370
 ] 

Hive QA commented on HIVE-16731:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12873211/HIVE-16731.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 10817 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed]
 (batchId=237)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query16] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query94] 
(batchId=232)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=101)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testExchangePartition 
(batchId=216)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootstrapFunctionReplication
 (batchId=216)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionIncrementalReplication
 (batchId=216)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionWithFunctionBinaryJarsOnHDFS
 (batchId=216)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel 
(batchId=220)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5656/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5656/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5656/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 16 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12873211 - PreCommit-HIVE-Build



[jira] [Commented] (HIVE-16876) HoS: Make Rpc configs immutable at runtime

2017-06-15 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16051369#comment-16051369
 ] 

Lefty Leverenz commented on HIVE-16876:
---

Doc note:  This adds eight hive.spark.client.* configs to the defaults for 
*hive.conf.restricted.list* so the wiki needs to be updated for release 3.0.0.

* [Configuration Properties -- hive.conf.restricted.list | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.conf.restricted.list]

Added a TODOC3.0 label.

> HoS: Make Rpc configs immutable at runtime
> --
>
> Key: HIVE-16876
> URL: https://issues.apache.org/jira/browse/HIVE-16876
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-16876.1.patch
>
>






[jira] [Updated] (HIVE-16876) HoS: Make Rpc configs immutable at runtime

2017-06-15 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-16876:
--
Labels: TODOC3.0  (was: )



[jira] [Updated] (HIVE-16785) Ensure replication actions are idempotent if any series of events are applied again.

2017-06-15 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16785:

Status: Open  (was: Patch Available)



[jira] [Updated] (HIVE-16731) Vectorization: Make "CASE WHEN (day_name='Sunday') THEN column1 ELSE null end" that involves a column name or expression THEN or ELSE vectorize

2017-06-15 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-16731:
--
Attachment: HIVE-16731.3.patch

This third patch adds the Apache Software License header to the new classes.



[jira] [Updated] (HIVE-16731) Vectorization: Make "CASE WHEN (day_name='Sunday') THEN column1 ELSE null end" that involves a column name or expression THEN or ELSE vectorize

2017-06-15 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-16731:
--
Attachment: HIVE-16731.2.patch

This second patch includes the changes in test output files.

> Vectorization: Make "CASE WHEN (day_name='Sunday') THEN column1 ELSE null 
> end" that involves a column name or expression THEN or ELSE vectorize
> ---
>
> Key: HIVE-16731
> URL: https://issues.apache.org/jira/browse/HIVE-16731
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Teddy Choi
>Priority: Critical
> Attachments: HIVE-16731.1.patch, HIVE-16731.2.patch
>
>
> Currently, CASE WHEN statements like that become VectorUDFAdaptor expressions.





[jira] [Updated] (HIVE-16876) HoS: Make Rpc configs immutable at runtime

2017-06-15 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-16876:
--
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks Xuefu!

> HoS: Make Rpc configs immutable at runtime
> --
>
> Key: HIVE-16876
> URL: https://issues.apache.org/jira/browse/HIVE-16876
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Fix For: 3.0.0
>
> Attachments: HIVE-16876.1.patch
>
>






[jira] [Updated] (HIVE-16876) HoS: Make Rpc configs immutable at runtime

2017-06-15 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-16876:
--
Summary: HoS: Make Rpc configs immutable at runtime  (was: RpcServer should 
be re-created when Rpc configs change)

> HoS: Make Rpc configs immutable at runtime
> --
>
> Key: HIVE-16876
> URL: https://issues.apache.org/jira/browse/HIVE-16876
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-16876.1.patch
>
>






[jira] [Commented] (HIVE-16886) HMS log notifications may have duplicated event IDs if multiple HMS are running concurrently

2017-06-15 Thread Alexander Kolbasov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16051154#comment-16051154
 ] 

Alexander Kolbasov commented on HIVE-16886:
---

This is used in connection with optimistic transactions. Are they used by 
ObjectStore? I was investigating the use of application-based versioning a 
while ago but couldn't find a single example of how it could be used.



[jira] [Commented] (HIVE-16911) Upgrade groovy version to 2.4.11

2017-06-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16051122#comment-16051122
 ] 

Hive QA commented on HIVE-16911:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12873171/HIVE-16911.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 10825 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]
 (batchId=237)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query16] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query94] 
(batchId=232)
org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver
 (batchId=239)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootstrapFunctionReplication
 (batchId=216)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionIncrementalReplication
 (batchId=216)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionWithFunctionBinaryJarsOnHDFS
 (batchId=216)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5655/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5655/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5655/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12873171 - PreCommit-HIVE-Build

> Upgrade groovy version to 2.4.11
> 
>
> Key: HIVE-16911
> URL: https://issues.apache.org/jira/browse/HIVE-16911
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-16911.1.patch
>
>
> Hive currently uses groovy 2.4.4, which has a security issue 
> (https://access.redhat.com/security/cve/cve-2016-6814). We need to upgrade to 
> 2.4.8 or later. 





[jira] [Assigned] (HIVE-16912) Improve table validator's performance against Oracle

2017-06-15 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam reassigned HIVE-16912:



> Improve table validator's performance against Oracle
> 
>
> Key: HIVE-16912
> URL: https://issues.apache.org/jira/browse/HIVE-16912
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Minor
>
> Currently, this validator uses DatabaseMetaData.getTables(), which takes on 
> the order of minutes to return because of the number of SYSTEM tables present 
> in Oracle.
> Providing a schema name via a system property would limit the number of 
> tables being returned and thus improve performance.
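
A minimal sketch of the proposal, assuming a made-up property name (`example.validator.schema`) and a hypothetical helper class; the actual property and code in Hive may differ. It passes the configured schema as the `schemaPattern` argument of the standard JDBC `DatabaseMetaData.getTables()` call, and falls back to `null` (no schema filter) when the property is unset.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: narrow the metadata lookup to one schema when a system property is set.
class SchemaScopedTableLister {

    // Property name invented for this sketch only.
    static final String SCHEMA_PROPERTY = "example.validator.schema";

    // Returns the configured schema pattern, or null (no schema filter) when unset or blank.
    static String schemaPattern() {
        String value = System.getProperty(SCHEMA_PROPERTY);
        return (value == null || value.trim().isEmpty()) ? null : value.trim();
    }

    // Lists table names, restricted to the configured schema if one is given.
    static List<String> listTables(Connection conn) throws SQLException {
        DatabaseMetaData metaData = conn.getMetaData();
        List<String> names = new ArrayList<>();
        try (ResultSet rs = metaData.getTables(null, schemaPattern(), "%", new String[] {"TABLE"})) {
            while (rs.next()) {
                names.add(rs.getString("TABLE_NAME"));
            }
        }
        return names;
    }
}
```

It would be run with something like `-Dexample.validator.schema=HIVE` on the JVM command line, so that the driver only has to enumerate tables in that one schema.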





[jira] [Commented] (HIVE-5988) document hive audit log

2017-06-15 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16051083#comment-16051083
 ] 

Lefty Leverenz commented on HIVE-5988:
--

[~thejas], would you please review the doc and close this jira?  (Or suggest 
improvements.)

* [Getting Started -- Audit Logs | 
https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-AuditLogs]

> document hive audit log
> ---
>
> Key: HIVE-5988
> URL: https://issues.apache.org/jira/browse/HIVE-5988
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation, Logging
>Reporter: Thejas M Nair
>Assignee: Lefty Leverenz
>
> See HIVE-1948, HIVE-3505 .
> Audit logs are written by the Hive metastore server for every metastore API 
> invocation. Each entry records the function and some of the relevant 
> function arguments in the metastore log file. 
> Entries are logged at the log4j INFO level, so you need to make sure that 
> logging at INFO level is enabled.
> The name of the log entry is "HiveMetaStore.audit".





[jira] [Assigned] (HIVE-5988) document hive audit log

2017-06-15 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz reassigned HIVE-5988:


Assignee: Lefty Leverenz



[jira] [Commented] (HIVE-16849) Upgrade jetty version to 9.4.6.v20170531

2017-06-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16051060#comment-16051060
 ] 

Hive QA commented on HIVE-16849:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12873166/HIVE-16849.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 10831 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed]
 (batchId=237)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite]
 (batchId=237)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query16] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query94] 
(batchId=232)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootstrapFunctionReplication
 (batchId=216)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionIncrementalReplication
 (batchId=216)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionWithFunctionBinaryJarsOnHDFS
 (batchId=216)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5654/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5654/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5654/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12873166 - PreCommit-HIVE-Build

> Upgrade jetty version to 9.4.6.v20170531
> 
>
> Key: HIVE-16849
> URL: https://issues.apache.org/jira/browse/HIVE-16849
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-16849.1.patch
>
>
> From HIVE-16846: the test case TestJdbcWithMiniHS2#testHttpHeaderSize gets 
> HTTP error code 413 (PAYLOAD_TOO_LARGE_413) rather than 431 
> (REQUEST_HEADER_FIELDS_TOO_LARGE_431), although 431 seems more accurate. The 
> newer version of Jetty fixes this.
> {noformat}
> // This should fail with given HTTP response code 413 in error message,
> // since header is more than the configured header size
> userName = StringUtils.leftPad("*", 2000);
> try {
>   conn = getConnection(miniHS2.getJdbcURL(testDbName), userName, "password");
> } catch (Exception e) {
>   assertTrue("Header exception thrown", e != null);
>   assertTrue(e.getMessage().contains("HTTP Response code: 413"));
> } finally {
>   if (conn != null) {
>     conn.close();
>   }
> }
> {noformat}
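
As an illustration only: an assertion that has to hold across the Jetty upgrade could accept either status code. The helper below is a sketch under that assumption. `HttpCodeCheck` and `mentionsAnyCode` are hypothetical names that do not appear in the Hive test, and the message format is copied from the assertion quoted above.

```java
// Hypothetical helper; not part of TestJdbcWithMiniHS2 or the attached patch.
class HttpCodeCheck {
  static final int PAYLOAD_TOO_LARGE = 413;
  static final int REQUEST_HEADER_FIELDS_TOO_LARGE = 431;

  // Returns true if the error message reports any of the given HTTP response codes.
  static boolean mentionsAnyCode(String message, int... codes) {
    if (message == null) {
      return false;
    }
    for (int code : codes) {
      if (message.contains("HTTP Response code: " + code)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    String fromOldJetty = "HTTP Response code: 413";
    String fromNewJetty = "HTTP Response code: 431";
    System.out.println(mentionsAnyCode(fromOldJetty,
        PAYLOAD_TOO_LARGE, REQUEST_HEADER_FIELDS_TOO_LARGE));
    System.out.println(mentionsAnyCode(fromNewJetty,
        PAYLOAD_TOO_LARGE, REQUEST_HEADER_FIELDS_TOO_LARGE));
  }
}
```

Whether the real test should accept both codes or only 431 after the upgrade is a decision for the patch author; the sketch only shows the mechanics.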





[jira] [Commented] (HIVE-16910) RpcConfiguration - Improper Cast From Long To Int

2017-06-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16050998#comment-16050998
 ] 

Hive QA commented on HIVE-16910:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12873164/HIVE-16910.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 10817 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed]
 (batchId=237)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query16] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query94] 
(batchId=232)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=105)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootstrapFunctionReplication
 (batchId=216)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionIncrementalReplication
 (batchId=216)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionWithFunctionBinaryJarsOnHDFS
 (batchId=216)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5653/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5653/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5653/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12873164 - PreCommit-HIVE-Build

> RpcConfiguration - Improper Cast From Long To Int
> -
>
> Key: HIVE-16910
> URL: https://issues.apache.org/jira/browse/HIVE-16910
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 2.1.1, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16910.1.patch
>
>
> {code:title=org.apache.hive.spark.client.rpc.RpcConfiguration}
>   long getConnectTimeoutMs() {
>     String value = config.get(HiveConf.ConfVars.SPARK_RPC_CLIENT_CONNECT_TIMEOUT.varname);
>     return value != null ? Integer.parseInt(value) : DEFAULT_CONF.getTimeVar(
>       HiveConf.ConfVars.SPARK_RPC_CLIENT_CONNECT_TIMEOUT, TimeUnit.MILLISECONDS);
>   }
> {code}
> This method returns a *long*, but it parses the configuration string into an 
> *int* with {{Integer.parseInt}}.
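
A minimal sketch of the mismatch described above, assuming a standalone helper. The class name `TimeoutParsing` and the methods `parseTimeoutMs` and `rejectedByParseInt` are hypothetical and are not taken from the attached patch. The sketch shows that `Integer.parseInt` rejects a millisecond value above `Integer.MAX_VALUE`, while `Long.parseLong` accepts it and matches the `long` return type.

```java
// Hypothetical illustration; not the code from HIVE-16910.1.patch.
class TimeoutParsing {

  // Parses a configured timeout as a long, falling back to a default when unset.
  static long parseTimeoutMs(String value, long defaultMs) {
    return value != null ? Long.parseLong(value) : defaultMs;
  }

  // Returns true if Integer.parseInt rejects the given string.
  static boolean rejectedByParseInt(String value) {
    try {
      Integer.parseInt(value);
      return false;
    } catch (NumberFormatException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    String big = "3000000000"; // larger than Integer.MAX_VALUE (2147483647)
    System.out.println(rejectedByParseInt(big));
    System.out.println(parseTimeoutMs(big, 1000L));
    System.out.println(parseTimeoutMs(null, 1000L));
  }
}
```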





[jira] [Updated] (HIVE-16911) Upgrade groovy version to 2.4.11

2017-06-15 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-16911:

Status: Patch Available  (was: Open)

> Upgrade groovy version to 2.4.11
> 
>
> Key: HIVE-16911
> URL: https://issues.apache.org/jira/browse/HIVE-16911
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-16911.1.patch
>
>
> Hive currently uses Groovy 2.4.4, which has a security issue 
> (https://access.redhat.com/security/cve/cve-2016-6814). We need to upgrade to 
> 2.4.8 or later. 





[jira] [Updated] (HIVE-16911) Upgrade groovy version to 2.4.11

2017-06-15 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-16911:

Attachment: HIVE-16911.1.patch

> Upgrade groovy version to 2.4.11
> 
>
> Key: HIVE-16911
> URL: https://issues.apache.org/jira/browse/HIVE-16911
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-16911.1.patch
>
>
> Hive currently uses Groovy 2.4.4, which has a security issue 
> (https://access.redhat.com/security/cve/cve-2016-6814). We need to upgrade to 
> 2.4.8 or later. 





[jira] [Assigned] (HIVE-16911) Upgrade groovy version to 2.4.11

2017-06-15 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu reassigned HIVE-16911:
---


> Upgrade groovy version to 2.4.11
> 
>
> Key: HIVE-16911
> URL: https://issues.apache.org/jira/browse/HIVE-16911
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>
> Hive currently uses Groovy 2.4.4, which has a security issue 
> (https://access.redhat.com/security/cve/cve-2016-6814). We need to upgrade to 
> 2.4.8 or later. 





[jira] [Updated] (HIVE-16849) Upgrade jetty version to 9.4.6.v20170531

2017-06-15 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-16849:

Status: Patch Available  (was: In Progress)

> Upgrade jetty version to 9.4.6.v20170531
> 
>
> Key: HIVE-16849
> URL: https://issues.apache.org/jira/browse/HIVE-16849
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-16849.1.patch
>
>
> From HIVE-16846: the test case TestJdbcWithMiniHS2#testHttpHeaderSize gets 
> HTTP error code 413 (PAYLOAD_TOO_LARGE_413) rather than 431 
> (REQUEST_HEADER_FIELDS_TOO_LARGE_431), although 431 seems more accurate. The 
> newer version of Jetty fixes this.
> {noformat}
> // This should fail with given HTTP response code 413 in error message,
> // since header is more than the configured header size
> userName = StringUtils.leftPad("*", 2000);
> try {
>   conn = getConnection(miniHS2.getJdbcURL(testDbName), userName, "password");
> } catch (Exception e) {
>   assertTrue("Header exception thrown", e != null);
>   assertTrue(e.getMessage().contains("HTTP Response code: 413"));
> } finally {
>   if (conn != null) {
>     conn.close();
>   }
> }
> {noformat}





[jira] [Updated] (HIVE-16849) Upgrade jetty version to 9.4.6.v20170531

2017-06-15 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-16849:

Attachment: (was: HIVE-16849.1.patch)

> Upgrade jetty version to 9.4.6.v20170531
> 
>
> Key: HIVE-16849
> URL: https://issues.apache.org/jira/browse/HIVE-16849
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-16849.1.patch
>
>
> From HIVE-16846: the test case TestJdbcWithMiniHS2#testHttpHeaderSize gets 
> HTTP error code 413 (PAYLOAD_TOO_LARGE_413) rather than 431 
> (REQUEST_HEADER_FIELDS_TOO_LARGE_431), although 431 seems more accurate. The 
> newer version of Jetty fixes this.
> {noformat}
> // This should fail with given HTTP response code 413 in error message,
> // since header is more than the configured header size
> userName = StringUtils.leftPad("*", 2000);
> try {
>   conn = getConnection(miniHS2.getJdbcURL(testDbName), userName, "password");
> } catch (Exception e) {
>   assertTrue("Header exception thrown", e != null);
>   assertTrue(e.getMessage().contains("HTTP Response code: 413"));
> } finally {
>   if (conn != null) {
>     conn.close();
>   }
> }
> {noformat}





[jira] [Updated] (HIVE-16849) Upgrade jetty version to 9.4.6.v20170531

2017-06-15 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-16849:

Attachment: HIVE-16849.1.patch

> Upgrade jetty version to 9.4.6.v20170531
> 
>
> Key: HIVE-16849
> URL: https://issues.apache.org/jira/browse/HIVE-16849
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-16849.1.patch
>
>
> From HIVE-16846: the test case TestJdbcWithMiniHS2#testHttpHeaderSize gets 
> HTTP error code 413 (PAYLOAD_TOO_LARGE_413) rather than 431 
> (REQUEST_HEADER_FIELDS_TOO_LARGE_431), although 431 seems more accurate. The 
> newer version of Jetty fixes this.
> {noformat}
> // This should fail with given HTTP response code 413 in error message,
> // since header is more than the configured header size
> userName = StringUtils.leftPad("*", 2000);
> try {
>   conn = getConnection(miniHS2.getJdbcURL(testDbName), userName, "password");
> } catch (Exception e) {
>   assertTrue("Header exception thrown", e != null);
>   assertTrue(e.getMessage().contains("HTTP Response code: 413"));
> } finally {
>   if (conn != null) {
>     conn.close();
>   }
> }
> {noformat}





[jira] [Updated] (HIVE-16849) Upgrade jetty version to 9.4.6.v20170531

2017-06-15 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-16849:

Status: In Progress  (was: Patch Available)

> Upgrade jetty version to 9.4.6.v20170531
> 
>
> Key: HIVE-16849
> URL: https://issues.apache.org/jira/browse/HIVE-16849
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-16849.1.patch
>
>
> From HIVE-16846: the test case TestJdbcWithMiniHS2#testHttpHeaderSize gets 
> HTTP error code 413 (PAYLOAD_TOO_LARGE_413) rather than 431 
> (REQUEST_HEADER_FIELDS_TOO_LARGE_431), although 431 seems more accurate. The 
> newer version of Jetty fixes this.
> {noformat}
> // This should fail with given HTTP response code 413 in error message,
> // since header is more than the configured header size
> userName = StringUtils.leftPad("*", 2000);
> try {
>   conn = getConnection(miniHS2.getJdbcURL(testDbName), userName, "password");
> } catch (Exception e) {
>   assertTrue("Header exception thrown", e != null);
>   assertTrue(e.getMessage().contains("HTTP Response code: 413"));
> } finally {
>   if (conn != null) {
>     conn.close();
>   }
> }
> {noformat}





[jira] [Assigned] (HIVE-16910) RpcConfiguration - Improper Cast From Long To Int

2017-06-15 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR reassigned HIVE-16910:
--

Assignee: BELUGA BEHR

> RpcConfiguration - Improper Cast From Long To Int
> -
>
> Key: HIVE-16910
> URL: https://issues.apache.org/jira/browse/HIVE-16910
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 2.1.1, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16910.1.patch
>
>
> {code:title=org.apache.hive.spark.client.rpc.RpcConfiguration}
>   long getConnectTimeoutMs() {
>     String value = config.get(HiveConf.ConfVars.SPARK_RPC_CLIENT_CONNECT_TIMEOUT.varname);
>     return value != null ? Integer.parseInt(value) : DEFAULT_CONF.getTimeVar(
>       HiveConf.ConfVars.SPARK_RPC_CLIENT_CONNECT_TIMEOUT, TimeUnit.MILLISECONDS);
>   }
> {code}
> This method returns a *long*, but it parses the configuration string into an 
> *int* with {{Integer.parseInt}}.





[jira] [Updated] (HIVE-16910) RpcConfiguration - Improper Cast From Long To Int

2017-06-15 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-16910:
---
Attachment: HIVE-16910.1.patch

> RpcConfiguration - Improper Cast From Long To Int
> -
>
> Key: HIVE-16910
> URL: https://issues.apache.org/jira/browse/HIVE-16910
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 2.1.1, 3.0.0
>Reporter: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16910.1.patch
>
>
> {code:title=org.apache.hive.spark.client.rpc.RpcConfiguration}
>   long getConnectTimeoutMs() {
>     String value = config.get(HiveConf.ConfVars.SPARK_RPC_CLIENT_CONNECT_TIMEOUT.varname);
>     return value != null ? Integer.parseInt(value) : DEFAULT_CONF.getTimeVar(
>       HiveConf.ConfVars.SPARK_RPC_CLIENT_CONNECT_TIMEOUT, TimeUnit.MILLISECONDS);
>   }
> {code}
> This method returns a *long*, but it parses the configuration string into an 
> *int* with {{Integer.parseInt}}.





[jira] [Updated] (HIVE-16910) RpcConfiguration - Improper Cast From Long To Int

2017-06-15 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-16910:
---
Status: Patch Available  (was: Open)

> RpcConfiguration - Improper Cast From Long To Int
> -
>
> Key: HIVE-16910
> URL: https://issues.apache.org/jira/browse/HIVE-16910
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 2.1.1, 3.0.0
>Reporter: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16910.1.patch
>
>
> {code:title=org.apache.hive.spark.client.rpc.RpcConfiguration}
>   long getConnectTimeoutMs() {
>     String value = config.get(HiveConf.ConfVars.SPARK_RPC_CLIENT_CONNECT_TIMEOUT.varname);
>     return value != null ? Integer.parseInt(value) : DEFAULT_CONF.getTimeVar(
>       HiveConf.ConfVars.SPARK_RPC_CLIENT_CONNECT_TIMEOUT, TimeUnit.MILLISECONDS);
>   }
> {code}
> This method returns a *long*, but it parses the configuration string into an 
> *int* with {{Integer.parseInt}}.





[jira] [Commented] (HIVE-16785) Ensure replication actions are idempotent if any series of events are applied again.

2017-06-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16050816#comment-16050816
 ] 

Hive QA commented on HIVE-16785:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12873126/HIVE-16785.02.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 10833 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite]
 (batchId=237)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=140)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query16] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query94] 
(batchId=232)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootstrapFunctionReplication
 (batchId=216)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionIncrementalReplication
 (batchId=216)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionWithFunctionBinaryJarsOnHDFS
 (batchId=216)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5652/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5652/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5652/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12873126 - PreCommit-HIVE-Build

> Ensure replication actions are idempotent if any series of events are applied 
> again.
> 
>
> Key: HIVE-16785
> URL: https://issues.apache.org/jira/browse/HIVE-16785
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-16785.01.patch, HIVE-16785.02.patch
>
>
> Some events (ALTER, RENAME, TRUNCATE) are not idempotent, so REPL LOAD fails 
> if they are applied twice or applied to an object that is newer than the 
> current event. For example, applying TRUNCATE to a table that has already been 
> dropped fails instead of being a no-op.
> We also need to consider the scenario where the object is missing when an 
> event is applied. For example, if a RENAME_TABLE event is applied on a target 
> where the old table is missing, we should validate whether the table should be 
> recreated or the event should be treated as a no-op. This can be done by 
> checking the DB-level last repl ID against the current event ID.
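
The ID comparison proposed in the last sentence can be sketched as follows. This is an illustration under stated assumptions, not the logic in the attached patches: `ReplState`, `lastReplId`, and `applyIfNewer` are hypothetical names, and event IDs are assumed to be monotonically increasing `long` values.

```java
// Hypothetical sketch of an idempotent event apply; not Hive's REPL LOAD code.
class ReplState {
  private long lastReplId;

  ReplState(long lastReplId) {
    this.lastReplId = lastReplId;
  }

  long lastReplId() {
    return lastReplId;
  }

  // Runs the action only when the event is newer than the recorded state.
  // An event at or below the last replicated ID is treated as a no-op.
  boolean applyIfNewer(long eventId, Runnable action) {
    if (eventId <= lastReplId) {
      return false;
    }
    action.run();
    lastReplId = eventId;
    return true;
  }

  public static void main(String[] args) {
    ReplState db = new ReplState(10L);
    Runnable truncate = () -> System.out.println("TRUNCATE applied");
    System.out.println(db.applyIfNewer(11L, truncate)); // new event: applied
    System.out.println(db.applyIfNewer(11L, truncate)); // replay: skipped
  }
}
```

Replaying the same series of events leaves the state unchanged, which is the idempotence property the issue asks for.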





[jira] [Commented] (HIVE-16902) investigate "failed to remove operation log" errors

2017-06-15 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16050792#comment-16050792
 ] 

Yongzhi Chen commented on HIVE-16902:
-

+1

> investigate "failed to remove operation log" errors
> ---
>
> Key: HIVE-16902
> URL: https://issues.apache.org/jira/browse/HIVE-16902
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-16902.1.patch
>
>
> When we call {{set a=3;}} from beeline, the following exception is thrown. 
> {noformat}
> [HiveServer2-Handler-Pool: Thread-46]: Failed to remove corresponding log file of operation: OperationHandle [opType=GET_TABLES, getHandleIdentifier()=50f58d7b-f935-4590-922f-de7051a34658]
> java.io.FileNotFoundException: File does not exist: /var/log/hive/operation_logs/7f613077-e29d-484a-96e1-43c81f9c0999/hive_20170531101400_28d52b7d-ffb9-4815-8c6c-662319628915
>   at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2275)
>   at org.apache.hadoop.hive.ql.session.OperationLog$LogFile.remove(OperationLog.java:122)
>   at org.apache.hadoop.hive.ql.session.OperationLog.close(OperationLog.java:90)
>   at org.apache.hive.service.cli.operation.Operation.cleanupOperationLog(Operation.java:287)
>   at org.apache.hive.service.cli.operation.MetadataOperation.close(MetadataOperation.java:58)
>   at org.apache.hive.service.cli.operation.OperationManager.closeOperation(OperationManager.java:273)
>   at org.apache.hive.service.cli.session.HiveSessionImpl.closeOperation(HiveSessionImpl.java:822)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
>   at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
>   at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1857)
>   at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
>   at com.sun.proxy.$Proxy38.closeOperation(Unknown Source)
>   at org.apache.hive.service.cli.CLIService.closeOperation(CLIService.java:475)
>   at org.apache.hive.service.cli.thrift.ThriftCLIService.CloseOperation(ThriftCLIService.java:671)
>   at org.apache.hive.service.rpc.thrift.TCLIService$Processor$CloseOperation.getResult(TCLIService.java:1677)
>   at org.apache.hive.service.rpc.thrift.TCLIService$Processor$CloseOperation.getResult(TCLIService.java:1662)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>   at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:605)
>   at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
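
The trace shows `FileUtils.forceDelete` throwing because the operation log file is already gone. One way to make such cleanup tolerant of a missing file is sketched below with the JDK's `Files.deleteIfExists`. This is an assumption for illustration, not the approach of the attached patch (whose contents are not shown in this thread); `LogCleanup`, `removeIfPresent`, and `removeTwice` are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch; not Hive's OperationLog implementation.
class LogCleanup {

  // Deletes the file if it exists. Returns false, rather than throwing,
  // when the file has already been removed.
  static boolean removeIfPresent(Path logFile) {
    try {
      return Files.deleteIfExists(logFile);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Creates a temporary file and removes it twice; the second removal
  // finds nothing to delete.
  static boolean[] removeTwice() {
    try {
      Path tmp = Files.createTempFile("operation-log", ".txt");
      return new boolean[] {removeIfPresent(tmp), removeIfPresent(tmp)};
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    boolean[] results = removeTwice();
    System.out.println(results[0] + " " + results[1]);
  }
}
```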





[jira] [Commented] (HIVE-16731) Vectorization: Make "CASE WHEN (day_name='Sunday') THEN column1 ELSE null end" that involves a column name or expression THEN or ELSE vectorize

2017-06-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16050732#comment-16050732
 ] 

Hive QA commented on HIVE-16731:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12873123/HIVE-16731.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 21 failed/errored test(s), 10817 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite]
 (batchId=237)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_case] 
(batchId=55)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_adaptor_usage_mode]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby_grouping_sets_grouping]
 (batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_ptf_part_simple]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_case]
 (batchId=155)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query16] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query94] 
(batchId=232)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=104)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorized_case] 
(batchId=124)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootstrapFunctionReplication
 (batchId=216)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionIncrementalReplication
 (batchId=216)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionWithFunctionBinaryJarsOnHDFS
 (batchId=216)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5651/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5651/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5651/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 21 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12873123 - PreCommit-HIVE-Build

> Vectorization: Make "CASE WHEN (day_name='Sunday') THEN column1 ELSE null 
> end" that involves a column name or expression THEN or ELSE vectorize
> ---
>
> Key: HIVE-16731
> URL: https://issues.apache.org/jira/browse/HIVE-16731
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Teddy Choi
>Priority: Critical
> Attachments: HIVE-16731.1.patch
>
>
> Currently, CASE WHEN statements like that become VectorUDFAdaptor expressions.





[jira] [Commented] (HIVE-16731) Vectorization: Make "CASE WHEN (day_name='Sunday') THEN column1 ELSE null end" that involves a column name or expression THEN or ELSE vectorize

2017-06-15 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16050647#comment-16050647
 ] 

Matt McCline commented on HIVE-16731:
-

Great!  LGTM +1 tests pending.

> Vectorization: Make "CASE WHEN (day_name='Sunday') THEN column1 ELSE null 
> end" that involves a column name or expression THEN or ELSE vectorize
> ---
>
> Key: HIVE-16731
> URL: https://issues.apache.org/jira/browse/HIVE-16731
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Teddy Choi
>Priority: Critical
> Attachments: HIVE-16731.1.patch
>
>
> Currently, CASE WHEN statements like that become VectorUDFAdaptor expressions.





[jira] [Commented] (HIVE-16357) Failed folder creation when creating a new table is reported incorrectly

2017-06-15 Thread Barna Zsombor Klara (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16050639#comment-16050639
 ] 

Barna Zsombor Klara commented on HIVE-16357:


Failing tests have been failing for 5-39 runs.

> Failed folder creation when creating a new table is reported incorrectly
> 
>
> Key: HIVE-16357
> URL: https://issues.apache.org/jira/browse/HIVE-16357
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-16357.01.patch
>
>
> If the directory for a Hive table cannot be created, then the HMS will 
> throw a MetaException:
> {code}
> if (tblPath != null) {
>   if (!wh.isDir(tblPath)) {
>     if (!wh.mkdirs(tblPath, true)) {
>       throw new MetaException(tblPath
>           + " is not a directory or unable to create one");
>     }
>     madeDir = true;
>   }
> }
> {code}
> However, in the finally block we always try to call the 
> DbNotificationListener, which in turn also throws an exception because 
> the directory is missing, overwriting the initial exception with a 
> FileNotFoundException.
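
The masking effect is general Java behaviour: an exception thrown from a `finally` block replaces the one already in flight. The sketch below reproduces it with hypothetical names (`FinallyMasking`, `createTable`, `notifyListeners`) and plain JDK exceptions standing in for `MetaException` and `FileNotFoundException`. It also shows one possible guard, which is an assumption and not necessarily what the attached patch does.

```java
// Hypothetical reproduction; not the HiveMetaStore code.
class FinallyMasking {

  // Stand-in for the listener: fails when the table directory was not created.
  static void notifyListeners(boolean success) {
    if (!success) {
      throw new RuntimeException("directory is missing");
    }
  }

  // When guardListener is false the listener always runs in finally, so its
  // exception hides the original one. When true, it runs only on success.
  static void createTable(boolean mkdirOk, boolean guardListener) {
    boolean success = false;
    try {
      if (!mkdirOk) {
        throw new IllegalStateException("unable to create directory");
      }
      success = true;
    } finally {
      if (success || !guardListener) {
        notifyListeners(success);
      }
    }
  }

  // Returns the message of the exception the caller actually sees.
  static String messageSeenByCaller(boolean guardListener) {
    try {
      createTable(false, guardListener);
      return "no exception";
    } catch (RuntimeException e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) {
    System.out.println(messageSeenByCaller(false)); // listener exception wins
    System.out.println(messageSeenByCaller(true));  // original exception surfaces
  }
}
```
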
> Actual stacktrace seen by the caller:
> {code}
> 2017-04-03T05:58:00,128 ERROR [pool-7-thread-2] metastore.RetryingHMSHandler: 
> MetaException(message:java.lang.RuntimeException: 
> java.io.FileNotFoundException: File file:/.../0 does not exist)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:6074)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1496)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
>   at com.sun.proxy.$Proxy28.create_table_with_environment_context(Unknown 
> Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11125)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11109)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: File 
> file:/.../0 does not exist
>   at 
> org.apache.hive.hcatalog.listener.DbNotificationListener$FileIterator.<init>(DbNotificationListener.java:203)
>   at 
> org.apache.hive.hcatalog.listener.DbNotificationListener.onCreateTable(DbNotificationListener.java:137)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1463)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1482)
>   ... 20 more
> Caused by: java.io.FileNotFoundException: File file:/.../0 does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:429)
>   at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1515)
>   at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1555)
>   at 
> org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:574)
>   at 
> 
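
The masking described in this issue is standard Java behaviour: an exception thrown from a finally block replaces the one that was already propagating. Below is a minimal sketch of that effect and of one possible way to keep the original exception. All names here (MaskedExceptionSketch, notifyListeners, createTableMasking, createTablePreserving) are hypothetical, IllegalStateException stands in for MetaException, and this is not the metastore code or the attached patch.

```java
import java.io.FileNotFoundException;

class MaskedExceptionSketch {
    // Hypothetical stand-in for a listener call that fails because the directory is missing.
    static void notifyListeners() throws FileNotFoundException {
        throw new FileNotFoundException("File does not exist");
    }

    // Mirrors the reported problem: the exception thrown in finally replaces the original one.
    static void createTableMasking() throws Exception {
        try {
            throw new IllegalStateException("is not a directory or unable to create one");
        } finally {
            notifyListeners();
        }
    }

    // One possible remedy: remember the original exception and attach the later one as suppressed.
    static void createTablePreserving() throws Exception {
        Exception primary = null;
        try {
            throw new IllegalStateException("is not a directory or unable to create one");
        } catch (Exception e) {
            primary = e;
            throw e;
        } finally {
            try {
                notifyListeners();
            } catch (Exception later) {
                if (primary != null) {
                    primary.addSuppressed(later);
                } else {
                    throw later;
                }
            }
        }
    }

    public static void main(String[] args) {
        try {
            createTableMasking();
        } catch (Exception e) {
            System.out.println("masking: " + e);
        }
        try {
            createTablePreserving();
        } catch (Exception e) {
            System.out.println("preserving: " + e + ", suppressed=" + e.getSuppressed().length);
        }
    }
}
```

In the first method the caller only ever sees the FileNotFoundException; in the second the caller sees the original failure, with the listener failure still reachable through getSuppressed().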

[jira] [Commented] (HIVE-16357) Failed folder creation when creating a new table is reported incorrectly

2017-06-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16050634#comment-16050634
 ] 

Hive QA commented on HIVE-16357:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12873122/HIVE-16357.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 10831 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=140)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query16] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query94] 
(batchId=232)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootstrapFunctionReplication
 (batchId=216)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionIncrementalReplication
 (batchId=216)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionWithFunctionBinaryJarsOnHDFS
 (batchId=216)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5650/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5650/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5650/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12873122 - PreCommit-HIVE-Build


[jira] [Commented] (HIVE-16902) investigate "failed to remove operation log" errors

2017-06-15 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16050567#comment-16050567
 ] 

Aihua Xu commented on HIVE-16902:
-

Without this patch, the exception can occur when the log file was never 
created or was removed manually outside Hive. We don't need to 
throw an exception during file deletion if the file doesn't exist. 
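
As an illustration of the idea only (not the attached patch): the stack trace in this issue shows the deletion going through commons-io FileUtils.forceDelete, which throws when the file is absent. The sketch below swaps in the JDK's java.nio.file.Files.deleteIfExists, which simply reports whether anything was deleted. The class and method names (QuietLogRemoval, removeIfPresent) are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class QuietLogRemoval {
    // Returns true if the file was deleted, false if it was already absent.
    // Other I/O failures (for example missing permissions) still surface as IOException.
    static boolean removeIfPresent(Path logFile) throws IOException {
        return Files.deleteIfExists(logFile);
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("operation", ".log");
        System.out.println(removeIfPresent(log)); // file existed
        System.out.println(removeIfPresent(log)); // file already gone, no exception
    }
}
```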

> investigate "failed to remove operation log" errors
> ---
>
> Key: HIVE-16902
> URL: https://issues.apache.org/jira/browse/HIVE-16902
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-16902.1.patch
>
>
> When we call {{set a=3;}} from beeline, the following exception is thrown. 
> {noformat}
> [HiveServer2-Handler-Pool: Thread-46]: Failed to remove corresponding log 
> file of operation: OperationHandle [opType=GET_TABLES, 
> getHandleIdentifier()=50f58d7b-f935-4590-922f-de7051a34658]
> java.io.FileNotFoundException: File does not exist: 
> /var/log/hive/operation_logs/7f613077-e29d-484a-96e1-43c81f9c0999/hive_20170531101400_28d52b7d-ffb9-4815-8c6c-662319628915
>   at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2275)
>   at 
> org.apache.hadoop.hive.ql.session.OperationLog$LogFile.remove(OperationLog.java:122)
>   at 
> org.apache.hadoop.hive.ql.session.OperationLog.close(OperationLog.java:90)
>   at 
> org.apache.hive.service.cli.operation.Operation.cleanupOperationLog(Operation.java:287)
>   at 
> org.apache.hive.service.cli.operation.MetadataOperation.close(MetadataOperation.java:58)
>   at 
> org.apache.hive.service.cli.operation.OperationManager.closeOperation(OperationManager.java:273)
>   at 
> org.apache.hive.service.cli.session.HiveSessionImpl.closeOperation(HiveSessionImpl.java:822)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
>   at 
> org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
>   at 
> org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1857)
>   at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
>   at com.sun.proxy.$Proxy38.closeOperation(Unknown Source)
>   at 
> org.apache.hive.service.cli.CLIService.closeOperation(CLIService.java:475)
>   at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.CloseOperation(ThriftCLIService.java:671)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$CloseOperation.getResult(TCLIService.java:1677)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$CloseOperation.getResult(TCLIService.java:1662)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>   at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:605)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}





[jira] [Commented] (HIVE-16902) investigate "failed to remove operation log" errors

2017-06-15 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16050557#comment-16050557
 ] 

Yongzhi Chen commented on HIVE-16902:
-

The change looks fine. Just curious: how could this exception happen when 
isRemoved is false? Is it caused by a thread-safety issue?






[jira] [Updated] (HIVE-16785) Ensure replication actions are idempotent if any series of events are applied again.

2017-06-15 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16785:

Attachment: HIVE-16785.02.patch

Added 02.patch with the updates below.
- Added idempotent behaviour for CREATE FUNCTION and DROP FUNCTION operations.

> Ensure replication actions are idempotent if any series of events are applied 
> again.
> 
>
> Key: HIVE-16785
> URL: https://issues.apache.org/jira/browse/HIVE-16785
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-16785.01.patch, HIVE-16785.02.patch
>
>
> Some of the events (ALTER, RENAME, TRUNCATE) are not idempotent and hence 
> cause REPL LOAD to fail if they are applied twice or applied to an object that 
> is newer than the current event. For example, applying TRUNCATE to a table 
> that has already been dropped fails instead of being a no-op.
> We also need to consider the scenario where the object is missing while 
> an event is being applied. For example, if a RENAME_TABLE event is applied on a 
> target where the old table is missing, we should validate whether the table 
> should be recreated or the event should be treated as a no-op. This can be done 
> by verifying the DB-level last repl ID against the current event ID.
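
A rough sketch of the ID comparison described above, under simplifying assumptions: the target is an in-memory map, the last applied event ID is tracked per table rather than at DB level, and every name (IdempotentReplaySketch, createTable, truncate) is hypothetical. It is not the REPL LOAD implementation.

```java
import java.util.HashMap;
import java.util.Map;

class IdempotentReplaySketch {
    // Hypothetical target state: table name -> ID of the last event applied to it.
    private final Map<String, Long> lastReplId = new HashMap<>();

    void createTable(String table, long eventId) {
        lastReplId.merge(table, eventId, Math::max);
    }

    // Applies TRUNCATE only if the table exists and has not already seen this event or a newer one.
    // Returns false when the event is treated as a no-op.
    boolean truncate(String table, long eventId) {
        Long last = lastReplId.get(table);
        if (last == null || last >= eventId) {
            return false;
        }
        lastReplId.put(table, eventId);
        return true;
    }

    public static void main(String[] args) {
        IdempotentReplaySketch target = new IdempotentReplaySketch();
        target.createTable("t1", 5L);
        System.out.println(target.truncate("t1", 7L));      // applied
        System.out.println(target.truncate("t1", 7L));      // replayed, no-op
        System.out.println(target.truncate("dropped", 8L)); // missing object, no-op
    }
}
```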





[jira] [Updated] (HIVE-16785) Ensure replication actions are idempotent if any series of events are applied again.

2017-06-15 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16785:

Status: Patch Available  (was: Open)






[jira] [Updated] (HIVE-16731) Vectorization: Make "CASE WHEN (day_name='Sunday') THEN column1 ELSE null end" that involves a column name or expression THEN or ELSE vectorize

2017-06-15 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-16731:
--
Status: Patch Available  (was: Open)






[jira] [Updated] (HIVE-16731) Vectorization: Make "CASE WHEN (day_name='Sunday') THEN column1 ELSE null end" that involves a column name or expression THEN or ELSE vectorize

2017-06-15 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-16731:
--
Attachment: HIVE-16731.1.patch

The existing implementation handled only one simple case: the two branches are 
columns or non-null constants, such as "CASE WHEN bool1 THEN column1 ELSE value1 
END". That case is easy to translate into an equivalent IF expression, 
"IF(bool1, column1, value1)". So I extended it to handle multiple WHEN clauses 
and an optional ELSE clause with a recursive method.

Null values are common in WHEN clauses, but they were excluded from the 
vectorization process. So I added IfNullColumn and IfColumnNull classes to 
support them. It's hard to include these classes in the VectorizedExpressions 
annotation of the GenericUDFIf class, because the description treats void as a 
member of the integer family, which already exists. But it's possible to handle 
them in VectorizedContext.getIfExpression.
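
To illustrate the recursive translation mentioned above, here is a small sketch that works on plain strings rather than on Hive expression nodes. The class and method names (CaseToIfSketch, toIf) are hypothetical, and rendering a missing ELSE clause as NULL is an assumption based on standard SQL CASE semantics, not something taken from the patch.

```java
import java.util.Arrays;
import java.util.List;

class CaseToIfSketch {
    // whenThen holds alternating WHEN and THEN expressions;
    // elseExpr may be null when the CASE has no ELSE clause.
    static String toIf(List<String> whenThen, String elseExpr) {
        if (whenThen.isEmpty()) {
            return elseExpr == null ? "NULL" : elseExpr;
        }
        // Translate the remaining WHEN clauses first, then wrap them in this IF.
        String rest = toIf(whenThen.subList(2, whenThen.size()), elseExpr);
        return "IF(" + whenThen.get(0) + ", " + whenThen.get(1) + ", " + rest + ")";
    }

    public static void main(String[] args) {
        // prints IF(bool1, column1, value1)
        System.out.println(toIf(Arrays.asList("bool1", "column1"), "value1"));
        // prints IF(a, x, IF(b, y, NULL))
        System.out.println(toIf(Arrays.asList("a", "x", "b", "y"), null));
    }
}
```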






[jira] [Commented] (HIVE-16758) Better Select Number of Replications

2017-06-15 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16050520#comment-16050520
 ] 

Aihua Xu commented on HIVE-16758:
-

[~belugabehr] I'm not very familiar with those configurations. It looks like 
mapreduce.client.submit.file.replication is the configuration for "The 
replication level for submitted job files", so it doesn't look like the right 
one to me. [~csun] Can you help take a look?

> Better Select Number of Replications
> 
>
> Key: HIVE-16758
> URL: https://issues.apache.org/jira/browse/HIVE-16758
> Project: Hive
>  Issue Type: Improvement
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16758.1.patch
>
>
> {{org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.java}}
> We should be smarter about how we pick a replication number.  We should add a 
> new configuration equivalent to {{mapreduce.client.submit.file.replication}}. 
>  This value should be around the square root of the number of nodes and not 
> hard-coded in the code.
> {code}
> public static final String DFS_REPLICATION_MAX = "dfs.replication.max";
> private int minReplication = 10;
>
> @Override
> protected void initializeOp(Configuration hconf) throws HiveException {
>   ...
>   int dfsMaxReplication = hconf.getInt(DFS_REPLICATION_MAX, minReplication);
>   // minReplication value should not cross the value of dfs.replication.max
>   minReplication = Math.min(minReplication, dfsMaxReplication);
> }
> {code}
> https://hadoop.apache.org/docs/r2.7.2/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml
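
A small sketch of the arithmetic suggested in the description: pick roughly the square root of the node count and cap it at dfs.replication.max. The names (ReplicationChoiceSketch, chooseReplication) and the lower bound of 1 are assumptions for illustration; how the node count would be obtained inside the operator is not covered here.

```java
class ReplicationChoiceSketch {
    // nodeCount: number of nodes in the cluster (assumed to be known by the caller).
    // dfsMaxReplication: the value of dfs.replication.max.
    static int chooseReplication(int nodeCount, int dfsMaxReplication) {
        int bySqrt = (int) Math.ceil(Math.sqrt(nodeCount));
        // Never exceed the DFS limit, and never go below a single replica.
        return Math.max(1, Math.min(bySqrt, dfsMaxReplication));
    }

    public static void main(String[] args) {
        System.out.println(chooseReplication(100, 512)); // prints 10
        System.out.println(chooseReplication(400, 10));  // prints 10 (capped by the DFS limit)
    }
}
```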





[jira] [Updated] (HIVE-16357) Failed folder creation when creating a new table is reported incorrectly

2017-06-15 Thread Barna Zsombor Klara (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barna Zsombor Klara updated HIVE-16357:
---
Affects Version/s: 3.0.0
   2.1.1


[jira] [Updated] (HIVE-16357) Failed folder creation when creating a new table is reported incorrectly

2017-06-15 Thread Barna Zsombor Klara (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barna Zsombor Klara updated HIVE-16357:
---
Affects Version/s: (was: 2.1.1)
   2.3.0


[jira] [Updated] (HIVE-16357) Failed folder creation when creating a new table is reported incorrectly

2017-06-15 Thread Barna Zsombor Klara (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barna Zsombor Klara updated HIVE-16357:
---
Fix Version/s: 3.0.0

> Failed folder creation when creating a new table is reported incorrectly
> 
>
> Key: HIVE-16357
> URL: https://issues.apache.org/jira/browse/HIVE-16357
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-16357.01.patch
>
>
> If the directory for a Hive table could not be created, then the HMS will 
> throw a MetaException:
> {code}
>  if (tblPath != null) {
>    if (!wh.isDir(tblPath)) {
>      if (!wh.mkdirs(tblPath, true)) {
>        throw new MetaException(tblPath
>            + " is not a directory or unable to create one");
>      }
>      madeDir = true;
>    }
>  }
> {code}
> However in the finally block we always try to call the 
> DbNotificationListener, which in turn will also throw an exception because 
> the directory is missing, overwriting the initial exception with a 
> FileNotFoundException.
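
The masking effect described above can be reproduced outside Hive. The following is a minimal sketch, not the HMS code: `FinallyMasking`, `notifyListener`, and both `createTable...` methods are hypothetical names, and plain `Exception` stands in for `MetaException`. The first method shows how a failure raised in `finally` replaces the original one; the second shows one possible way to keep the original and attach the listener failure as a suppressed exception.

```java
import java.io.FileNotFoundException;

public class FinallyMasking {
    // Hypothetical stand-in for a listener that fails because the table directory is missing.
    static void notifyListener() {
        throw new RuntimeException(new FileNotFoundException("table directory does not exist"));
    }

    // Mirrors the reported pattern: the failure raised inside finally replaces the original one.
    static void createTableMasking() throws Exception {
        try {
            throw new Exception("unable to create table directory");
        } finally {
            notifyListener();
        }
    }

    // One possible remedy: keep the primary failure and attach the listener failure as suppressed.
    static void createTablePreserving() throws Exception {
        Exception primary = null;
        try {
            throw new Exception("unable to create table directory");
        } catch (Exception e) {
            primary = e;
            throw e;
        } finally {
            try {
                notifyListener();
            } catch (RuntimeException listenerFailure) {
                if (primary == null) {
                    throw listenerFailure;
                }
                primary.addSuppressed(listenerFailure);
            }
        }
    }

    public static void main(String[] args) {
        try {
            createTableMasking();
        } catch (Exception e) {
            System.out.println("masking:    " + e);
        }
        try {
            createTablePreserving();
        } catch (Exception e) {
            System.out.println("preserving: " + e + ", suppressed=" + e.getSuppressed().length);
        }
    }
}
```

Whether the actual patch takes this approach or simply skips the listener call on failure is not stated in this message; the sketch only illustrates the Java semantics involved.
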
> Actual stacktrace seen by the caller:
> {code}
> 2017-04-03T05:58:00,128 ERROR [pool-7-thread-2] metastore.RetryingHMSHandler: 
> MetaException(message:java.lang.RuntimeException: 
> java.io.FileNotFoundException: File file:/.../0 does not exist)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:6074)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1496)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
>   at com.sun.proxy.$Proxy28.create_table_with_environment_context(Unknown 
> Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11125)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11109)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: File 
> file:/.../0 does not exist
>   at 
> org.apache.hive.hcatalog.listener.DbNotificationListener$FileIterator.<init>(DbNotificationListener.java:203)
>   at 
> org.apache.hive.hcatalog.listener.DbNotificationListener.onCreateTable(DbNotificationListener.java:137)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1463)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1482)
>   ... 20 more
> Caused by: java.io.FileNotFoundException: File file:/.../0 does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:429)
>   at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1515)
>   at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1555)
>   at 
> org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:574)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.listStatus(FilterFileSystem.java:243)
>   at 
> 

[jira] [Updated] (HIVE-16357) Failed folder creation when creating a new table is reported incorrectly

2017-06-15 Thread Barna Zsombor Klara (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barna Zsombor Klara updated HIVE-16357:
---
Component/s: Metastore

> Failed folder creation when creating a new table is reported incorrectly
> 
>
> Key: HIVE-16357
> URL: https://issues.apache.org/jira/browse/HIVE-16357
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-16357.01.patch
>
>
> If the directory for a Hive table could not be created, then the HMS will 
> throw a MetaException:
> {code}
>  if (tblPath != null) {
>    if (!wh.isDir(tblPath)) {
>      if (!wh.mkdirs(tblPath, true)) {
>        throw new MetaException(tblPath
>            + " is not a directory or unable to create one");
>      }
>      madeDir = true;
>    }
>  }
> {code}
> However in the finally block we always try to call the 
> DbNotificationListener, which in turn will also throw an exception because 
> the directory is missing, overwriting the initial exception with a 
> FileNotFoundException.
> Actual stacktrace seen by the caller:
> {code}
> 2017-04-03T05:58:00,128 ERROR [pool-7-thread-2] metastore.RetryingHMSHandler: 
> MetaException(message:java.lang.RuntimeException: 
> java.io.FileNotFoundException: File file:/.../0 does not exist)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:6074)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1496)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
>   at com.sun.proxy.$Proxy28.create_table_with_environment_context(Unknown 
> Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11125)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11109)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: File 
> file:/.../0 does not exist
>   at 
> org.apache.hive.hcatalog.listener.DbNotificationListener$FileIterator.<init>(DbNotificationListener.java:203)
>   at 
> org.apache.hive.hcatalog.listener.DbNotificationListener.onCreateTable(DbNotificationListener.java:137)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1463)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1482)
>   ... 20 more
> Caused by: java.io.FileNotFoundException: File file:/.../0 does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:429)
>   at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1515)
>   at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1555)
>   at 
> org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:574)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.listStatus(FilterFileSystem.java:243)
>   at 
> 

[jira] [Updated] (HIVE-16357) Failed folder creation when creating a new table is reported incorrectly

2017-06-15 Thread Barna Zsombor Klara (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barna Zsombor Klara updated HIVE-16357:
---
Priority: Minor  (was: Major)

> Failed folder creation when creating a new table is reported incorrectly
> 
>
> Key: HIVE-16357
> URL: https://issues.apache.org/jira/browse/HIVE-16357
> Project: Hive
>  Issue Type: Bug
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
>Priority: Minor
> Attachments: HIVE-16357.01.patch
>
>
> If the directory for a Hive table could not be created, then the HMS will 
> throw a MetaException:
> {code}
>  if (tblPath != null) {
>    if (!wh.isDir(tblPath)) {
>      if (!wh.mkdirs(tblPath, true)) {
>        throw new MetaException(tblPath
>            + " is not a directory or unable to create one");
>      }
>      madeDir = true;
>    }
>  }
> {code}
> However in the finally block we always try to call the 
> DbNotificationListener, which in turn will also throw an exception because 
> the directory is missing, overwriting the initial exception with a 
> FileNotFoundException.
> Actual stacktrace seen by the caller:
> {code}
> 2017-04-03T05:58:00,128 ERROR [pool-7-thread-2] metastore.RetryingHMSHandler: 
> MetaException(message:java.lang.RuntimeException: 
> java.io.FileNotFoundException: File file:/.../0 does not exist)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:6074)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1496)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
>   at com.sun.proxy.$Proxy28.create_table_with_environment_context(Unknown 
> Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11125)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11109)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: File 
> file:/.../0 does not exist
>   at 
> org.apache.hive.hcatalog.listener.DbNotificationListener$FileIterator.<init>(DbNotificationListener.java:203)
>   at 
> org.apache.hive.hcatalog.listener.DbNotificationListener.onCreateTable(DbNotificationListener.java:137)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1463)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1482)
>   ... 20 more
> Caused by: java.io.FileNotFoundException: File file:/.../0 does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:429)
>   at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1515)
>   at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1555)
>   at 
> org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:574)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.listStatus(FilterFileSystem.java:243)
>   at 
> org.apache.hadoop.fs.ProxyFileSystem.listStatus(ProxyFileSystem.java:195)
>   at 
> 

[jira] [Updated] (HIVE-16357) Failed folder creation when creating a new table is reported incorrectly

2017-06-15 Thread Barna Zsombor Klara (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barna Zsombor Klara updated HIVE-16357:
---
Attachment: HIVE-16357.01.patch

> Failed folder creation when creating a new table is reported incorrectly
> 
>
> Key: HIVE-16357
> URL: https://issues.apache.org/jira/browse/HIVE-16357
> Project: Hive
>  Issue Type: Bug
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
>Priority: Minor
> Attachments: HIVE-16357.01.patch
>
>
> If the directory for a Hive table could not be created, then the HMS will 
> throw a MetaException:
> {code}
>  if (tblPath != null) {
>    if (!wh.isDir(tblPath)) {
>      if (!wh.mkdirs(tblPath, true)) {
>        throw new MetaException(tblPath
>            + " is not a directory or unable to create one");
>      }
>      madeDir = true;
>    }
>  }
> {code}
> However in the finally block we always try to call the 
> DbNotificationListener, which in turn will also throw an exception because 
> the directory is missing, overwriting the initial exception with a 
> FileNotFoundException.
> Actual stacktrace seen by the caller:
> {code}
> 2017-04-03T05:58:00,128 ERROR [pool-7-thread-2] metastore.RetryingHMSHandler: 
> MetaException(message:java.lang.RuntimeException: 
> java.io.FileNotFoundException: File file:/.../0 does not exist)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:6074)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1496)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
>   at com.sun.proxy.$Proxy28.create_table_with_environment_context(Unknown 
> Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11125)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11109)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: File 
> file:/.../0 does not exist
>   at 
> org.apache.hive.hcatalog.listener.DbNotificationListener$FileIterator.<init>(DbNotificationListener.java:203)
>   at 
> org.apache.hive.hcatalog.listener.DbNotificationListener.onCreateTable(DbNotificationListener.java:137)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1463)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1482)
>   ... 20 more
> Caused by: java.io.FileNotFoundException: File file:/.../0 does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:429)
>   at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1515)
>   at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1555)
>   at 
> org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:574)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.listStatus(FilterFileSystem.java:243)
>   at 
> org.apache.hadoop.fs.ProxyFileSystem.listStatus(ProxyFileSystem.java:195)
>   at 
> 

[jira] [Updated] (HIVE-16357) Failed folder creation when creating a new table is reported incorrectly

2017-06-15 Thread Barna Zsombor Klara (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barna Zsombor Klara updated HIVE-16357:
---
Status: Patch Available  (was: In Progress)

> Failed folder creation when creating a new table is reported incorrectly
> 
>
> Key: HIVE-16357
> URL: https://issues.apache.org/jira/browse/HIVE-16357
> Project: Hive
>  Issue Type: Bug
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
>Priority: Minor
> Attachments: HIVE-16357.01.patch
>
>
> If the directory for a Hive table could not be created, then the HMS will 
> throw a MetaException:
> {code}
>  if (tblPath != null) {
>    if (!wh.isDir(tblPath)) {
>      if (!wh.mkdirs(tblPath, true)) {
>        throw new MetaException(tblPath
>            + " is not a directory or unable to create one");
>      }
>      madeDir = true;
>    }
>  }
> {code}
> However in the finally block we always try to call the 
> DbNotificationListener, which in turn will also throw an exception because 
> the directory is missing, overwriting the initial exception with a 
> FileNotFoundException.
> Actual stacktrace seen by the caller:
> {code}
> 2017-04-03T05:58:00,128 ERROR [pool-7-thread-2] metastore.RetryingHMSHandler: 
> MetaException(message:java.lang.RuntimeException: 
> java.io.FileNotFoundException: File file:/.../0 does not exist)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:6074)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1496)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
>   at com.sun.proxy.$Proxy28.create_table_with_environment_context(Unknown 
> Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11125)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11109)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: File 
> file:/.../0 does not exist
>   at 
> org.apache.hive.hcatalog.listener.DbNotificationListener$FileIterator.<init>(DbNotificationListener.java:203)
>   at 
> org.apache.hive.hcatalog.listener.DbNotificationListener.onCreateTable(DbNotificationListener.java:137)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1463)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1482)
>   ... 20 more
> Caused by: java.io.FileNotFoundException: File file:/.../0 does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:429)
>   at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1515)
>   at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1555)
>   at 
> org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:574)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.listStatus(FilterFileSystem.java:243)
>   at 
> org.apache.hadoop.fs.ProxyFileSystem.listStatus(ProxyFileSystem.java:195)
>   

[jira] [Updated] (HIVE-16785) Ensure replication actions are idempotent if any series of events are applied again.

2017-06-15 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16785:

Fix Version/s: 3.0.0

> Ensure replication actions are idempotent if any series of events are applied 
> again.
> 
>
> Key: HIVE-16785
> URL: https://issues.apache.org/jira/browse/HIVE-16785
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-16785.01.patch
>
>
> Some of the events (ALTER, RENAME, TRUNCATE) are not idempotent and hence 
> lead to failure of REPL LOAD if applied twice or applied on an object which 
> is newer than the current event. For example, a TRUNCATE applied on a table 
> which is already dropped will fail instead of being a no-op.
> Also, we need to consider the scenario where the object is missing while 
> applying an event. For example, if a RENAME_TABLE event is applied on a target 
> where the old table is missing, we should validate whether the table should be 
> recreated or the event should be treated as a no-op. This can be done by 
> verifying the DB-level last repl ID against the current event ID.
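
The check described above (compare a DB-level last repl ID with the ID of the incoming event) can be sketched as follows. This is an illustrative model only: `IdempotentReplay`, its methods, and the in-memory table set are hypothetical and do not correspond to Hive's replication classes. The sketch also assumes that a RENAME on a missing table is treated as a no-op, which is just one of the two options the description leaves open.

```java
import java.util.HashSet;
import java.util.Set;

public class IdempotentReplay {
    private final Set<String> tables = new HashSet<>();
    private long lastReplId = 0; // hypothetical DB-level "last repl ID"

    // Returns true if the event changed state, false if it was treated as a no-op.
    boolean createTable(long eventId, String name) {
        if (eventId <= lastReplId) {
            return false; // event already reflected in the target
        }
        lastReplId = eventId;
        return tables.add(name);
    }

    boolean renameTable(long eventId, String oldName, String newName) {
        if (eventId <= lastReplId) {
            return false; // replayed event: skip instead of failing
        }
        lastReplId = eventId;
        if (!tables.remove(oldName)) {
            return false; // old table missing: treated as a no-op in this sketch
        }
        tables.add(newName);
        return true;
    }

    boolean hasTable(String name) {
        return tables.contains(name);
    }

    public static void main(String[] args) {
        IdempotentReplay target = new IdempotentReplay();
        target.createTable(1, "t");
        target.renameTable(2, "t", "u");
        // Replaying the same series of events leaves the target unchanged.
        System.out.println(target.createTable(1, "t"));
        System.out.println(target.renameTable(2, "t", "u"));
        System.out.println(target.hasTable("u"));
    }
}
```
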



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16785) Ensure replication actions are idempotent if any series of events are applied again.

2017-06-15 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16785:

Status: Open  (was: Patch Available)

> Ensure replication actions are idempotent if any series of events are applied 
> again.
> 
>
> Key: HIVE-16785
> URL: https://issues.apache.org/jira/browse/HIVE-16785
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Attachments: HIVE-16785.01.patch
>
>
> Some of the events (ALTER, RENAME, TRUNCATE) are not idempotent and hence 
> lead to failure of REPL LOAD if applied twice or applied on an object which 
> is newer than the current event. For example, a TRUNCATE applied on a table 
> which is already dropped will fail instead of being a no-op.
> Also, we need to consider the scenario where the object is missing while 
> applying an event. For example, if a RENAME_TABLE event is applied on a target 
> where the old table is missing, we should validate whether the table should be 
> recreated or the event should be treated as a no-op. This can be done by 
> verifying the DB-level last repl ID against the current event ID.





[jira] [Commented] (HIVE-16908) Failures in TestHcatClient due to HIVE-16844

2017-06-15 Thread Sunitha Beeram (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16050440#comment-16050440
 ] 

Sunitha Beeram commented on HIVE-16908:
---

+Watchers

> Failures in TestHcatClient due to HIVE-16844
> 
>
> Key: HIVE-16908
> URL: https://issues.apache.org/jira/browse/HIVE-16908
> Project: Hive
>  Issue Type: Bug
>Reporter: Sunitha Beeram
>Assignee: Sunitha Beeram
>
> Some of the tests in TestHCatClient.java, for example:
> {noformat}
> org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
>  (batchId=177)
> org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
>  (batchId=177)
> org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
> (batchId=177)
> {noformat}
> are failing due to HIVE-16844. HIVE-16844 fixes a connection leak when a new 
> configuration object is set on the ObjectStore. TestHCatClient fires up a 
> second instance of the metastore thread with a different conf object, which 
> results in the PersistenceManagerFactory being closed, and hence the tests fail. 





[jira] [Assigned] (HIVE-16908) Failures in TestHcatClient due to HIVE-16844

2017-06-15 Thread Sunitha Beeram (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunitha Beeram reassigned HIVE-16908:
-


> Failures in TestHcatClient due to HIVE-16844
> 
>
> Key: HIVE-16908
> URL: https://issues.apache.org/jira/browse/HIVE-16908
> Project: Hive
>  Issue Type: Bug
>Reporter: Sunitha Beeram
>Assignee: Sunitha Beeram
>
> Some of the tests in TestHCatClient.java, for example:
> {noformat}
> org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
>  (batchId=177)
> org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
>  (batchId=177)
> org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
> (batchId=177)
> {noformat}
> are failing due to HIVE-16844. HIVE-16844 fixes a connection leak when a new 
> configuration object is set on the ObjectStore. TestHCatClient fires up a 
> second instance of the metastore thread with a different conf object, which 
> results in the PersistenceManagerFactory being closed, and hence the tests fail. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16885) Non-equi Joins: Filter clauses should be pushed into the ON clause

2017-06-15 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16050319#comment-16050319
 ] 

Jesus Camacho Rodriguez commented on HIVE-16885:


[~ashutoshc], could you take a look?
https://reviews.apache.org/r/60116/

Thanks

> Non-equi Joins: Filter clauses should be pushed into the ON clause
> --
>
> Key: HIVE-16885
> URL: https://issues.apache.org/jira/browse/HIVE-16885
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-16885.01.patch, HIVE-16885.patch
>
>
> FIL_24 -> MAPJOIN_23
> {code}
> hive> explain select * from part where p_size > (select max(p_size) from part group by p_type);
> Warning: Map Join MAPJOIN[14][bigTable=?] in task 'Map 1' is a cross product
> OK
> Plan optimized by CBO.
> Vertex dependency in root stage
> Map 1 <- Reducer 3 (BROADCAST_EDGE)
> Reducer 3 <- Map 2 (SIMPLE_EDGE)
> Stage-0
>   Fetch Operator
> limit:-1
> Stage-1
>   Map 1 vectorized, llap
>   File Output Operator [FS_26]
> Select Operator [SEL_25] (rows=110 width=621)
>   
> Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"]
>   Filter Operator [FIL_24] (rows=110 width=625)
> predicate:(_col5 > _col9)
> Map Join Operator [MAPJOIN_23] (rows=330 width=625)
>   
> Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col9"]
> <-Reducer 3 [BROADCAST_EDGE] vectorized, llap
>   BROADCAST [RS_21]
> Select Operator [SEL_20] (rows=165 width=4)
>   Output:["_col0"]
>   Group By Operator [GBY_19] (rows=165 width=109)
> 
> Output:["_col0","_col1"],aggregations:["max(VALUE._col0)"],keys:KEY._col0
>   <-Map 2 [SIMPLE_EDGE] vectorized, llap
> SHUFFLE [RS_18]
>   PartitionCols:_col0
>   Group By Operator [GBY_17] (rows=14190 width=109)
> 
> Output:["_col0","_col1"],aggregations:["max(p_size)"],keys:p_type
> Select Operator [SEL_16] (rows=2 width=109)
>   Output:["p_type","p_size"]
>   TableScan [TS_2] (rows=2 width=109)
> 
> tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type","p_size"]
> <-Select Operator [SEL_22] (rows=2 width=621)
> 
> Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"]
> TableScan [TS_0] (rows=2 width=621)
>   
> tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_partkey","p_name","p_mfgr","p_brand","p_type","p_size","p_container","p_retailprice","p_comment"]
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16907) "INSERT INTO" overwrite old data when destination table encapsulated by backquote

2017-06-15 Thread Nemon Lou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16050173#comment-16050173
 ] 

Nemon Lou commented on HIVE-16907:
--

Refer to this method:
https://github.com/apache/hive/blob/release-2.0.0/ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java#L322
tdb.t1 is treated as a table name.
--> 'tdb.tdb.t1' is put into insertIntoTables of QBParseInfo
--> QBParseInfo.isInsertIntoTable('tdb.t1') returns false
--> LoadTableDesc.setReplace(!qb.getParseInfo().isInsertIntoTable(dest_tab.getDbName(),
dest_tab.getTableName())) sets replace to true.
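
The chain above can be reproduced with a minimal Java sketch. It is a simplified model, not Hive's actual code: the class `InsertIntoKeyMismatch`, the method `addInsertIntoTable`, and the `db.table` key format are assumptions made for illustration. Only the names `insertIntoTables` and `isInsertIntoTable` come from the comment.

```java
import java.util.HashSet;
import java.util.Set;

// Simplified model of the bookkeeping described above; not Hive's actual code.
class InsertIntoKeyMismatch {
    private final Set<String> insertIntoTables = new HashSet<>();

    // Registration (assumed behavior): the parsed table name is qualified with
    // the current database. A backquoted `tdb.t1` arrives here as the single
    // identifier "tdb.t1".
    void addInsertIntoTable(String currentDb, String parsedTableName) {
        insertIntoTables.add((currentDb + "." + parsedTableName).toLowerCase());
    }

    // Lookup: the key is rebuilt from the resolved database and table names.
    boolean isInsertIntoTable(String dbName, String tableName) {
        return insertIntoTables.contains((dbName + "." + tableName).toLowerCase());
    }

    public static void main(String[] args) {
        InsertIntoKeyMismatch info = new InsertIntoKeyMismatch();
        info.addInsertIntoTable("tdb", "tdb.t1");                // stored as "tdb.tdb.t1"
        boolean replace = !info.isInsertIntoTable("tdb", "t1");  // looks up "tdb.t1"
        System.out.println("replace = " + replace);              // prints "replace = true"
    }
}
```

In this model the stored key and the lookup key differ only when the table name already carries the database prefix, which matches the backquoted case in the report.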



>  "INSERT INTO"  overwrite old data when destination table encapsulated by 
> backquote 
> 
>
> Key: HIVE-16907
> URL: https://issues.apache.org/jira/browse/HIVE-16907
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 1.1.0, 2.1.1
>Reporter: Nemon Lou
>
> A way to reproduce:
> {noformat}
> create database tdb;
> use tdb;
> create table t1(id int);
> create table t2(id int);
> explain insert into `tdb.t1` select * from t2;
> {noformat}
> {noformat}
> +---+
> | Explain |
> +---+
> | STAGE DEPENDENCIES: |
> |   Stage-1 is a root stage |
> |   Stage-6 depends on stages: Stage-1 , consists of Stage-3, Stage-2, Stage-4 |
> |   Stage-3 |
> |   Stage-0 depends on stages: Stage-3, Stage-2, Stage-5 |
> |   Stage-2 |
> |   Stage-4 |
> |   Stage-5 depends on stages: Stage-4 |
> | |
> | STAGE PLANS: |
> |   Stage: Stage-1 |
> |     Map Reduce |
> |       Map Operator Tree: |
> |           TableScan |
> |             alias: t2 |
> |             Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE |
> |             Select Operator |
> |               expressions: id (type: int) |
> |               outputColumnNames: _col0 |
> |               Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE |
> |               File Output Operator

[jira] [Updated] (HIVE-11297) Combine op trees for partition info generating tasks [Spark branch]

2017-06-15 Thread liyunzhang_intel (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyunzhang_intel updated HIVE-11297:

Attachment: HIVE-11297.4.patch

> Combine op trees for partition info generating tasks [Spark branch]
> ---
>
> Key: HIVE-11297
> URL: https://issues.apache.org/jira/browse/HIVE-11297
> Project: Hive
>  Issue Type: Bug
>Affects Versions: spark-branch
>Reporter: Chao Sun
>Assignee: liyunzhang_intel
> Attachments: HIVE-11297.1.patch, HIVE-11297.2.patch, 
> HIVE-11297.3.patch, HIVE-11297.4.patch
>
>
> Currently, for dynamic partition pruning in Spark, if a small table generates 
> partition info for more than one partition column, multiple operator trees 
> are created, which all start from the same table scan op but have different 
> Spark partition pruning sinks.
> As an optimization, we can combine these op trees so that the table scan is 
> not performed multiple times.





[jira] [Updated] (HIVE-11297) Combine op trees for partition info generating tasks [Spark branch]

2017-06-15 Thread liyunzhang_intel (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyunzhang_intel updated HIVE-11297:

Attachment: (was: HIVE-11297.4.patch)

> Combine op trees for partition info generating tasks [Spark branch]
> ---
>
> Key: HIVE-11297
> URL: https://issues.apache.org/jira/browse/HIVE-11297
> Project: Hive
>  Issue Type: Bug
>Affects Versions: spark-branch
>Reporter: Chao Sun
>Assignee: liyunzhang_intel
> Attachments: HIVE-11297.1.patch, HIVE-11297.2.patch, 
> HIVE-11297.3.patch
>
>


[jira] [Updated] (HIVE-11297) Combine op trees for partition info generating tasks [Spark branch]

2017-06-15 Thread liyunzhang_intel (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyunzhang_intel updated HIVE-11297:

Attachment: HIVE-11297.4.patch

[~csun]: update HIVE-11297.4.patch according to what you mentioned on RB.
{noformat}
  TS1       TS2
   |         |
  FIL1      FIL2
   |         |
   RS       SEL------
   |         |  \    \
   |         RS  SEL  SEL
    \       /     |    |
      JOIN       GBY  GBY
                  |    |
                  |   SPARKPRUNINGSINK
                  |
                 SPARKPRUNINGSINK
{noformat}
Current algorithm:
1. Find the filter FIL2, traverse each branch of FIL2, and collect the children 
that start branches containing a SPARKPRUNINGSINK.
2. Split the tree into 2 separate trees.
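
Step 1 can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: `Op` is a hypothetical stand-in for Hive's operator classes, the labels `SEL_A` and `SEL_B` are invented to tell the two SEL branches apart, and the tree mirrors the right-hand side of the diagram above. Step 2 (actually splitting the tree) is not shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of an operator tree; Hive's real Operator classes are richer.
class PruningBranchFinder {
    static final class Op {
        final String name;
        final List<Op> children = new ArrayList<>();

        Op(String name, Op... kids) {
            this.name = name;
            children.addAll(Arrays.asList(kids));
        }
    }

    // True if the subtree rooted at op contains a SPARKPRUNINGSINK.
    static boolean containsSink(Op op) {
        if (op.name.equals("SPARKPRUNINGSINK")) {
            return true;
        }
        for (Op child : op.children) {
            if (containsSink(child)) {
                return true;
            }
        }
        return false;
    }

    // Step 1: follow the single-child chain below the filter to the first
    // branching operator, then keep the children whose branches reach a sink.
    static List<Op> sinkBranches(Op filter) {
        Op current = filter;
        while (current.children.size() == 1) {
            current = current.children.get(0);
        }
        List<Op> result = new ArrayList<>();
        for (Op child : current.children) {
            if (containsSink(child)) {
                result.add(child);
            }
        }
        return result;
    }

    // The right-hand tree from the diagram: FIL2 -> SEL -> {RS, SEL, SEL}.
    static Op exampleTree() {
        Op selA = new Op("SEL_A", new Op("GBY", new Op("SPARKPRUNINGSINK")));
        Op selB = new Op("SEL_B", new Op("GBY", new Op("SPARKPRUNINGSINK")));
        Op rs = new Op("RS", new Op("JOIN"));
        return new Op("FIL2", new Op("SEL", rs, selA, selB));
    }

    public static void main(String[] args) {
        for (Op branch : sinkBranches(exampleTree())) {
            System.out.println(branch.name); // prints SEL_A, then SEL_B
        }
    }
}
```

The RS branch is skipped because it leads to the JOIN rather than to a pruning sink.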

> Combine op trees for partition info generating tasks [Spark branch]
> ---
>
> Key: HIVE-11297
> URL: https://issues.apache.org/jira/browse/HIVE-11297
> Project: Hive
>  Issue Type: Bug
>Affects Versions: spark-branch
>Reporter: Chao Sun
>Assignee: liyunzhang_intel
> Attachments: HIVE-11297.1.patch, HIVE-11297.2.patch, 
> HIVE-11297.3.patch
>
>


[jira] [Commented] (HIVE-16907) "INSERT INTO" overwrite old data when destination table encapsulated by backquote

2017-06-15 Thread Nemon Lou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16050146#comment-16050146
 ] 

Nemon Lou commented on HIVE-16907:
--

AST with backquote:
{noformat}
| TOK_QUERY |
|    TOK_FROM |
|       TOK_TABREF |
|          TOK_TABNAME |
|             t2 |
|    TOK_INSERT |
|       TOK_INSERT_INTO |
|          TOK_TAB |
|             TOK_TABNAME |
|                tdb.t1 |
|       TOK_SELECT |
|          TOK_SELEXPR |
|             TOK_ALLCOLREF
{noformat}

AST without backquote:
{noformat}
|
| TOK_QUERY |
|    TOK_FROM |
|       TOK_TABREF


[jira] [Commented] (HIVE-16886) HMS log notifications may have duplicated event IDs if multiple HMS are running concurrently

2017-06-15 Thread anishek (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16050122#comment-16050122
 ] 

anishek commented on HIVE-16886:


http://www.datanucleus.org/products/accessplatform_4_1/jdo/versioning.html



> HMS log notifications may have duplicated event IDs if multiple HMS are 
> running concurrently
> 
>
> Key: HIVE-16886
> URL: https://issues.apache.org/jira/browse/HIVE-16886
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Metastore
>Reporter: Sergio Peña
>
> When running multiple Hive Metastore servers and DB notifications are 
> enabled, I could see that notifications can be persisted with a duplicated 
> event ID. 
> This does not happen when running multiple threads in a single HMS node due 
> to the locking acquired on the DbNotificationsLog class, but multiple HMS 
> could cause conflicts.
> The issue is in the ObjectStore#addNotificationEvent() method. The event ID 
> fetched from the datastore is used for the new notification, incremented in 
> the server itself, then persisted or updated back to the datastore. If 2 
> servers read the same ID, then these 2 servers write a new notification with 
> the same ID.
> The event ID is neither unique nor a primary key.
> Here's a test case using the TestObjectStore class that confirms this issue:
> {noformat}
> @Test
> public void testConcurrentAddNotifications() throws ExecutionException, InterruptedException {
>   final int NUM_THREADS = 2;
>   // countIn: every worker is ready; countOut: release all workers at once
>   CountDownLatch countIn = new CountDownLatch(NUM_THREADS);
>   CountDownLatch countOut = new CountDownLatch(1);
>   HiveConf conf = new HiveConf();
>   conf.setVar(HiveConf.ConfVars.METASTORE_EXPRESSION_PROXY_CLASS,
>       MockPartitionExpressionProxy.class.getName());
>   ExecutorService executorService = Executors.newFixedThreadPool(NUM_THREADS);
>   FutureTask<Void> tasks[] = new FutureTask[NUM_THREADS];
>   for (int i = 0; i < NUM_THREADS; ++i) {
>     final int n = i;
>     tasks[i] = new FutureTask<Void>(new Callable<Void>() {
>       @Override
>       public Void call() throws Exception {
>         // Each task uses its own ObjectStore, as a separate HMS would
>         ObjectStore store = new ObjectStore();
>         store.setConf(conf);
>         NotificationEvent dbEvent =
>             new NotificationEvent(0, 0,
>                 EventMessage.EventType.CREATE_DATABASE.toString(), "CREATE DATABASE DB" + n);
>         System.out.println("ADDING NOTIFICATION");
>         countIn.countDown();
>         countOut.await();
>         store.addNotificationEvent(dbEvent);
>         System.out.println("FINISH NOTIFICATION");
>         return null;
>       }
>     });
>     executorService.execute(tasks[i]);
>   }
>   countIn.await();
>   countOut.countDown();
>   for (int i = 0; i < NUM_THREADS; ++i) {
>     tasks[i].get();
>   }
>   NotificationEventResponse eventResponse =
>       objectStore.getNextNotification(new NotificationEventRequest());
>   Assert.assertEquals(2, eventResponse.getEventsSize());
>   Assert.assertEquals(1, eventResponse.getEvents().get(0).getEventId());
>   // This fails because the next notification has an event ID = 1
>   Assert.assertEquals(2, eventResponse.getEvents().get(1).getEventId());
> }
> {noformat}
> The last assertion fails: it expects event ID 2, but the second notification 
> also has event ID 1.
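
The read-increment-write interleaving described in the issue can be replayed deterministically without Hive. The sketch below is only a model: `Datastore`, `Server`, and `replay` are hypothetical names, and the real logic lives in ObjectStore#addNotificationEvent().

```java
// Deterministic replay of the interleaving described in the issue.
// All names here are hypothetical; this is not the ObjectStore code.
class EventIdRace {
    // Stand-in for the datastore row that holds the next event ID.
    static final class Datastore {
        long nextEventId = 1;
    }

    // Each server reads the ID first and writes the incremented value back later.
    static final class Server {
        long idRead;

        void read(Datastore ds) {
            idRead = ds.nextEventId;
        }

        // Uses the ID read earlier for the notification and stores idRead + 1.
        long writeNotification(Datastore ds) {
            ds.nextEventId = idRead + 1;
            return idRead;
        }
    }

    // Returns the event IDs assigned when both servers read before either writes.
    static long[] replay() {
        Datastore ds = new Datastore();
        Server a = new Server();
        Server b = new Server();
        a.read(ds);
        b.read(ds);
        long idA = a.writeNotification(ds);
        long idB = b.writeNotification(ds);
        return new long[] {idA, idB};
    }

    public static void main(String[] args) {
        long[] ids = replay();
        System.out.println(ids[0] + " " + ids[1]); // prints "1 1"
    }
}
```

Both notifications end up with ID 1 because nothing ties the write to the state that was read.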





[jira] [Commented] (HIVE-16886) HMS log notifications may have duplicated event IDs if multiple HMS are running concurrently

2017-06-15 Thread Alexander Kolbasov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16050065#comment-16050065
 ] 

Alexander Kolbasov commented on HIVE-16886:
---

Can you clarify what versioning scheme you have in mind?



[jira] [Commented] (HIVE-16886) HMS log notifications may have duplicated event IDs if multiple HMS are running concurrently

2017-06-15 Thread anishek (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16050043#comment-16050043
 ] 

anishek commented on HIVE-16886:


Another way to achieve the same would be the way Hibernate versioning works; 
there seems to be a JDO versioning feature available that looks similar. 
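
As a rough illustration of the versioning idea, here is a generic optimistic-locking sketch in Java. It does not use the JDO or Hibernate APIs: `VersionedEventId`, `Snapshot`, and `tryUpdate` are hypothetical names, and the `synchronized` methods merely stand in for a conditional UPDATE executed by the database.

```java
// Generic optimistic-versioning sketch; hypothetical names, not the JDO or Hibernate API.
class VersionedEventId {
    private long nextEventId = 1;
    private long version = 0;

    // Immutable view of the row as one server saw it.
    static final class Snapshot {
        final long nextEventId;
        final long version;

        Snapshot(long nextEventId, long version) {
            this.nextEventId = nextEventId;
            this.version = version;
        }
    }

    synchronized Snapshot read() {
        return new Snapshot(nextEventId, version);
    }

    // Models "UPDATE ... SET next_id = ?, version = version + 1 WHERE version = ?".
    synchronized boolean tryUpdate(Snapshot seen) {
        if (version != seen.version) {
            return false; // another writer got there first
        }
        nextEventId = seen.nextEventId + 1;
        version++;
        return true;
    }

    // Retries until the update is applied against an unchanged version.
    long allocate() {
        while (true) {
            Snapshot seen = read();
            if (tryUpdate(seen)) {
                return seen.nextEventId;
            }
        }
    }

    public static void main(String[] args) {
        VersionedEventId row = new VersionedEventId();
        Snapshot a = row.read();
        Snapshot b = row.read();              // both servers see the same state
        System.out.println(row.tryUpdate(a)); // true: first writer wins
        System.out.println(row.tryUpdate(b)); // false: stale version is rejected
        System.out.println(row.allocate());   // 2: the retry reads fresh state
    }
}
```

The second writer is forced to re-read, so two servers cannot both hand out the ID they originally saw.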



[jira] [Commented] (HIVE-16905) Add zookeeper ACL for hiveserver2

2017-06-15 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16050042#comment-16050042
 ] 

Thejas M Nair commented on HIVE-16905:
--

[~txhsj]
Do you have any use cases in mind where the current defaults are not suitable?


> Add zookeeper ACL for hiveserver2
> -
>
> Key: HIVE-16905
> URL: https://issues.apache.org/jira/browse/HIVE-16905
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 3.0.0
>Reporter: Saijin Huang
>Assignee: Saijin Huang
> Attachments: HIVE-16905.1.patch
>
>
> Add zookeeper ACL for hiveserver2





[jira] [Updated] (HIVE-16901) Distcp optimization - One distcp per ReplCopyTask

2017-06-15 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16901:

Description: 
Currently, if a ReplCopyTask is created to copy a list of files, distcp is 
invoked once for each file. Instead, the whole list of source files should be 
passed to the distcp tool, which copies the files in parallel and should 
therefore be considerably faster.

If the copy of the list of files fails, traverse the destination directory to 
find files that are missing or whose checksums do not match, then copy those 
files one by one.

  was:
Currently, if a CopyTask is created to copy a list of files, distcp is 
invoked once for each file. Instead, the whole list of source files should be 
passed to the distcp tool, which copies the files in parallel and should 
therefore be considerably faster.

If the copy of the list of files fails, traverse the destination directory to 
find files that are missing or whose checksums do not match, then copy those 
files one by one.


> Distcp optimization - One distcp per ReplCopyTask 
> --
>
> Key: HIVE-16901
> URL: https://issues.apache.org/jira/browse/HIVE-16901
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
>
> Currently, if a ReplCopyTask is created to copy a list of files, distcp is 
> invoked once for each file. Instead, the whole list of source files should 
> be passed to the distcp tool, which copies the files in parallel and should 
> therefore be considerably faster.
> If the copy of the list of files fails, traverse the destination directory to 
> find files that are missing or whose checksums do not match, then copy those 
> files one by one.
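
The strategy in the description (one bulk copy, then per-file repair on failure) might look roughly like the Java sketch below. It runs against the local filesystem rather than HDFS, and `BulkCopier` is a hypothetical stand-in for a single distcp invocation. It also checks each source file against its destination instead of walking the destination directory, which is an assumption about how the verification could be done, not the patch's approach.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.CRC32;

// Local-filesystem sketch of "bulk copy first, per-file repair on failure".
// BulkCopier is a hypothetical stand-in for a single distcp invocation.
class BulkCopyWithFallback {
    interface BulkCopier {
        void copyAll(List<Path> sources, Path destDir) throws IOException;
    }

    static long checksum(Path file) throws IOException {
        CRC32 crc = new CRC32();
        crc.update(Files.readAllBytes(file));
        return crc.getValue();
    }

    // A file needs repair if it is missing at the destination or its checksum differs.
    static boolean needsCopy(Path src, Path dst) throws IOException {
        return !Files.exists(dst) || checksum(src) != checksum(dst);
    }

    // Returns the sources that had to be copied one by one after a failed bulk copy.
    static List<Path> copy(List<Path> sources, Path destDir, BulkCopier bulk) throws IOException {
        List<Path> repaired = new ArrayList<>();
        try {
            bulk.copyAll(sources, destDir);
        } catch (IOException bulkFailure) {
            for (Path src : sources) {
                Path dst = destDir.resolve(src.getFileName());
                if (needsCopy(src, dst)) {
                    Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
                    repaired.add(src);
                }
            }
        }
        return repaired;
    }

    public static void main(String[] args) throws IOException {
        Path srcDir = Files.createTempDirectory("src");
        Path dstDir = Files.createTempDirectory("dst");
        Path f1 = Files.write(srcDir.resolve("a.txt"), "alpha".getBytes());
        Path f2 = Files.write(srcDir.resolve("b.txt"), "beta".getBytes());
        // Simulated bulk copy that transfers only the first file and then fails.
        BulkCopier flaky = (sources, dest) -> {
            Path first = sources.get(0);
            Files.copy(first, dest.resolve(first.getFileName()));
            throw new IOException("simulated bulk copy failure");
        };
        List<Path> repaired = copy(Arrays.asList(f1, f2), dstDir, flaky);
        System.out.println("repaired: " + repaired.size()); // prints "repaired: 1"
    }
}
```

Only the file that the failed bulk copy left behind is copied again; the file that already arrived intact is skipped.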





[jira] [Updated] (HIVE-16901) Distcp optimization - One distcp per ReplCopyTask

2017-06-15 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16901:

Summary: Distcp optimization - One distcp per ReplCopyTask   (was: Distcp 
optimization - One distcp per CopyTask )



[jira] [Work started] (HIVE-16901) Distcp optimization - One distcp per ReplCopyTask

2017-06-15 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-16901 started by Sankar Hariappan.
---


[jira] [Updated] (HIVE-16905) Add zookeeper ACL for hiveserver2

2017-06-15 Thread Saijin Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saijin Huang updated HIVE-16905:

Status: Patch Available  (was: Open)



[jira] [Updated] (HIVE-16905) Add zookeeper ACL for hiveserver2

2017-06-15 Thread Saijin Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saijin Huang updated HIVE-16905:

Attachment: HIVE-16905.1.patch
