[jira] [Updated] (HIVE-19886) Logs may be directed to 2 files if --hiveconf hive.log.file is used
[ https://issues.apache.org/jira/browse/HIVE-19886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HIVE-19886:
----------------------------------
    Labels: pull-request-available  (was: )

> Logs may be directed to 2 files if --hiveconf hive.log.file is used
> -------------------------------------------------------------------
>
>                 Key: HIVE-19886
>                 URL: https://issues.apache.org/jira/browse/HIVE-19886
>             Project: Hive
>          Issue Type: Bug
>          Components: Logging
>    Affects Versions: 3.1.0, 4.0.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Jaume M
>            Priority: Major
>              Labels: pull-request-available
>
> The hive launch script explicitly specifies the log4j2 configuration file to
> use. The main() methods in HiveServer2 and HiveMetastore reconfigure the
> logger based on user input via --hiveconf hive.log.file. This may cause logs
> to end up in 2 different files: initial logs go to the file specified in
> hive-log4j2.properties, and after logger reconfiguration the rest of the
> logs go to the file specified via --hiveconf hive.log.file.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
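The split described above can be simulated without log4j2 at all. The sketch below is a minimal pure-Java illustration, not Hive's actual code: StringWriters stand in for the two log files, and reassigning the active sink stands in for the logger reconfiguration done in main().

```java
import java.io.StringWriter;

// Minimal sketch of the bug described above: lines logged before the logger
// is reconfigured land in one file, lines logged after land in another.
public class SplitLogSketch {
    static StringWriter active;

    static void log(String msg) { active.append(msg).append('\n'); }

    static String[] run() {
        StringWriter defaultLog = new StringWriter();   // file named in hive-log4j2.properties
        StringWriter overrideLog = new StringWriter();  // file from --hiveconf hive.log.file

        active = defaultLog;
        log("startup message");            // emitted before main() reconfigures logging

        active = overrideLog;              // the reconfiguration point in main()
        log("post-reconfiguration message");

        return new String[] { defaultLog.toString(), overrideLog.toString() };
    }

    public static void main(String[] args) {
        String[] logs = run();
        System.out.println("default log file : " + logs[0].trim());
        System.out.println("override log file: " + logs[1].trim());
    }
}
```

The fix direction implied by the issue is to make both phases agree on a single target file before any logging happens.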
[jira] [Updated] (HIVE-19831) Hiveserver2 should skip doAuth checks for CREATE DATABASE/TABLE if database/table already exists
[ https://issues.apache.org/jira/browse/HIVE-19831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HIVE-19831:
----------------------------------
    Labels: pull-request-available  (was: )

> Hiveserver2 should skip doAuth checks for CREATE DATABASE/TABLE if
> database/table already exists
> ------------------------------------------------------------------
>
>                 Key: HIVE-19831
>                 URL: https://issues.apache.org/jira/browse/HIVE-19831
>             Project: Hive
>          Issue Type: Improvement
>          Components: HiveServer2
>    Affects Versions: 1.2.1, 2.1.0
>            Reporter: Rajkumar Singh
>            Priority: Minor
>              Labels: pull-request-available
>
> With sqlstdauth on, CREATE DATABASE IF NOT EXISTS takes too long if there
> are too many objects inside the database directory. Hive should not run the
> doAuth checks for all the objects within the database if the database
> already exists.
[jira] [Commented] (HIVE-19831) Hiveserver2 should skip doAuth checks for CREATE DATABASE/TABLE if database/table already exists
[ https://issues.apache.org/jira/browse/HIVE-19831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16510293#comment-16510293 ]

ASF GitHub Bot commented on HIVE-19831:
---------------------------------------

GitHub user rajkrrsingh opened a pull request:

    https://github.com/apache/hive/pull/372

    HIVE-19831: Hiveserver2 should skip doAuth checks for CREATE DATABASE…

    Hiveserver2 should skip doAuth checks for CREATE DATABASE/TABLE if
    database/table already exists. The proposed change will skip the
    authorization check if the database already exists.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rajkrrsingh/hive HIVE-19831

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hive/pull/372.patch

To close this pull request, make a commit to your master/trunk branch with
(at least) the following in the commit message:

    This closes #372

----
commit f62c4bbaf9d3bdfd762492fc3fc49772ce8b625a
Author: Rajkumar singh
Date:   2018-06-12T22:02:41Z

    HIVE-19831: Hiveserver2 should skip doAuth checks for CREATE
    DATABASE/TABLE if database/table already exists

> Hiveserver2 should skip doAuth checks for CREATE DATABASE/TABLE if
> database/table already exists
> ------------------------------------------------------------------
>
>                 Key: HIVE-19831
>                 URL: https://issues.apache.org/jira/browse/HIVE-19831
>             Project: Hive
>          Issue Type: Improvement
>          Components: HiveServer2
>    Affects Versions: 1.2.1, 2.1.0
>            Reporter: Rajkumar Singh
>            Priority: Minor
>              Labels: pull-request-available
>
> With sqlstdauth on, CREATE DATABASE IF NOT EXISTS takes too long if there
> are too many objects inside the database directory. Hive should not run the
> doAuth checks for all the objects within the database if the database
> already exists.
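The proposed short-circuit can be sketched in a few lines. This is an illustrative sketch, not Hive's actual code: the names (`databaseExists`, `runDoAuthChecks`) are hypothetical, and a `Set` stands in for the catalog. The point is that when the database already exists, CREATE DATABASE IF NOT EXISTS is a no-op, so the expensive per-object authorization walk can be skipped.

```java
import java.util.Set;

// Hedged sketch of the proposed change: skip the authorization walk entirely
// when the CREATE is a no-op because the database already exists.
public class CreateDbAuthSketch {
    static int authChecksRun = 0;   // counts invocations of the expensive check

    static boolean databaseExists(Set<String> catalog, String db) {
        return catalog.contains(db);
    }

    static void runDoAuthChecks(String db) { authChecksRun++; } // per-object walk (expensive)

    static void createDatabaseIfNotExists(Set<String> catalog, String db) {
        if (databaseExists(catalog, db)) {
            return;                 // proposed: nothing will be created, so skip doAuth
        }
        runDoAuthChecks(db);        // authorize only when something will actually be created
        catalog.add(db);
    }
}
```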
[jira] [Updated] (HIVE-16520) Cache hive metadata in metastore
[ https://issues.apache.org/jira/browse/HIVE-16520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HIVE-16520:
----------------------------------
    Labels: TODOC3.0 pull-request-available  (was: TODOC3.0)

> Cache hive metadata in metastore
> --------------------------------
>
>                 Key: HIVE-16520
>                 URL: https://issues.apache.org/jira/browse/HIVE-16520
>             Project: Hive
>          Issue Type: New Feature
>          Components: Metastore
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>            Priority: Major
>              Labels: TODOC3.0, pull-request-available
>             Fix For: 3.0.0
>
>         Attachments: HIVE-16520-1.patch, HIVE-16520-proto-2.patch,
> HIVE-16520-proto.patch, HIVE-16520.2.patch, HIVE-16520.3.patch,
> HIVE-16520.4.patch
>
> During Hive 2 benchmarks, we found that Hive metastore operations take a
> lot of time and thus slow down Hive compilation. In some extreme cases,
> they take much longer than the actual query run time. In particular, we
> found that the latency of a cloud DB is very high and 90% of total query
> runtime is spent waiting for metastore SQL database operations. Based on
> this observation, metastore operation performance would be greatly enhanced
> by a memory structure which caches the database query results.
[jira] [Commented] (HIVE-16520) Cache hive metadata in metastore
[ https://issues.apache.org/jira/browse/HIVE-16520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16510342#comment-16510342 ]

ASF GitHub Bot commented on HIVE-16520:
---------------------------------------

Github user daijyc closed the pull request at:

    https://github.com/apache/hive/pull/173

> Cache hive metadata in metastore
> --------------------------------
>
>                 Key: HIVE-16520
>                 URL: https://issues.apache.org/jira/browse/HIVE-16520
>             Project: Hive
>          Issue Type: New Feature
>          Components: Metastore
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>            Priority: Major
>              Labels: TODOC3.0, pull-request-available
>             Fix For: 3.0.0
>
>         Attachments: HIVE-16520-1.patch, HIVE-16520-proto-2.patch,
> HIVE-16520-proto.patch, HIVE-16520.2.patch, HIVE-16520.3.patch,
> HIVE-16520.4.patch
>
> During Hive 2 benchmarks, we found that Hive metastore operations take a
> lot of time and thus slow down Hive compilation. In some extreme cases,
> they take much longer than the actual query run time. In particular, we
> found that the latency of a cloud DB is very high and 90% of total query
> runtime is spent waiting for metastore SQL database operations. Based on
> this observation, metastore operation performance would be greatly enhanced
> by a memory structure which caches the database query results.
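The core idea of the feature, a memory structure absorbing repeated metadata queries, can be sketched with a read-through cache. This is an illustrative sketch only, not Hive's CachedStore implementation: the loader function stands in for a SQL round trip, and counting its invocations shows the cache doing its job.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hedged sketch of the caching idea described above: a read-through map in
// front of the metastore's SQL database, so repeated lookups skip the
// (potentially high-latency) database round trip.
public class MetadataCacheSketch {
    private final Map<String, String> cache = new HashMap<>();
    private final Function<String, String> dbLoader;   // stands in for a JDBC query
    int dbHits = 0;                                    // number of actual DB round trips

    MetadataCacheSketch(Function<String, String> dbLoader) { this.dbLoader = dbLoader; }

    String getTableMetadata(String table) {
        // Only consult the database on a cache miss; remember the result.
        return cache.computeIfAbsent(table, t -> { dbHits++; return dbLoader.apply(t); });
    }
}
```

A real implementation additionally has to handle invalidation when metadata changes, which is where most of the complexity in the actual patches lives.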
[jira] [Commented] (HIVE-19880) Repl Load to return recoverable vs non-recoverable error codes
[ https://issues.apache.org/jira/browse/HIVE-19880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16511389#comment-16511389 ]

ASF GitHub Bot commented on HIVE-19880:
---------------------------------------

GitHub user maheshk114 opened a pull request:

    https://github.com/apache/hive/pull/374

    HIVE-19880 : Repl Load to return recoverable vs non-recoverable error…

    …

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/maheshk114/hive BUG-103748

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hive/pull/374.patch

To close this pull request, make a commit to your master/trunk branch with
(at least) the following in the commit message:

    This closes #374

----
commit baaed4a0a3ac4cefe4e8069710bc32a6e3f2c59a
Author: Mahesh Kumar Behera
Date:   2018-06-13T13:19:44Z

    HIVE-19880 : Repl Load to return recoverable vs non-recoverable error codes

> Repl Load to return recoverable vs non-recoverable error codes
> --------------------------------------------------------------
>
>                 Key: HIVE-19880
>                 URL: https://issues.apache.org/jira/browse/HIVE-19880
>             Project: Hive
>          Issue Type: Task
>          Components: repl
>    Affects Versions: 3.1.0, 4.0.0
>            Reporter: mahesh kumar behera
>            Assignee: mahesh kumar behera
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.1.0, 4.0.0
>
>         Attachments: HIVE-19880.01.patch
>
> To enable bootstrap of large databases, the application has to have the
> ability to keep retrying the bootstrap load till it encounters a fatal
> error. Whether an error is fatal or not will be decided by Hive, and
> communicated to the application via error codes. So there should be
> different error codes for recoverable vs non-recoverable failures, which
> should be propagated to the application as part of running the REPL LOAD
> command.
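The retry contract the issue describes can be sketched from the application's side. This is an illustrative sketch, not Hive's actual error-code assignments: the recoverable code range below is hypothetical, and `runWithRetry` stands in for an application driving REPL LOAD.

```java
// Hedged sketch of the contract described above: REPL LOAD surfaces an error
// code, and the application keeps retrying only while the code falls in the
// recoverable band. The band boundaries here are illustrative, not Hive's.
public class ReplRetrySketch {
    static final int RECOVERABLE_MIN = 20000, RECOVERABLE_MAX = 29999;

    static boolean isRecoverable(int errorCode) {
        return errorCode >= RECOVERABLE_MIN && errorCode <= RECOVERABLE_MAX;
    }

    // Simulates retrying a bootstrap load: each element of `outcomes` is the
    // error code of one attempt (0 == success). Returns attempts made before
    // succeeding or hitting a fatal (non-recoverable) code.
    static int runWithRetry(int[] outcomes) {
        int attempts = 0;
        for (int code : outcomes) {
            attempts++;
            if (code == 0 || !isRecoverable(code)) break;   // stop on success or fatal error
        }
        return attempts;
    }
}
```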
[jira] [Commented] (HIVE-19881) Allow metadata dump for database which are not source of replication
[ https://issues.apache.org/jira/browse/HIVE-19881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16511387#comment-16511387 ]

ASF GitHub Bot commented on HIVE-19881:
---------------------------------------

GitHub user maheshk114 opened a pull request:

    https://github.com/apache/hive/pull/373

    HIVE-19881 : Allow metadata dump for database which are not source of replication

    …

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/maheshk114/hive BUG-105280

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hive/pull/373.patch

To close this pull request, make a commit to your master/trunk branch with
(at least) the following in the commit message:

    This closes #373

----
commit 36f6cd4b56bf36a1f5217406bd62de03ab338dda
Author: Mahesh Kumar Behera
Date:   2018-06-13T15:33:33Z

    HIVE-19881 : Allow metadata dump for database which are not source of replication

> Allow metadata dump for database which are not source of replication
> --------------------------------------------------------------------
>
>                 Key: HIVE-19881
>                 URL: https://issues.apache.org/jira/browse/HIVE-19881
>             Project: Hive
>          Issue Type: Task
>          Components: repl
>    Affects Versions: 3.1.0, 4.0.0
>            Reporter: mahesh kumar behera
>            Assignee: mahesh kumar behera
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.1.0, 4.0.0
>
>         Attachments: HIVE-19881.01..patch
>
> If the dump is metadata-only, then allow the dump even if the DB is not a
> source of replication.
[jira] [Updated] (HIVE-19880) Repl Load to return recoverable vs non-recoverable error codes
[ https://issues.apache.org/jira/browse/HIVE-19880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HIVE-19880:
----------------------------------
    Labels: pull-request-available  (was: )

> Repl Load to return recoverable vs non-recoverable error codes
> --------------------------------------------------------------
>
>                 Key: HIVE-19880
>                 URL: https://issues.apache.org/jira/browse/HIVE-19880
>             Project: Hive
>          Issue Type: Task
>          Components: repl
>    Affects Versions: 3.1.0, 4.0.0
>            Reporter: mahesh kumar behera
>            Assignee: mahesh kumar behera
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.1.0, 4.0.0
>
>         Attachments: HIVE-19880.01.patch
>
> To enable bootstrap of large databases, the application has to have the
> ability to keep retrying the bootstrap load till it encounters a fatal
> error. Whether an error is fatal or not will be decided by Hive, and
> communicated to the application via error codes. So there should be
> different error codes for recoverable vs non-recoverable failures, which
> should be propagated to the application as part of running the REPL LOAD
> command.
[jira] [Updated] (HIVE-19881) Allow metadata dump for database which are not source of replication
[ https://issues.apache.org/jira/browse/HIVE-19881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HIVE-19881:
----------------------------------
    Labels: pull-request-available  (was: )

> Allow metadata dump for database which are not source of replication
> --------------------------------------------------------------------
>
>                 Key: HIVE-19881
>                 URL: https://issues.apache.org/jira/browse/HIVE-19881
>             Project: Hive
>          Issue Type: Task
>          Components: repl
>    Affects Versions: 3.1.0, 4.0.0
>            Reporter: mahesh kumar behera
>            Assignee: mahesh kumar behera
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.1.0, 4.0.0
>
>         Attachments: HIVE-19881.01..patch
>
> If the dump is metadata-only, then allow the dump even if the DB is not a
> source of replication.
[jira] [Commented] (HIVE-19739) Bootstrap REPL LOAD to use checkpoints to validate and skip the loaded data/metadata.
[ https://issues.apache.org/jira/browse/HIVE-19739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16512614#comment-16512614 ]

ASF GitHub Bot commented on HIVE-19739:
---------------------------------------

Github user sankarh closed the pull request at:

    https://github.com/apache/hive/pull/366

> Bootstrap REPL LOAD to use checkpoints to validate and skip the loaded
> data/metadata.
> ----------------------------------------------------------------------
>
>                 Key: HIVE-19739
>                 URL: https://issues.apache.org/jira/browse/HIVE-19739
>             Project: Hive
>          Issue Type: Sub-task
>          Components: HiveServer2, repl
>    Affects Versions: 3.0.0
>            Reporter: Sankar Hariappan
>            Assignee: Sankar Hariappan
>            Priority: Major
>              Labels: DR, pull-request-available, replication
>             Fix For: 4.0.0
>
>         Attachments: HIVE-19739.01-branch-3.patch, HIVE-19739.01.patch,
> HIVE-19739.02.patch, HIVE-19739.03.patch, HIVE-19739.04.patch
>
> Currently, bootstrap REPL LOAD adds checkpoint identifiers to
> DB/table/partition object properties once the data/metadata related to the
> object is successfully loaded.
> If the DB exists and is not empty, we currently throw an exception. But
> this needs to be supported for the retry scenario after a failure.
> If there is a retry of bootstrap load using the same dump, then instead of
> throwing an error, we should check whether any of the tables/partitions are
> completely loaded using the checkpoint identifiers. If yes, then skip them;
> otherwise drop/create them again.
> If the bootstrap load is performed using a different dump, then it should
> throw an exception.
> Allow bootstrap on an empty DB only if the checkpoint property is not set.
> Also, if bootstrap load has completed on the target DB, then a bootstrap
> retry shouldn't be allowed at all.
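The decision table in the description (skip on same dump, reload if never completed, fail on a different dump) can be sketched as a small function. This is an illustrative sketch, not Hive's code: the checkpoint property key is a hypothetical placeholder, and a plain map stands in for table properties.

```java
import java.util.Map;

// Hedged sketch of the checkpoint scheme described above: each fully loaded
// object carries the dump directory path as a checkpoint property, so a
// retried bootstrap load with the same dump skips it, while a different
// dump is rejected.
public class BootstrapCheckpointSketch {
    static final String CKPT_PROP = "repl.ckpt.key";   // hypothetical property key

    enum Action { LOAD, SKIP, FAIL }

    static Action decide(Map<String, String> objectProps, String dumpDir) {
        String ckpt = objectProps.get(CKPT_PROP);
        if (ckpt == null) return Action.LOAD;          // never fully loaded: (re)load it
        if (ckpt.equals(dumpDir)) return Action.SKIP;  // same dump, already complete: skip
        return Action.FAIL;                            // different dump: throw an exception
    }
}
```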
[jira] [Commented] (HIVE-19723) Arrow serde: "Unsupported data type: Timestamp(NANOSECOND, null)"
[ https://issues.apache.org/jira/browse/HIVE-19723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16507617#comment-16507617 ]

ASF GitHub Bot commented on HIVE-19723:
---------------------------------------

Github user pudidic closed the pull request at:

    https://github.com/apache/hive/pull/369

> Arrow serde: "Unsupported data type: Timestamp(NANOSECOND, null)"
> -----------------------------------------------------------------
>
>                 Key: HIVE-19723
>                 URL: https://issues.apache.org/jira/browse/HIVE-19723
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Teddy Choi
>            Assignee: Teddy Choi
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.1.0
>
>         Attachments: HIVE-19723.1.patch, HIVE-19723.3.patch,
> HIVE-19723.4.patch, HIVE-19732.2.patch
>
> Spark's Arrow support only provides Timestamp at MICROSECOND granularity.
> Spark 2.3.0 won't accept NANOSECOND. Switch it back to MICROSECOND. The
> unit test org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow will just need to
> change the assertion to test microseconds. And we'll need to add this to
> the documentation on supported data types.
[jira] [Commented] (HIVE-19815) Repl dump should not propagate the checkpoint and repl source properties
[ https://issues.apache.org/jira/browse/HIVE-19815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16507694#comment-16507694 ]

ASF GitHub Bot commented on HIVE-19815:
---------------------------------------

Github user sankarh closed the pull request at:

    https://github.com/apache/hive/pull/367

> Repl dump should not propagate the checkpoint and repl source properties
> ------------------------------------------------------------------------
>
>                 Key: HIVE-19815
>                 URL: https://issues.apache.org/jira/browse/HIVE-19815
>             Project: Hive
>          Issue Type: Sub-task
>          Components: HiveServer2, repl
>    Affects Versions: 3.1.0
>            Reporter: Sankar Hariappan
>            Assignee: Sankar Hariappan
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.1.0, 4.0.0
>
>         Attachments: HIVE-19815.01.patch, HIVE-19815.02.patch
>
> For replication scenarios of A -> B -> C, the repl dump on B should not
> include the checkpoint property when dumping out table information.
> Alter table/partition events during incremental replication should not
> propagate this as well.
> It also should not propagate the DB-level parameters set internally by
> replication.
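The fix amounts to filtering replication-internal keys out of object properties before they are written to the dump. The sketch below is illustrative only: the property names are hypothetical placeholders, not necessarily the keys Hive uses, and a plain map stands in for table properties.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hedged sketch of the fix described above: when the intermediate cluster
// (B in an A -> B -> C chain) dumps table/partition objects, it strips the
// replication-internal properties so they are not propagated downstream.
public class DumpPropertyFilterSketch {
    static final Set<String> REPL_INTERNAL_PROPS = new HashSet<>(
        Arrays.asList("repl.ckpt.key", "repl.source.for"));   // hypothetical keys

    static Map<String, String> forDump(Map<String, String> objectProps) {
        Map<String, String> out = new HashMap<>(objectProps);
        out.keySet().removeAll(REPL_INTERNAL_PROPS);   // do not propagate internal state
        return out;                                    // everything else is dumped as-is
    }
}
```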
[jira] [Commented] (HIVE-19853) Arrow serializer needs to create a TimeStampMicroTZVector instead of TimeStampMicroVector
[ https://issues.apache.org/jira/browse/HIVE-19853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16507849#comment-16507849 ]

ASF GitHub Bot commented on HIVE-19853:
---------------------------------------

GitHub user pudidic opened a pull request:

    https://github.com/apache/hive/pull/371

    HIVE-19853: Arrow serializer needs to create a TimeStampMicroTZVector…

    … instead of TimeStampMicroVector

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/pudidic/hive HIVE-19853

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hive/pull/371.patch

To close this pull request, make a commit to your master/trunk branch with
(at least) the following in the commit message:

    This closes #371

----
commit f785b6d5603d94b126a9611b4a583e4803dd54f7
Author: Teddy Choi
Date:   2018-06-11T09:29:57Z

    HIVE-19853: Arrow serializer needs to create a TimeStampMicroTZVector
    instead of TimeStampMicroVector

> Arrow serializer needs to create a TimeStampMicroTZVector instead of
> TimeStampMicroVector
> --------------------------------------------------------------------
>
>                 Key: HIVE-19853
>                 URL: https://issues.apache.org/jira/browse/HIVE-19853
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Teddy Choi
>            Assignee: Teddy Choi
>            Priority: Major
>              Labels: pull-request-available
>
>         Attachments: HIVE-19853.1.patch
>
> HIVE-19723 changed nanosecond to microsecond in Arrow serialization.
> However, it needs to be microsecond with time zone.
[jira] [Updated] (HIVE-19853) Arrow serializer needs to create a TimeStampMicroTZVector instead of TimeStampMicroVector
[ https://issues.apache.org/jira/browse/HIVE-19853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HIVE-19853:
----------------------------------
    Labels: pull-request-available  (was: )

> Arrow serializer needs to create a TimeStampMicroTZVector instead of
> TimeStampMicroVector
> --------------------------------------------------------------------
>
>                 Key: HIVE-19853
>                 URL: https://issues.apache.org/jira/browse/HIVE-19853
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Teddy Choi
>            Assignee: Teddy Choi
>            Priority: Major
>              Labels: pull-request-available
>
>         Attachments: HIVE-19853.1.patch
>
> HIVE-19723 changed nanosecond to microsecond in Arrow serialization.
> However, it needs to be microsecond with time zone.
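The granularity issue behind HIVE-19723 and HIVE-19853 is that Spark's Arrow support reads timestamps at microsecond (not nanosecond) precision, so nanosecond values must be truncated before serialization. The sketch below illustrates that truncation with `java.time` only; it is not Hive's serializer, whose actual fix is to populate Arrow's TimeStampMicroTZVector (microseconds plus a time zone) instead of TimeStampMicroVector.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hedged illustration of the precision change: convert a nanosecond-precision
// Instant to the epoch-microseconds long that a microsecond-granularity
// Arrow timestamp vector would store.
public class MicrosTimestampSketch {
    static long toEpochMicros(Instant t) {
        Instant truncated = t.truncatedTo(ChronoUnit.MICROS);  // drop sub-microsecond part
        return truncated.getEpochSecond() * 1_000_000L
                + truncated.getNano() / 1_000L;
    }
}
```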
[jira] [Updated] (HIVE-19725) Add ability to dump non-native tables in replication metadata dump
[ https://issues.apache.org/jira/browse/HIVE-19725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HIVE-19725:
----------------------------------
    Labels: Repl pull-request-available  (was: Repl)

> Add ability to dump non-native tables in replication metadata dump
> ------------------------------------------------------------------
>
>                 Key: HIVE-19725
>                 URL: https://issues.apache.org/jira/browse/HIVE-19725
>             Project: Hive
>          Issue Type: Task
>          Components: repl
>    Affects Versions: 3.0.0, 3.1.0, 4.0.0
>            Reporter: mahesh kumar behera
>            Assignee: mahesh kumar behera
>            Priority: Major
>              Labels: Repl, pull-request-available
>             Fix For: 3.1.0, 3.0.1, 4.0.0
>
>         Attachments: HIVE-19725.01.patch
>
> If hive.repl.dump.metadata.only is set to true, allow dumping non-native
> tables also. This will be used by DAS.
> Data dump for non-native tables should never be allowed.
[jira] [Commented] (HIVE-19725) Add ability to dump non-native tables in replication metadata dump
[ https://issues.apache.org/jira/browse/HIVE-19725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493102#comment-16493102 ]

ASF GitHub Bot commented on HIVE-19725:
---------------------------------------

GitHub user maheshk114 opened a pull request:

    https://github.com/apache/hive/pull/361

    HIVE-19725 : Add ability to dump non-native tables in replication metadata dump

    If hive.repl.dump.metadata.only is set to true, allow dumping non-native
    tables also. This will be used by DAS.
    Data dump for non-native tables should never be allowed.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/maheshk114/hive BUG-103509

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hive/pull/361.patch

To close this pull request, make a commit to your master/trunk branch with
(at least) the following in the commit message:

    This closes #361

----
commit ecb7e9637660f69e1503bd8311359b0bb4bad543
Author: Mahesh Kumar Behera
Date:   2018-05-29T04:43:52Z

    HIVE-19725 : Add ability to dump non-native tables in replication metadata dump

> Add ability to dump non-native tables in replication metadata dump
> ------------------------------------------------------------------
>
>                 Key: HIVE-19725
>                 URL: https://issues.apache.org/jira/browse/HIVE-19725
>             Project: Hive
>          Issue Type: Task
>          Components: repl
>    Affects Versions: 3.0.0, 3.1.0, 4.0.0
>            Reporter: mahesh kumar behera
>            Assignee: mahesh kumar behera
>            Priority: Major
>              Labels: Repl, pull-request-available
>             Fix For: 3.1.0, 3.0.1, 4.0.0
>
>         Attachments: HIVE-19725.01.patch
>
> If hive.repl.dump.metadata.only is set to true, allow dumping non-native
> tables also. This will be used by DAS.
> Data dump for non-native tables should never be allowed.
[jira] [Commented] (HIVE-19499) Bootstrap REPL LOAD shall add tasks to create checkpoints for db/tables/partitions.
[ https://issues.apache.org/jira/browse/HIVE-19499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16491724#comment-16491724 ]

ASF GitHub Bot commented on HIVE-19499:
---------------------------------------

Github user sankarh closed the pull request at:

    https://github.com/apache/hive/pull/352

> Bootstrap REPL LOAD shall add tasks to create checkpoints for
> db/tables/partitions.
> -------------------------------------------------------------
>
>                 Key: HIVE-19499
>                 URL: https://issues.apache.org/jira/browse/HIVE-19499
>             Project: Hive
>          Issue Type: Sub-task
>          Components: HiveServer2, repl
>    Affects Versions: 3.0.0
>            Reporter: Sankar Hariappan
>            Assignee: Sankar Hariappan
>            Priority: Major
>              Labels: DR, pull-request-available, replication
>             Fix For: 3.1.0
>
>         Attachments: HIVE-19499.01.patch, HIVE-19499.02.patch
>
> Currently, bootstrap REPL LOAD expects the target database to be empty or
> non-existent to start the bootstrap load.
> But this adds overhead when there is a failure in between bootstrap load,
> and there is no way to resume it from where it failed. So checkpoints need
> to be created in tables/partitions to skip the completely loaded objects.
> Use the fully qualified path of the dump directory as a checkpoint
> identifier. This should be added to the table/partition properties in Hive
> via a task, as the last task in the DAG for table/partition creation.
[jira] [Commented] (HIVE-19661) switch Hive UDFs to use Re2J regex engine
[ https://issues.apache.org/jira/browse/HIVE-19661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494371#comment-16494371 ]

ASF GitHub Bot commented on HIVE-19661:
---------------------------------------

Github user rajkrrsingh closed the pull request at:

    https://github.com/apache/hive/pull/358

> switch Hive UDFs to use Re2J regex engine
> -----------------------------------------
>
>                 Key: HIVE-19661
>                 URL: https://issues.apache.org/jira/browse/HIVE-19661
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 3.0.0
>            Reporter: Rajkumar Singh
>            Assignee: Rajkumar Singh
>            Priority: Major
>              Labels: pull-request-available
>
>         Attachments: HIVE-19661.patch
>
> The Java regex engine can be very slow in some cases, e.g.
> https://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-8203458
[jira] [Commented] (HIVE-19661) switch Hive UDFs to use Re2J regex engine
[ https://issues.apache.org/jira/browse/HIVE-19661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494377#comment-16494377 ]

ASF GitHub Bot commented on HIVE-19661:
---------------------------------------

GitHub user rajkrrsingh opened a pull request:

    https://github.com/apache/hive/pull/362

    HIVE-19661: switch Hive UDFs to use Re2J regex engine.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rajkrrsingh/hive HIVE-19661

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hive/pull/362.patch

To close this pull request, make a commit to your master/trunk branch with
(at least) the following in the commit message:

    This closes #362

----
commit e3280ec23c7ec4a4a69197a776e8cc1b32c53630
Author: Rajkumar singh
Date:   2018-05-29T22:14:51Z

    HIVE-19661: switch Hive UDFs to use Re2J regex engine.

> switch Hive UDFs to use Re2J regex engine
> -----------------------------------------
>
>                 Key: HIVE-19661
>                 URL: https://issues.apache.org/jira/browse/HIVE-19661
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 3.0.0
>            Reporter: Rajkumar Singh
>            Assignee: Rajkumar Singh
>            Priority: Major
>              Labels: pull-request-available
>
>         Attachments: HIVE-19661.patch
>
> The Java regex engine can be very slow in some cases, e.g.
> https://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-8203458
[jira] [Updated] (HIVE-19661) switch Hive UDFs to use Re2J regex engine
[ https://issues.apache.org/jira/browse/HIVE-19661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HIVE-19661:
----------------------------------
    Labels: pull-request-available  (was: )

> switch Hive UDFs to use Re2J regex engine
> -----------------------------------------
>
>                 Key: HIVE-19661
>                 URL: https://issues.apache.org/jira/browse/HIVE-19661
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Rajkumar Singh
>            Assignee: Rajkumar Singh
>            Priority: Major
>              Labels: pull-request-available
>
> The Java regex engine can be very slow in some cases, e.g.
> https://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-8203458
[jira] [Commented] (HIVE-19661) switch Hive UDFs to use Re2J regex engine
[ https://issues.apache.org/jira/browse/HIVE-19661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16491923#comment-16491923 ]

ASF GitHub Bot commented on HIVE-19661:
---------------------------------------

GitHub user rajkrrsingh opened a pull request:

    https://github.com/apache/hive/pull/358

    HIVE-19661 : switch Hive UDFs to use Re2J regex engine

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rajkrrsingh/hive HIVE-19661

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hive/pull/358.patch

To close this pull request, make a commit to your master/trunk branch with
(at least) the following in the commit message:

    This closes #358

----
commit 41cff8d9c5d4ffed7d75388013a2618e787ef7fc
Author: Rajkumar singh
Date:   2018-05-27T05:18:04Z

    HIVE-19661 : switch Hive UDFs to use Re2J regex engine

commit da1a73b40896f84b920db3c90212fa3bbf375a95
Author: Rajkumar singh
Date:   2018-05-27T05:19:35Z

    HIVE-19661 : switch Hive UDFs to use Re2J regex engine

> switch Hive UDFs to use Re2J regex engine
> -----------------------------------------
>
>                 Key: HIVE-19661
>                 URL: https://issues.apache.org/jira/browse/HIVE-19661
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Rajkumar Singh
>            Assignee: Rajkumar Singh
>            Priority: Major
>              Labels: pull-request-available
>
> The Java regex engine can be very slow in some cases, e.g.
> https://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-8203458
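From a UDF's point of view, the switch is small because RE2/J (`com.google.re2j`) deliberately mirrors the `Pattern`/`Matcher` API of `java.util.regex`; for code shaped like the sketch below, the change is essentially swapping the imports, after which matching runs in linear time with no catastrophic backtracking. This is an illustrative sketch, not Hive's UDF code, and it is shown with `java.util.regex` so it runs without the re2j dependency.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hedged sketch of a regex-based UDF helper. With RE2/J, the imports above
// would become com.google.re2j.Pattern / com.google.re2j.Matcher; the call
// sites below stay the same.
public class RegexUdfSketch {
    static boolean matchesFully(String value, String regex) {
        Pattern p = Pattern.compile(regex);   // a real UDF would cache the compiled pattern
        Matcher m = p.matcher(value);
        return m.matches();
    }
}
```

Note that RE2/J supports a subset of Java's regex syntax (no backreferences, for example), which is part of the trade-off behind the switch.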
[jira] [Updated] (HIVE-19776) HiveServer2.startHiveServer2 retries of start has concurrency issues
[ https://issues.apache.org/jira/browse/HIVE-19776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HIVE-19776:
----------------------------------
    Labels: pull-request-available  (was: )

> HiveServer2.startHiveServer2 retries of start has concurrency issues
> --------------------------------------------------------------------
>
>                 Key: HIVE-19776
>                 URL: https://issues.apache.org/jira/browse/HIVE-19776
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Thejas M Nair
>            Assignee: Thejas M Nair
>            Priority: Major
>              Labels: pull-request-available
>
> HS2 starts the thrift binary/http servers in the background while it
> proceeds to do other setup (e.g. creating zookeeper entries). If there is a
> ZK error and it attempts to stop and start in the retry loop within
> HiveServer2.startHiveServer2, the retry fails because the thrift server
> doesn't get stopped if it was still being initialized.
> The thrift server initialization and stopping need to be synchronized.
[jira] [Commented] (HIVE-19776) HiveServer2.startHiveServer2 retries of start has concurrency issues
[ https://issues.apache.org/jira/browse/HIVE-19776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499205#comment-16499205 ]

ASF GitHub Bot commented on HIVE-19776:
---------------------------------------

GitHub user thejasmn opened a pull request:

    https://github.com/apache/hive/pull/363

    HIVE-19776 1.patch

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/thejasmn/hive HIVE-19776

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hive/pull/363.patch

To close this pull request, make a commit to your master/trunk branch with
(at least) the following in the commit message:

    This closes #363

----
commit 85c7505d3719415fb39707c43053fcf19f4fe838
Author: Thejas M Nair
Date:   2018-06-02T19:46:08Z

    HIVE-19776 1.patch

> HiveServer2.startHiveServer2 retries of start has concurrency issues
> --------------------------------------------------------------------
>
>                 Key: HIVE-19776
>                 URL: https://issues.apache.org/jira/browse/HIVE-19776
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Thejas M Nair
>            Assignee: Thejas M Nair
>            Priority: Major
>              Labels: pull-request-available
>
> HS2 starts the thrift binary/http servers in the background while it
> proceeds to do other setup (e.g. creating zookeeper entries). If there is a
> ZK error and it attempts to stop and start in the retry loop within
> HiveServer2.startHiveServer2, the retry fails because the thrift server
> doesn't get stopped if it was still being initialized.
> The thrift server initialization and stopping need to be synchronized.
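The race the issue describes, stop() running while start() is still initializing so the half-initialized server never gets shut down, is the classic case for serializing lifecycle transitions on one lock. The sketch below is illustrative, not HiveServer2's code: the comments mark where real transport setup/teardown would go.

```java
// Hedged sketch of the synchronization the issue calls for: start() and
// stop() share one lock, so stop() can never observe a server that is only
// half initialized, and a retrying caller sees fully-started or
// fully-stopped states only.
public class ThriftServerLifecycleSketch {
    private final Object lifecycleLock = new Object();
    private boolean started = false;

    void start() {
        synchronized (lifecycleLock) {   // initialization cannot interleave with stop()
            // ... bind thrift binary/http transports here ...
            started = true;
        }
    }

    void stop() {
        synchronized (lifecycleLock) {   // blocks until any in-flight start() finishes
            // ... close transports here ...
            started = false;
        }
    }

    boolean isStarted() {
        synchronized (lifecycleLock) { return started; }
    }
}
```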
[jira] [Commented] (HIVE-16391) Publish proper Hive 1.2 jars (without including all dependencies in uber jar)
[ https://issues.apache.org/jira/browse/HIVE-16391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16501408#comment-16501408 ] ASF GitHub Bot commented on HIVE-16391: --- GitHub user jerryshao opened a pull request: https://github.com/apache/hive/pull/364 HIVE-16391: Add a new classifier for hive-exec to be used by Spark This fix adding a new classifier for hive-exec artifact (`core-spark`), which is specifically used for Spark. Details in [SPARK-20202](https://issues.apache.org/jira/browse/SPARK-20202). This is because original hive-exec packages many transitive dependencies into shaded jar without relocation, this makes conflicts in Spark. Spark only needs to relocate protobuf and kryo jar. So here propose to add a new classifier to generate a new artifact only for Spark. You can merge this pull request into a Git repository by running: $ git pull https://github.com/jerryshao/hive 1.2-spark-fix Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/364.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #364 commit bb27b260d82fa0a77d9fea3c123f2af8f1ea88aa Author: jerryshao Date: 2018-06-05T06:59:37Z HIVE-16391: Add a new classifier for hive-exec to be used by Spark > Publish proper Hive 1.2 jars (without including all dependencies in uber jar) > - > > Key: HIVE-16391 > URL: https://issues.apache.org/jira/browse/HIVE-16391 > Project: Hive > Issue Type: Task > Components: Build Infrastructure >Reporter: Reynold Xin >Priority: Major > Labels: pull-request-available > > Apache Spark currently depends on a forked version of Apache Hive. AFAIK, the > only change in the fork is to work around the issue that Hive publishes only > two sets of jars: one set with no dependency declared, and another with all > the dependencies included in the published uber jar. 
That is to say, Hive > doesn't publish a set of jars with the proper dependencies declared. > There is general consensus on both sides that we should remove the forked > Hive. > The change in the forked version is recorded here > https://github.com/JoshRosen/hive/tree/release-1.2.1-spark2 > Note that the fork in the past included other fixes but those have all become > unnecessary. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
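The classifier approach from the pull request can be sketched at the build level. This is an illustrative fragment only, not the actual Hive pom change: the `core-spark` classifier name comes from the PR description, while the relocation package patterns and shaded prefixes are assumptions.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <!-- Attach the shaded jar under its own classifier instead of
         replacing the main artifact. -->
    <shadedArtifactAttached>true</shadedArtifactAttached>
    <shadedClassifierName>core-spark</shadedClassifierName>
    <relocations>
      <!-- Per the PR description, Spark only needs protobuf and kryo
           relocated; the patterns below are illustrative. -->
      <relocation>
        <pattern>com.google.protobuf</pattern>
        <shadedPattern>org.apache.hive.shaded.protobuf</shadedPattern>
      </relocation>
      <relocation>
        <pattern>com.esotericsoftware.kryo</pattern>
        <shadedPattern>org.apache.hive.shaded.kryo</shadedPattern>
      </relocation>
    </relocations>
  </configuration>
</plugin>
```

Consumers such as Spark could then depend on the artifact with `<classifier>core-spark</classifier>` instead of the full uber jar.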
[jira] [Updated] (HIVE-16391) Publish proper Hive 1.2 jars (without including all dependencies in uber jar)
[ https://issues.apache.org/jira/browse/HIVE-16391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-16391: -- Labels: pull-request-available (was: ) > Publish proper Hive 1.2 jars (without including all dependencies in uber jar) > - > > Key: HIVE-16391 > URL: https://issues.apache.org/jira/browse/HIVE-16391 > Project: Hive > Issue Type: Task > Components: Build Infrastructure >Reporter: Reynold Xin >Priority: Major > Labels: pull-request-available > > Apache Spark currently depends on a forked version of Apache Hive. AFAIK, the > only change in the fork is to work around the issue that Hive publishes only > two sets of jars: one set with no dependency declared, and another with all > the dependencies included in the published uber jar. That is to say, Hive > doesn't publish a set of jars with the proper dependencies declared. > There is general consensus on both sides that we should remove the forked > Hive. > The change in the forked version is recorded here > https://github.com/JoshRosen/hive/tree/release-1.2.1-spark2 > Note that the fork in the past included other fixes but those have all become > unnecessary. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19812) Disable external table replication by default via a configuration property
[ https://issues.apache.org/jira/browse/HIVE-19812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16502940#comment-16502940 ] ASF GitHub Bot commented on HIVE-19812: --- GitHub user maheshk114 opened a pull request: https://github.com/apache/hive/pull/365 HIVE-19812 : Disable external table replication by default via a configuration property Use a hive config property to allow external table replication, and set it by default to prevent external table replication. For metadata-only dumps, hive repl always exports metadata for external tables. You can merge this pull request into a Git repository by running: $ git pull https://github.com/maheshk114/hive BUG-104223 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/365.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #365 commit 45ef2edd4a82276e16e2a60d49e655210f3b4c21 Author: Mahesh Kumar Behera Date: 2018-06-06T04:05:28Z HIVE-19812 : Disable external table replication by default via a configuration property > Disable external table replication by default via a configuration property > -- > > Key: HIVE-19812 > URL: https://issues.apache.org/jira/browse/HIVE-19812 > Project: Hive > Issue Type: Task > Components: repl >Affects Versions: 3.1.0, 4.0.0 >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Fix For: 3.1.0, 4.0.0 > > > Use a hive config property to allow external table replication, and set this > property by default to prevent external table replication. > For metadata-only dumps, hive repl always exports metadata for external tables. > > REPL_DUMP_EXTERNAL_TABLES("hive.repl.dump.include.external.tables", false, > "Indicates if repl dump should include information about external tables. It > should be \n" > + "used in conjunction with 'hive.repl.dump.metadata.only' set to false. 
if > 'hive.repl.dump.metadata.only' \n" > + " is set to true then this config parameter has no effect as external table > meta data is flushed \n" > + " always by default.") > This should be done for only replication dump and not for export -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19812) Disable external table replication by default via a configuration property
[ https://issues.apache.org/jira/browse/HIVE-19812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-19812: -- Labels: pull-request-available (was: ) > Disable external table replication by default via a configuration property > -- > > Key: HIVE-19812 > URL: https://issues.apache.org/jira/browse/HIVE-19812 > Project: Hive > Issue Type: Task > Components: repl >Affects Versions: 3.1.0, 4.0.0 >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Fix For: 3.1.0, 4.0.0 > > > use a hive config property to allow external table replication. set this > property by default to prevent external table replication. > for metadata only hive repl always export metadata for external tables. > > REPL_DUMP_EXTERNAL_TABLES("hive.repl.dump.include.external.tables", false, > "Indicates if repl dump should include information about external tables. It > should be \n" > + "used in conjunction with 'hive.repl.dump.metadata.only' set to false. if > 'hive.repl.dump.metadata.only' \n" > + " is set to true then this config parameter has no effect as external table > meta data is flushed \n" > + " always by default.") > This should be done for only replication dump and not for export -- This message was sent by Atlassian JIRA (v7.6.3#76005)
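The interaction between the two configs described above can be sketched as a small decision function. This is a hypothetical helper, not Hive's actual repl dump code; the class and method names are made up for illustration.

```java
// Sketch of the decision implied by the HIVE-19812 description:
// external tables are included in a dump only when explicitly enabled,
// except that metadata-only dumps always flush external table metadata.
public class ReplDumpExternalTableSketch {
    static boolean shouldDumpExternalTable(boolean metadataOnly,
                                           boolean includeExternalTables) {
        if (metadataOnly) {
            // 'hive.repl.dump.include.external.tables' has no effect here:
            // metadata is always dumped for external tables.
            return true;
        }
        // Full dumps honor the flag, which defaults to false.
        return includeExternalTables;
    }
}
```

With the default (`false`) flag, a full dump would simply skip external tables, which is the disabled-by-default behavior the issue asks for.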
[jira] [Commented] (HIVE-19739) Bootstrap REPL LOAD to use checkpoints to validate and skip the loaded data/metadata.
[ https://issues.apache.org/jira/browse/HIVE-19739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16502943#comment-16502943 ] ASF GitHub Bot commented on HIVE-19739: --- GitHub user sankarh opened a pull request: https://github.com/apache/hive/pull/366 HIVE-19739: Bootstrap REPL LOAD to use checkpoints to validate and skip the loaded data/metadata. You can merge this pull request into a Git repository by running: $ git pull https://github.com/sankarh/hive HIVE-19739 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/366.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #366 commit 89d6d9957c200ebad3a78a38400ea70efa5abdc2 Author: Sankar Hariappan Date: 2018-06-04T21:35:38Z HIVE-19739: Bootstrap REPL LOAD to use checkpoints to validate and skip the loaded data/metadata. > Bootstrap REPL LOAD to use checkpoints to validate and skip the loaded > data/metadata. > - > > Key: HIVE-19739 > URL: https://issues.apache.org/jira/browse/HIVE-19739 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, repl >Affects Versions: 3.0.0 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan >Priority: Major > Labels: DR, pull-request-available, replication > Fix For: 4.0.0 > > > Currently, bootstrap REPL LOAD adds checkpoint identifiers in > DB/table/partition object properties once the data/metadata related to the > object is successfully loaded. > If the DB exists and is not empty, then currently we throw an exception, but > we need to support this for the retry scenario after a failure. > If there is a retry of bootstrap load using the same dump, then instead of > throwing an error, we should check whether any of the tables/partitions are > completely loaded using the checkpoint identifiers. If yes, then skip them; > otherwise drop/create them again. 
> If the bootstrap load is performed using a different dump, then it should throw > an exception. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
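The retry rules above can be sketched as a three-way decision per table/partition. This is a hypothetical sketch of the described logic, not the code in the pull request; the names are invented for illustration.

```java
// Sketch of the HIVE-19739 retry decision: a checkpoint identifier
// (the dump id) is stamped on each object once it is fully loaded.
public class BootstrapLoadCheckpointSketch {
    enum Action { SKIP, RELOAD, FAIL }

    // objectCheckpoint: dump id recorded on the object when it was fully
    // loaded, or null if the object was never completely loaded.
    static Action onRetry(String currentDumpId, String objectCheckpoint) {
        if (objectCheckpoint == null) {
            return Action.RELOAD;   // partially loaded: drop and create again
        }
        if (objectCheckpoint.equals(currentDumpId)) {
            return Action.SKIP;     // already fully loaded from this dump
        }
        return Action.FAIL;         // retry with a different dump: error out
    }
}
```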
[jira] [Updated] (HIVE-19569) alter table db1.t1 rename db2.t2 generates MetaStoreEventListener.onDropTable()
[ https://issues.apache.org/jira/browse/HIVE-19569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-19569: -- Labels: pull-request-available (was: ) > alter table db1.t1 rename db2.t2 generates > MetaStoreEventListener.onDropTable() > --- > > Key: HIVE-19569 > URL: https://issues.apache.org/jira/browse/HIVE-19569 > Project: Hive > Issue Type: Bug > Components: Metastore, Standalone Metastore, Transactions >Affects Versions: 3.0.0 >Reporter: Eugene Koifman >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Attachments: HIVE-19569.01.patch > > > When renaming a table within the same DB, this operation causes > {{MetaStoreEventListener.onAlterTable()}} to fire but when changing DB name > for a table it causes {{MetaStoreEventListener.onDropTable()}} + > {{MetaStoreEventListener.onCreateTable()}}. > The files from original table are moved to new table location. > This creates confusing semantics since any logic in {{onDropTable()}} doesn't > know about the larger context, i.e. that there will be a matching > {{onCreateTable()}}. > In particular, this causes a problem for Acid tables since files moved from > old table use WriteIDs that are not meaningful with the context of new table. > Current implementation is due to replication. This should ideally be changed > to raise a "not supported" error for tables that are marked for replication. > cc [~sankarh] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19569) alter table db1.t1 rename db2.t2 generates MetaStoreEventListener.onDropTable()
[ https://issues.apache.org/jira/browse/HIVE-19569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16504187#comment-16504187 ] ASF GitHub Bot commented on HIVE-19569: --- GitHub user maheshk114 opened a pull request: https://github.com/apache/hive/pull/368 HIVE-19569 : alter table db1.t1 rename db2.t2 generates MetaStoreEventListener.onDropTable() changed create/drop table to alter table event You can merge this pull request into a Git repository by running: $ git pull https://github.com/maheshk114/hive BUG-104447 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/368.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #368 commit 498da7cfb697309a30cbfc723a708bfaaa9ef29e Author: Mahesh Kumar Behera Date: 2018-06-06T04:57:39Z HIVE-19569 : alter table db1.t1 rename db2.t2 generates MetaStoreEventListener.onDropTable() > alter table db1.t1 rename db2.t2 generates > MetaStoreEventListener.onDropTable() > --- > > Key: HIVE-19569 > URL: https://issues.apache.org/jira/browse/HIVE-19569 > Project: Hive > Issue Type: Bug > Components: Metastore, Standalone Metastore, Transactions >Affects Versions: 3.0.0 >Reporter: Eugene Koifman >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Attachments: HIVE-19569.01.patch > > > When renaming a table within the same DB, this operation causes > {{MetaStoreEventListener.onAlterTable()}} to fire but when changing DB name > for a table it causes {{MetaStoreEventListener.onDropTable()}} + > {{MetaStoreEventListener.onCreateTable()}}. > The files from original table are moved to new table location. > This creates confusing semantics since any logic in {{onDropTable()}} doesn't > know about the larger context, i.e. that there will be a matching > {{onCreateTable()}}. 
> In particular, this causes a problem for Acid tables since files moved from > old table use WriteIDs that are not meaningful with the context of new table. > Current implementation is due to replication. This should ideally be changed > to raise a "not supported" error for tables that are marked for replication. > cc [~sankarh] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
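The event semantics described above, and the change the pull request makes, can be sketched as a decision function. This is a hypothetical illustration, not Hive's actual listener dispatch code.

```java
// Sketch of HIVE-19569: before the fix, a cross-database rename surfaced
// as onDropTable() + onCreateTable(); the PR changes it to a single
// alter-table event, matching the same-database rename.
public class RenameEventSketch {
    enum Event { ON_ALTER_TABLE, ON_DROP_AND_CREATE }

    static Event eventFor(String srcDb, String dstDb, boolean patched) {
        if (srcDb.equalsIgnoreCase(dstDb) || patched) {
            return Event.ON_ALTER_TABLE;
        }
        return Event.ON_DROP_AND_CREATE;
    }
}
```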
[jira] [Updated] (HIVE-19708) Repl copy retrying with cm path even if the failure is due to network issue
[ https://issues.apache.org/jira/browse/HIVE-19708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-19708: -- Labels: pull-request-available (was: ) > Repl copy retrying with cm path even if the failure is due to network issue > --- > > Key: HIVE-19708 > URL: https://issues.apache.org/jira/browse/HIVE-19708 > Project: Hive > Issue Type: Task > Components: Hive, HiveServer2, repl >Affects Versions: 3.1.0 >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Fix For: 3.1.0 > > Attachments: HIVE-19708.01.patch, HIVE-19708.02.patch > > > * During repl load > ** for filesystem based copying of file if the copy fails due to a > connection error to source Name Node, we should recreate the filesystem > object. > ** the retry logic for local file copy should be triggered using the > original source file path ( and not the CM root path ) since failure can be > due to network issues between DFSClient and NN. > * When listing files in tables / partition to include them in _files, we > should add retry logic when failure occurs. FileSystem object here also > should be recreated since the existing one might be in inconsistent state. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19708) Repl copy retrying with cm path even if the failure is due to network issue
[ https://issues.apache.org/jira/browse/HIVE-19708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16492284#comment-16492284 ] ASF GitHub Bot commented on HIVE-19708: --- GitHub user maheshk114 opened a pull request: https://github.com/apache/hive/pull/359 HIVE-19708 : Repl copy retrying with cm path even if the failure is d… You can merge this pull request into a Git repository by running: $ git pull https://github.com/maheshk114/hive BUG-102280 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/359.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #359 commit fc7377c8f265402f2bdc19ef79f0cf94b4fa5c44 Author: Mahesh Kumar Behera Date: 2018-05-25T03:43:52Z HIVE-19708 : Repl copy retrying with cm path even if the failure is due to network issue > Repl copy retrying with cm path even if the failure is due to network issue > --- > > Key: HIVE-19708 > URL: https://issues.apache.org/jira/browse/HIVE-19708 > Project: Hive > Issue Type: Task > Components: Hive, HiveServer2, repl >Affects Versions: 3.1.0 >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Fix For: 3.1.0 > > Attachments: HIVE-19708.01.patch, HIVE-19708.02.patch > > > * During repl load > ** for filesystem based copying of file if the copy fails due to a > connection error to source Name Node, we should recreate the filesystem > object. > ** the retry logic for local file copy should be triggered using the > original source file path ( and not the CM root path ) since failure can be > due to network issues between DFSClient and NN. > * When listing files in tables / partition to include them in _files, we > should add retry logic when failure occurs. FileSystem object here also > should be recreated since the existing one might be in inconsistent state. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
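The recreate-and-retry pattern described above (a stale FileSystem handle may be in an inconsistent state, so each attempt should start from a fresh one) can be sketched generically. This is an illustrative helper, not the HIVE-19708 patch itself; the names are invented.

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Sketch of retry-with-resource-recreation: instead of reusing a possibly
// inconsistent handle (e.g. a FileSystem object), obtain a fresh one on
// every attempt.
public class RetrySketch {
    static <R, T> T withRetries(Supplier<R> freshResource,
                                Function<R, T> action,
                                int maxAttempts) {
        RuntimeException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                // A new resource is created for each attempt.
                return action.apply(freshResource.get());
            } catch (RuntimeException e) {
                last = e; // transient failure: recreate and try again
            }
        }
        throw last; // all attempts exhausted
    }
}
```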
[jira] [Commented] (HIVE-19723) Arrow serde: "Unsupported data type: Timestamp(NANOSECOND, null)"
[ https://issues.apache.org/jira/browse/HIVE-19723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16492365#comment-16492365 ] ASF GitHub Bot commented on HIVE-19723: --- GitHub user pudidic opened a pull request: https://github.com/apache/hive/pull/360 HIVE-19723: Arrow serde: "Unsupported data type: Timestamp(NANOSECOND, null)" Spark's Arrow support only provides Timestamp at MICROSECOND granularity. Spark 2.3.0 won't accept NANOSECOND. Switch it back to MICROSECOND. The unit test org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow will just need to change the assertion to test microsecond. And we'll need to add this to documentation on supported datatypes. You can merge this pull request into a Git repository by running: $ git pull https://github.com/pudidic/hive HIVE-19723 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/360.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #360 commit 53e4224e05b2b6d96b19451716c41ae1eae7df68 Author: Teddy Choi Date: 2018-05-28T06:56:07Z HIVE-19723: Arrow serde: "Unsupported data type: Timestamp(NANOSECOND, null)" > Arrow serde: "Unsupported data type: Timestamp(NANOSECOND, null)" > - > > Key: HIVE-19723 > URL: https://issues.apache.org/jira/browse/HIVE-19723 > Project: Hive > Issue Type: Bug >Reporter: Teddy Choi >Assignee: Teddy Choi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-19723.1.patch > > > Spark's Arrow support only provides Timestamp at MICROSECOND granularity. > Spark 2.3.0 won't accept NANOSECOND. Switch it back to MICROSECOND. > The unit test org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow will just need > to change the assertion to test microsecond. And we'll need to add this to > documentation on supported datatypes. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19723) Arrow serde: "Unsupported data type: Timestamp(NANOSECOND, null)"
[ https://issues.apache.org/jira/browse/HIVE-19723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-19723: -- Labels: pull-request-available (was: ) > Arrow serde: "Unsupported data type: Timestamp(NANOSECOND, null)" > - > > Key: HIVE-19723 > URL: https://issues.apache.org/jira/browse/HIVE-19723 > Project: Hive > Issue Type: Bug >Reporter: Teddy Choi >Assignee: Teddy Choi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-19723.1.patch > > > Spark's Arrow support only provides Timestamp at MICROSECOND granularity. > Spark 2.3.0 won't accept NANOSECOND. Switch it back to MICROSECOND. > The unit test org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow will just need > to change the assertion to test microsecond. And we'll need to add this to > documentation on supported datatypes. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
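Switching from NANOSECOND to MICROSECOND granularity means truncating sub-microsecond digits. A minimal sketch of the unit conversion involved (illustrative only; not the serde code in the patch):

```java
import java.util.concurrent.TimeUnit;

// Sketch for HIVE-19723: Spark's Arrow support accepts timestamps only at
// MICROSECOND granularity, so nanosecond values must be truncated.
public class TimestampUnitSketch {
    static long toMicros(long epochNanos) {
        // Integer division drops the sub-microsecond remainder.
        return TimeUnit.NANOSECONDS.toMicros(epochNanos);
    }
}
```

This is also why the `TestJdbcWithMiniLlapArrow` assertion mentioned above has to compare at microsecond precision: the last three decimal digits of a nanosecond timestamp are lost.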
[jira] [Commented] (HIVE-17840) HiveMetaStore eats exception if transactionalListeners.notifyEvent fail
[ https://issues.apache.org/jira/browse/HIVE-17840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16529716#comment-16529716 ] ASF GitHub Bot commented on HIVE-17840: --- GitHub user sankarh opened a pull request: https://github.com/apache/hive/pull/385 HIVE-17840: HiveMetaStore eats exception if transactionalListeners.notifyEvent fail. You can merge this pull request into a Git repository by running: $ git pull https://github.com/sankarh/hive HIVE-17840 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/385.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #385 commit 0dbccfe51b83fa1f1842d96fb133be0c5bef4ebf Author: Sankar Hariappan Date: 2018-07-02T11:09:51Z HIVE-17840: HiveMetaStore eats exception if transactionalListeners.notifyEvent fail. > HiveMetaStore eats exception if transactionalListeners.notifyEvent fail > --- > > Key: HIVE-17840 > URL: https://issues.apache.org/jira/browse/HIVE-17840 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Daniel Dai >Assignee: Sankar Hariappan >Priority: Major > Labels: pull-request-available > > For example, in add_partitions_core, if there's an exception in > MetaStoreListenerNotifier.notifyEvent(transactionalListeners,), > the transaction is rolled back but no exception is thrown. The client will assume the add > partition succeeded and take the positive path. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-17840) HiveMetaStore eats exception if transactionalListeners.notifyEvent fail
[ https://issues.apache.org/jira/browse/HIVE-17840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-17840: -- Labels: pull-request-available (was: ) > HiveMetaStore eats exception if transactionalListeners.notifyEvent fail > --- > > Key: HIVE-17840 > URL: https://issues.apache.org/jira/browse/HIVE-17840 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Daniel Dai >Assignee: Sankar Hariappan >Priority: Major > Labels: pull-request-available > > For example, in add_partitions_core, if there's an exception in > MetaStoreListenerNotifier.notifyEvent(transactionalListeners,), > the transaction is rolled back but no exception is thrown. The client will assume the add > partition succeeded and take the positive path. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.
[ https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-17593: -- Labels: pull-request-available (was: ) > DataWritableWriter strip spaces for CHAR type before writing, but predicate > generator doesn't do same thing. > > > Key: HIVE-17593 > URL: https://issues.apache.org/jira/browse/HIVE-17593 > Project: Hive > Issue Type: Bug >Affects Versions: 2.3.0, 3.0.0 >Reporter: Junjie Chen >Assignee: Junjie Chen >Priority: Major > Labels: pull-request-available > Fix For: 3.1.0 > > Attachments: HIVE-17593.patch > > > DataWritableWriter strips spaces for the CHAR type before writing, but when > generating the predicate it does NOT do the same stripping, which could cause > missing data! > In the current version it doesn't cause missing data, since the predicate is not > properly pushed down to parquet due to HIVE-17261. > Please see ConvertAstTosearchArg.java: getTypes treats CHAR and STRING the > same, which builds a predicate with trailing spaces. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.
[ https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16528513#comment-16528513 ] ASF GitHub Bot commented on HIVE-17593: --- GitHub user cjjnjust opened a pull request: https://github.com/apache/hive/pull/383 HIVE-17593: DataWritableWriter strip spaces for CHAR type which cause… Parquet DataWritableWriter strips trailing spaces for the HiveChar type, which causes predicate push down to fail because ConvertAstToSearchArg constructs the predicate with trailing spaces. Actually, according to the HiveChar definition, it should contain the padded value. ParquetOutputFormat can handle trailing spaces through encoding. You can merge this pull request into a Git repository by running: $ git pull https://github.com/cjjnjust/hive HIVE-17593 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/383.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #383 commit 03230c732d657706c6a95f90e16ed5c81d411af7 Author: Chen, Junjie Date: 2018-06-29T23:32:52Z HIVE-17593: DataWritableWriter strip spaces for CHAR type which cause PPD not work > DataWritableWriter strip spaces for CHAR type before writing, but predicate > generator doesn't do same thing. > > > Key: HIVE-17593 > URL: https://issues.apache.org/jira/browse/HIVE-17593 > Project: Hive > Issue Type: Bug >Affects Versions: 2.3.0, 3.0.0 >Reporter: Junjie Chen >Assignee: Junjie Chen >Priority: Major > Labels: pull-request-available > Fix For: 3.1.0 > > Attachments: HIVE-17593.patch > > > DataWritableWriter strips spaces for the CHAR type before writing, but when > generating the predicate it does NOT do the same stripping, which could cause > missing data! > In the current version it doesn't cause missing data, since the predicate is not > properly pushed down to parquet due to HIVE-17261. 
> Please see ConvertAstTosearchArg.java, getTypes treats CHAR and STRING as > same which will build a predicate with tail spaces. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
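The mismatch described above can be demonstrated with a tiny sketch: the writer strips trailing spaces, while the predicate side compares against the fully padded CHAR literal, so an equality predicate can never match. This is an illustrative model of the behavior, not Hive's actual writer or search-argument code.

```java
// Sketch of the HIVE-17593 mismatch between the Parquet writer and the
// predicate generator for CHAR values.
public class CharPaddingSketch {
    // Writer side: trailing spaces of a CHAR value are stripped before
    // the value is written.
    static String stripTrailingSpaces(String padded) {
        int end = padded.length();
        while (end > 0 && padded.charAt(end - 1) == ' ') {
            end--;
        }
        return padded.substring(0, end);
    }

    // Predicate side: the CHAR(n) literal arrives padded to full length,
    // so an equality check against the stripped stored value fails.
    static boolean predicateMatches(String storedValue, String literal) {
        return storedValue.equals(literal);
    }
}
```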
[jira] [Updated] (HIVE-20025) Clean-up of event files created by HiveProtoLoggingHook.
[ https://issues.apache.org/jira/browse/HIVE-20025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-20025: -- Labels: Hive hooks pull-request-available (was: Hive hooks) > Clean-up of event files created by HiveProtoLoggingHook. > > > Key: HIVE-20025 > URL: https://issues.apache.org/jira/browse/HIVE-20025 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 3.0.0 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan >Priority: Major > Labels: Hive, hooks, pull-request-available > Fix For: 4.0.0 > > > Currently, HiveProtoLoggingHook writes event data to HDFS. The number of files > can grow very large. > Since the files are created under a folder with the date as part of the > path, hive should have a way to clean up data older than a certain configured > time / date. This can be a job that runs as infrequently as just > once a day. > This time should be set to 1 week by default. There should also be a sane upper > bound on the number of files so that when a large cluster generates a lot of files > during a spike, we don't make the cluster fall over. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20025) Clean-up of event files created by HiveProtoLoggingHook.
[ https://issues.apache.org/jira/browse/HIVE-20025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16529341#comment-16529341 ] ASF GitHub Bot commented on HIVE-20025: --- GitHub user sankarh opened a pull request: https://github.com/apache/hive/pull/384 HIVE-20025: Clean-up of event files created by HiveProtoLoggingHook. You can merge this pull request into a Git repository by running: $ git pull https://github.com/sankarh/hive HIVE-20025 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/384.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #384 commit 52c24baa28ed305f3be2b47f6246ffede0f08e6e Author: Sankar Hariappan Date: 2018-07-01T17:18:06Z HIVE-20025: Clean-up of event files created by HiveProtoLoggingHook. > Clean-up of event files created by HiveProtoLoggingHook. > > > Key: HIVE-20025 > URL: https://issues.apache.org/jira/browse/HIVE-20025 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 3.0.0 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan >Priority: Major > Labels: Hive, hooks, pull-request-available > Fix For: 4.0.0 > > > Currently, HiveProtoLoggingHook write event data to hdfs. The number of files > can grow to very large numbers. > Since the files are created under a folder with Date being a part of the > path, hive should have a way to clean up data older than a certain configured > time / date. This can be a job that can run with as little frequency as just > once a day. > This time should be set to 1 week default. There should also be a sane upper > bound of # of files so that when a large cluster generates a lot of files > during a spike, we don't force the cluster fall over. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
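The retention rule sketched in the issue (delete event files older than a configured period, one week by default) boils down to a cutoff comparison that a daily cleanup job could apply per file. This is a hypothetical sketch, not the code in the pull request.

```java
import java.time.Duration;
import java.time.Instant;

// Sketch of the HIVE-20025 retention check: a periodic cleanup job would
// delete event files whose modification time falls before now - retention.
public class EventFileRetentionSketch {
    // Default retention suggested in the issue: one week.
    static final Duration RETENTION = Duration.ofDays(7);

    static boolean isExpired(Instant fileModTime, Instant now) {
        return fileModTime.isBefore(now.minus(RETENTION));
    }
}
```

The issue also asks for an upper bound on the total number of files; that would be an additional check on top of this age-based one.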
[jira] [Updated] (HIVE-19316) StatsTask fails due to ClassCastException
[ https://issues.apache.org/jira/browse/HIVE-19316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-19316: -- Labels: pull-request-available (was: ) > StatsTask fails due to ClassCastException > - > > Key: HIVE-19316 > URL: https://issues.apache.org/jira/browse/HIVE-19316 > Project: Hive > Issue Type: Bug > Components: Statistics >Reporter: Rui Li >Assignee: Jaume M >Priority: Major > Labels: pull-request-available > > The stack trace: > {noformat} > 2018-04-26T20:17:37,674 ERROR [pool-7-thread-11] > metastore.RetryingHMSHandler: java.lang.ClassCastException: > org.apache.hadoop.hive.metastore.api.LongColumnStatsData cannot be cast to > org.apache.hadoop.hive.metastore.columnstats.cache.LongColumnStatsDataInspector > at > org.apache.hadoop.hive.metastore.columnstats.merge.LongColumnStatsMerger.merge(LongColumnStatsMerger.java:30) > at > org.apache.hadoop.hive.metastore.utils.MetaStoreUtils.mergeColStats(MetaStoreUtils.java:1052) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.set_aggr_stats_for(HiveMetaStore.java:7202) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) > at com.sun.proxy.$Proxy26.set_aggr_stats_for(Unknown Source) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$set_aggr_stats_for.getResult(ThriftHiveMetastore.java:16795) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$set_aggr_stats_for.getResult(ThriftHiveMetastore.java:16779) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) 
> at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19316) StatsTask fails due to ClassCastException
[ https://issues.apache.org/jira/browse/HIVE-19316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16520623#comment-16520623 ] ASF GitHub Bot commented on HIVE-19316: --- GitHub user beltran opened a pull request: https://github.com/apache/hive/pull/378 HIVE-19316: StatsTask fails due to ClassCastException You can merge this pull request into a Git repository by running: $ git pull https://github.com/beltran/hive HIVE-19316 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/378.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #378 commit a9566b22761aa1da14585ce829e6d65f9d272f48 Author: Jaume Marhuenda Date: 2018-06-22T17:46:35Z HIVE-19316: StatsTask fails due to ClassCastException > StatsTask fails due to ClassCastException > - > > Key: HIVE-19316 > URL: https://issues.apache.org/jira/browse/HIVE-19316 > Project: Hive > Issue Type: Bug > Components: Statistics >Reporter: Rui Li >Assignee: Jaume M >Priority: Major > Labels: pull-request-available > > The stack trace: > {noformat} > 2018-04-26T20:17:37,674 ERROR [pool-7-thread-11] > metastore.RetryingHMSHandler: java.lang.ClassCastException: > org.apache.hadoop.hive.metastore.api.LongColumnStatsData cannot be cast to > org.apache.hadoop.hive.metastore.columnstats.cache.LongColumnStatsDataInspector > at > org.apache.hadoop.hive.metastore.columnstats.merge.LongColumnStatsMerger.merge(LongColumnStatsMerger.java:30) > at > org.apache.hadoop.hive.metastore.utils.MetaStoreUtils.mergeColStats(MetaStoreUtils.java:1052) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.set_aggr_stats_for(HiveMetaStore.java:7202) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) > at com.sun.proxy.$Proxy26.set_aggr_stats_for(Unknown Source) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$set_aggr_stats_for.getResult(ThriftHiveMetastore.java:16795) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$set_aggr_stats_for.getResult(ThriftHiveMetastore.java:16779) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
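The ClassCastException above arises when a plain LongColumnStatsData reaches code that assumes the richer LongColumnStatsDataInspector subclass. A minimal sketch of the failure mode and a defensive fix pattern, using hypothetical stand-in classes (LongStats / LongStatsInspector) rather than the actual Hive metastore types, whose constructors may differ:

```java
// Hypothetical stand-ins for LongColumnStatsData / LongColumnStatsDataInspector.
class LongStats {
    long lowValue;
    long highValue;
    LongStats(long low, long high) { lowValue = low; highValue = high; }
}

class LongStatsInspector extends LongStats {
    LongStatsInspector(long low, long high) { super(low, high); }
    LongStatsInspector(LongStats base) { super(base.lowValue, base.highValue); }
}

class StatsMerger {
    // Unsafe: throws ClassCastException when handed a plain LongStats,
    // analogous to the downcast in LongColumnStatsMerger.merge above.
    static LongStatsInspector unsafeCast(LongStats s) {
        return (LongStatsInspector) s;
    }

    // Defensive: cast only when the runtime type already matches,
    // otherwise wrap the plain object into the richer subclass.
    static LongStatsInspector toInspector(LongStats s) {
        if (s instanceof LongStatsInspector) {
            return (LongStatsInspector) s;
        }
        return new LongStatsInspector(s);
    }
}
```

This is only an illustration of the instanceof-guarded-cast pattern; the actual patch in PR #378 may resolve the mismatch differently.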
[jira] [Updated] (HIVE-19970) Replication dump has a NPE when table is empty
[ https://issues.apache.org/jira/browse/HIVE-19970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-19970: -- Labels: pull-request-available (was: ) > Replication dump has a NPE when table is empty > -- > > Key: HIVE-19970 > URL: https://issues.apache.org/jira/browse/HIVE-19970 > Project: Hive > Issue Type: Task > Components: repl >Affects Versions: 3.1.0, 4.0.0 >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Fix For: 3.1.0, 4.0.0 > > Attachments: HIVE-19970.01.patch > > > if table directory or partition directory is missing ..dump is throwing NPE > instead of file missing exception. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19970) Replication dump has a NPE when table is empty
[ https://issues.apache.org/jira/browse/HIVE-19970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16521408#comment-16521408 ] ASF GitHub Bot commented on HIVE-19970: --- GitHub user maheshk114 opened a pull request: https://github.com/apache/hive/pull/379 HIVE-19970 : Replication dump has a NPE when table is empty You can merge this pull request into a Git repository by running: $ git pull https://github.com/maheshk114/hive BUG-105903 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/379.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #379 commit a1ad4c1d068ad1cd5ce6b8ec7170bff1b1a4f1b1 Author: Mahesh Kumar Behera Date: 2018-06-22T20:04:15Z HIVE-19970 : Replication dump has a NPE when table is empty > Replication dump has a NPE when table is empty > -- > > Key: HIVE-19970 > URL: https://issues.apache.org/jira/browse/HIVE-19970 > Project: Hive > Issue Type: Task > Components: repl >Affects Versions: 3.1.0, 4.0.0 >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Fix For: 3.1.0, 4.0.0 > > Attachments: HIVE-19970.01.patch > > > if table directory or partition directory is missing ..dump is throwing NPE > instead of file missing exception. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
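The complaint above is that a missing table or partition directory surfaces as a bare NullPointerException instead of a clear file-missing error. A hedged sketch of the desired behavior (the helper name and use of java.io.File are illustrative, not the actual repl dump code): File.listFiles() returns null for a nonexistent directory, and dereferencing that null is exactly the kind of NPE reported.

```java
import java.io.File;
import java.io.FileNotFoundException;

class DumpDirCheck {
    // Surface a clear "missing directory" error instead of letting the
    // null listing propagate as a NullPointerException downstream.
    static File[] listDataFiles(File dir) throws FileNotFoundException {
        File[] files = dir.listFiles(); // null when dir does not exist
        if (files == null) {
            throw new FileNotFoundException(
                "Table/partition directory missing: " + dir);
        }
        return files;
    }

    // Convenience used below: true iff the missing dir produces the
    // explicit exception rather than an NPE.
    static boolean failsWithClearError(File dir) {
        try {
            listDataFiles(dir);
            return false;
        } catch (FileNotFoundException e) {
            return true;
        }
    }
}
```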
[jira] [Updated] (HIVE-20001) With doas set to true, running select query as hrt_qa user on external table fails due to permission denied to read /warehouse/tablespace/managed directory.
[ https://issues.apache.org/jira/browse/HIVE-20001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-20001: -- Labels: pull-request-available (was: ) > With doas set to true, running select query as hrt_qa user on external table > fails due to permission denied to read /warehouse/tablespace/managed > directory. > > > Key: HIVE-20001 > URL: https://issues.apache.org/jira/browse/HIVE-20001 > Project: Hive > Issue Type: Bug >Reporter: Jaume M >Assignee: Jaume M >Priority: Major > Labels: pull-request-available > Attachments: HIVE-20001.1.patch > > > Hive: With doas set to true, running select query as hrt_qa user on external > table fails due to permission denied to read /warehouse/tablespace/managed > directory. > Steps: > 1. Create a external table. > 2. Set doas to true. > 3. run select count(*) using user hrt_qa. > Table creation query. > {code} > beeline -n hrt_qa -p pwd -u > "jdbc:hive2://ctr-e138-1518143905142-375925-01-06.hwx.site:2181,ctr-e138-1518143905142-375925-01-05.hwx.site:2181,ctr-e138-1518143905142-375925-01-07.hwx.site:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@example.com;transportMode=http;httpPath=cliservice;ssl=true;sslTrustStore=/etc/security/serverKeys/hivetruststore.jks;trustStorePassword=changeit" > --outputformat=tsv -e "drop table if exists test_table purge; > create external table test_table(id int, age int) row format delimited fields > terminated by '|' stored as textfile; > load data inpath '/tmp/table1.dat' overwrite into table test_table; > {code} > select count(*) query execution fails > {code} > beeline -n hrt_qa -p pwd -u > 
"jdbc:hive2://ctr-e138-1518143905142-375925-01-06.hwx.site:2181,ctr-e138-1518143905142-375925-01-05.hwx.site:2181,ctr-e138-1518143905142-375925-01-07.hwx.site:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@example.com;transportMode=http;httpPath=cliservice;ssl=true;sslTrustStore=/etc/security/serverKeys/hivetruststore.jks;trustStorePassword=changeit" > --outputformat=tsv -e "select count(*) from test_table where age>30 and > id<10100;" > 2018-06-22 10:22:29,328|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|SLF4J: Class path contains > multiple SLF4J bindings. > 2018-06-22 10:22:29,330|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|SLF4J: See > http://www.slf4j.org/codes.html#multiple_bindings for an explanation. > 2018-06-22 10:22:29,335|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|SLF4J: Actual binding is of > type [org.apache.logging.slf4j.Log4jLoggerFactory] > 2018-06-22 10:22:31,408|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|Format tsv is deprecated, > please use tsv2 > 2018-06-22 10:22:31,529|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|Connecting to > jdbc:hive2://ctr-e138-1518143905142-375925-01-06.hwx.site:2181,ctr-e138-1518143905142-375925-01-05.hwx.site:2181,ctr-e138-1518143905142-375925-01-07.hwx.site:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@example.com;transportMode=http;httpPath=cliservice;ssl=true;sslTrustStore=/etc/security/serverKeys/hivetruststore.jks;trustStorePassword=changeit > 2018-06-22 10:22:32,031|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|18/06/22 10:22:32 [main]: > INFO jdbc.HiveConnection: Connected to > ctr-e138-1518143905142-375925-01-04.hwx.site:10001 > 2018-06-22 10:22:34,130|INFO|Thread-126|machine.py:111 - > 
tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|18/06/22 10:22:34 [main]: > WARN jdbc.HiveConnection: Failed to connect to > ctr-e138-1518143905142-375925-01-04.hwx.site:10001 > 2018-06-22 10:22:34,244|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|18/06/22 10:22:34 [main]: > WARN jdbc.HiveConnection: Could not open client transport with JDBC Uri: > jdbc:hive2://ctr-e138-1518143905142-375925-01-04.hwx.site:10001/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@example.com;transportMode=http;httpPath=cliservice;ssl=true;sslTrustStore=/etc/security/serverKeys/hivetruststore.jks;trustStorePassword=changeit: > Failed to open new session: > org.apache.hadoop.hive.ql.metadata.HiveException: > MetaException(message:java.security.AccessControlException: Permission > denied: user=hrt_qa, access=READ, >
[jira] [Commented] (HIVE-20001) With doas set to true, running select query as hrt_qa user on external table fails due to permission denied to read /warehouse/tablespace/managed directory.
[ https://issues.apache.org/jira/browse/HIVE-20001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524295#comment-16524295 ] ASF GitHub Bot commented on HIVE-20001: --- GitHub user beltran opened a pull request: https://github.com/apache/hive/pull/380 HIVE-20001: With doas set to true, running select query as hrt_qa use… …r on external table fails due to permission denied to read /warehouse/tablespace/managed directory You can merge this pull request into a Git repository by running: $ git pull https://github.com/beltran/hive HIVE-20001 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/380.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #380 commit 3e9dd9a73ae9d33e2f291819b0e10e4296f2b568 Author: Jaume Marhuenda Date: 2018-06-26T22:42:14Z HIVE-20001: With doas set to true, running select query as hrt_qa user on external table fails due to permission denied to read /warehouse/tablespace/managed directory > With doas set to true, running select query as hrt_qa user on external table > fails due to permission denied to read /warehouse/tablespace/managed > directory. > > > Key: HIVE-20001 > URL: https://issues.apache.org/jira/browse/HIVE-20001 > Project: Hive > Issue Type: Bug >Reporter: Jaume M >Assignee: Jaume M >Priority: Major > Labels: pull-request-available > Attachments: HIVE-20001.1.patch > > > Hive: With doas set to true, running select query as hrt_qa user on external > table fails due to permission denied to read /warehouse/tablespace/managed > directory. > Steps: > 1. Create a external table. > 2. Set doas to true. > 3. run select count(*) using user hrt_qa. > Table creation query. 
> {code} > beeline -n hrt_qa -p pwd -u > "jdbc:hive2://ctr-e138-1518143905142-375925-01-06.hwx.site:2181,ctr-e138-1518143905142-375925-01-05.hwx.site:2181,ctr-e138-1518143905142-375925-01-07.hwx.site:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@example.com;transportMode=http;httpPath=cliservice;ssl=true;sslTrustStore=/etc/security/serverKeys/hivetruststore.jks;trustStorePassword=changeit" > --outputformat=tsv -e "drop table if exists test_table purge; > create external table test_table(id int, age int) row format delimited fields > terminated by '|' stored as textfile; > load data inpath '/tmp/table1.dat' overwrite into table test_table; > {code} > select count(*) query execution fails > {code} > beeline -n hrt_qa -p pwd -u > "jdbc:hive2://ctr-e138-1518143905142-375925-01-06.hwx.site:2181,ctr-e138-1518143905142-375925-01-05.hwx.site:2181,ctr-e138-1518143905142-375925-01-07.hwx.site:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@example.com;transportMode=http;httpPath=cliservice;ssl=true;sslTrustStore=/etc/security/serverKeys/hivetruststore.jks;trustStorePassword=changeit" > --outputformat=tsv -e "select count(*) from test_table where age>30 and > id<10100;" > 2018-06-22 10:22:29,328|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|SLF4J: Class path contains > multiple SLF4J bindings. > 2018-06-22 10:22:29,330|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|SLF4J: See > http://www.slf4j.org/codes.html#multiple_bindings for an explanation. 
> 2018-06-22 10:22:29,335|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|SLF4J: Actual binding is of > type [org.apache.logging.slf4j.Log4jLoggerFactory] > 2018-06-22 10:22:31,408|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|Format tsv is deprecated, > please use tsv2 > 2018-06-22 10:22:31,529|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|Connecting to > jdbc:hive2://ctr-e138-1518143905142-375925-01-06.hwx.site:2181,ctr-e138-1518143905142-375925-01-05.hwx.site:2181,ctr-e138-1518143905142-375925-01-07.hwx.site:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@example.com;transportMode=http;httpPath=cliservice;ssl=true;sslTrustStore=/etc/security/serverKeys/hivetruststore.jks;trustStorePassword=changeit > 2018-06-22 10:22:32,031|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|18/06/22 10:22:32 [main]: > INFO jdbc.HiveConnection: Connected to > ctr-e138-1518143905142-375925-01-04.hwx.site:10001 > 2018-06-22
[jira] [Commented] (HIVE-20001) With doas set to true, running select query as hrt_qa user on external table fails due to permission denied to read /warehouse/tablespace/managed directory.
[ https://issues.apache.org/jira/browse/HIVE-20001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16530503#comment-16530503 ] ASF GitHub Bot commented on HIVE-20001: --- GitHub user beltran opened a pull request: https://github.com/apache/hive/pull/389 HIVE-20001: With doas set to true, running select query as hrt_qa use… …r on external table fails due to permission denied to read /warehouse/tablespace/managed directory You can merge this pull request into a Git repository by running: $ git pull https://github.com/beltran/hive HIVE-20001-3 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/389.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #389 commit 764d76de31f83b2c5985ad29410135faf4e32998 Author: Jaume Marhuenda Date: 2018-07-01T03:47:15Z HIVE-20001: With doas set to true, running select query as hrt_qa user on external table fails due to permission denied to read /warehouse/tablespace/managed directory > With doas set to true, running select query as hrt_qa user on external table > fails due to permission denied to read /warehouse/tablespace/managed > directory. > > > Key: HIVE-20001 > URL: https://issues.apache.org/jira/browse/HIVE-20001 > Project: Hive > Issue Type: Bug >Reporter: Jaume M >Assignee: Jaume M >Priority: Major > Labels: pull-request-available > Attachments: HIVE-20001.1.patch, HIVE-20001.1.patch, > HIVE-20001.2.patch, HIVE-20001.3.patch > > > Hive: With doas set to true, running select query as hrt_qa user on external > table fails due to permission denied to read /warehouse/tablespace/managed > directory. > Steps: > 1. Create a external table. > 2. Set doas to true. > 3. run select count(*) using user hrt_qa. > Table creation query. 
> {code} > beeline -n hrt_qa -p pwd -u > "jdbc:hive2://ctr-e138-1518143905142-375925-01-06.hwx.site:2181,ctr-e138-1518143905142-375925-01-05.hwx.site:2181,ctr-e138-1518143905142-375925-01-07.hwx.site:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@example.com;transportMode=http;httpPath=cliservice;ssl=true;sslTrustStore=/etc/security/serverKeys/hivetruststore.jks;trustStorePassword=changeit" > --outputformat=tsv -e "drop table if exists test_table purge; > create external table test_table(id int, age int) row format delimited fields > terminated by '|' stored as textfile; > load data inpath '/tmp/table1.dat' overwrite into table test_table; > {code} > select count(*) query execution fails > {code} > beeline -n hrt_qa -p pwd -u > "jdbc:hive2://ctr-e138-1518143905142-375925-01-06.hwx.site:2181,ctr-e138-1518143905142-375925-01-05.hwx.site:2181,ctr-e138-1518143905142-375925-01-07.hwx.site:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@example.com;transportMode=http;httpPath=cliservice;ssl=true;sslTrustStore=/etc/security/serverKeys/hivetruststore.jks;trustStorePassword=changeit" > --outputformat=tsv -e "select count(*) from test_table where age>30 and > id<10100;" > 2018-06-22 10:22:29,328|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|SLF4J: Class path contains > multiple SLF4J bindings. > 2018-06-22 10:22:29,330|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|SLF4J: See > http://www.slf4j.org/codes.html#multiple_bindings for an explanation. 
> 2018-06-22 10:22:29,335|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|SLF4J: Actual binding is of > type [org.apache.logging.slf4j.Log4jLoggerFactory] > 2018-06-22 10:22:31,408|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|Format tsv is deprecated, > please use tsv2 > 2018-06-22 10:22:31,529|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|Connecting to > jdbc:hive2://ctr-e138-1518143905142-375925-01-06.hwx.site:2181,ctr-e138-1518143905142-375925-01-05.hwx.site:2181,ctr-e138-1518143905142-375925-01-07.hwx.site:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@example.com;transportMode=http;httpPath=cliservice;ssl=true;sslTrustStore=/etc/security/serverKeys/hivetruststore.jks;trustStorePassword=changeit > 2018-06-22 10:22:32,031|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|18/06/22 10:22:32 [main]: > INFO jdbc.HiveConnection: Connected to >
[jira] [Commented] (HIVE-20001) With doas set to true, running select query as hrt_qa user on external table fails due to permission denied to read /warehouse/tablespace/managed directory.
[ https://issues.apache.org/jira/browse/HIVE-20001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16530276#comment-16530276 ] ASF GitHub Bot commented on HIVE-20001: --- GitHub user beltran opened a pull request: https://github.com/apache/hive/pull/387 HIVE-20001: With doas set to true, running select query as hrt_qa use… …r on external table fails due to permission denied to read /warehouse/tablespace/managed directory Special attention to whether the appropriate create/upgrade sql scripts have been modified. You can merge this pull request into a Git repository by running: $ git pull https://github.com/beltran/hive HIVE-20001-2 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/387.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #387 commit 7b722374af80172f947f78a80a11565d13ddd4a7 Author: Jaume Marhuenda Date: 2018-07-01T03:47:15Z HIVE-20001: With doas set to true, running select query as hrt_qa user on external table fails due to permission denied to read /warehouse/tablespace/managed directory > With doas set to true, running select query as hrt_qa user on external table > fails due to permission denied to read /warehouse/tablespace/managed > directory. > > > Key: HIVE-20001 > URL: https://issues.apache.org/jira/browse/HIVE-20001 > Project: Hive > Issue Type: Bug >Reporter: Jaume M >Assignee: Jaume M >Priority: Major > Labels: pull-request-available > Attachments: HIVE-20001.1.patch, HIVE-20001.1.patch > > > Hive: With doas set to true, running select query as hrt_qa user on external > table fails due to permission denied to read /warehouse/tablespace/managed > directory. > Steps: > 1. Create a external table. > 2. Set doas to true. > 3. run select count(*) using user hrt_qa. > Table creation query. 
> {code} > beeline -n hrt_qa -p pwd -u > "jdbc:hive2://ctr-e138-1518143905142-375925-01-06.hwx.site:2181,ctr-e138-1518143905142-375925-01-05.hwx.site:2181,ctr-e138-1518143905142-375925-01-07.hwx.site:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@example.com;transportMode=http;httpPath=cliservice;ssl=true;sslTrustStore=/etc/security/serverKeys/hivetruststore.jks;trustStorePassword=changeit" > --outputformat=tsv -e "drop table if exists test_table purge; > create external table test_table(id int, age int) row format delimited fields > terminated by '|' stored as textfile; > load data inpath '/tmp/table1.dat' overwrite into table test_table; > {code} > select count(*) query execution fails > {code} > beeline -n hrt_qa -p pwd -u > "jdbc:hive2://ctr-e138-1518143905142-375925-01-06.hwx.site:2181,ctr-e138-1518143905142-375925-01-05.hwx.site:2181,ctr-e138-1518143905142-375925-01-07.hwx.site:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@example.com;transportMode=http;httpPath=cliservice;ssl=true;sslTrustStore=/etc/security/serverKeys/hivetruststore.jks;trustStorePassword=changeit" > --outputformat=tsv -e "select count(*) from test_table where age>30 and > id<10100;" > 2018-06-22 10:22:29,328|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|SLF4J: Class path contains > multiple SLF4J bindings. > 2018-06-22 10:22:29,330|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|SLF4J: See > http://www.slf4j.org/codes.html#multiple_bindings for an explanation. 
> 2018-06-22 10:22:29,335|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|SLF4J: Actual binding is of > type [org.apache.logging.slf4j.Log4jLoggerFactory] > 2018-06-22 10:22:31,408|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|Format tsv is deprecated, > please use tsv2 > 2018-06-22 10:22:31,529|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|Connecting to > jdbc:hive2://ctr-e138-1518143905142-375925-01-06.hwx.site:2181,ctr-e138-1518143905142-375925-01-05.hwx.site:2181,ctr-e138-1518143905142-375925-01-07.hwx.site:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@example.com;transportMode=http;httpPath=cliservice;ssl=true;sslTrustStore=/etc/security/serverKeys/hivetruststore.jks;trustStorePassword=changeit > 2018-06-22 10:22:32,031|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|18/06/22 10:22:32 [main]: > INFO jdbc.HiveConnection:
[jira] [Commented] (HIVE-20057) For ALTER TABLE t SET TBLPROPERTIES ('EXTERNAL'='TRUE'); `TBL_TYPE` attribute change not reflecting for non-CAPS
[ https://issues.apache.org/jira/browse/HIVE-20057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16530379#comment-16530379 ] ASF GitHub Bot commented on HIVE-20057: --- GitHub user animenon opened a pull request: https://github.com/apache/hive/pull/388 HIVE-20057: Fix Hive table conversion DESCRIBE table bug Fix for #HIVE-20057 Issue: `Table Type` wrongly shown as `MANAGED_TABLE` after converting table from MANAGED to EXTERNAL using ` ALTER TABLE t SET TBLPROPERTIES ('EXTERNAL'='True')` _(this is shown correctly only for `'EXTERNAL'='TRUE`)_ You can merge this pull request into a Git repository by running: $ git pull https://github.com/animenon/hive patch-1 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/388.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #388 commit 1a9674645c3b4e3080f5278f6bea3126b1cebbac Author: Anirudh Date: 2018-07-02T19:53:03Z HIVE-20057: Fix Hive table conversion DESCRIBE table bug `equals` to `equalsIgnoreCase` > For ALTER TABLE t SET TBLPROPERTIES ('EXTERNAL'='TRUE'); `TBL_TYPE` attribute > change not reflecting for non-CAPS > > > Key: HIVE-20057 > URL: https://issues.apache.org/jira/browse/HIVE-20057 > Project: Hive > Issue Type: Bug > Components: Standalone Metastore >Affects Versions: All Versions >Reporter: Anirudh >Assignee: Anirudh >Priority: Minor > Labels: pull-request-available > > Hive EXTERNAL table shown as MANAGED after conversion using > > ALTER TABLE t SET TBLPROPERTIES ('EXTERNAL'='True') > > The DESCRIBE FORMATTED shows: > Table Type: MANAGED_TABLE > Table Parameters: > EXTERNAL True > > This is actually a External table but shown wrongly as 'True' was used in > place of 'TRUE' in the ALTER statement. 
> Issue explained here: > [StackOverflow - Hive Table is MANAGED or > EXTERNAL|https://stackoverflow.com/questions/51103317/hive-table-is-managed-or-external/51142873#51142873] > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
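The commit message above names the fix: a case-sensitive `equals` on the 'EXTERNAL' property value is replaced with `equalsIgnoreCase`, so 'True' and 'true' flip the table type just like 'TRUE'. A minimal illustration (the method names here are invented for the example, not the metastore's actual ones):

```java
class ExternalFlag {
    // Buggy behavior: only the exact string 'TRUE' is recognized,
    // so ALTER TABLE ... ('EXTERNAL'='True') leaves the table MANAGED.
    static boolean isExternalBuggy(String value) {
        return "TRUE".equals(value);
    }

    // Fixed behavior: any casing of 'true' converts the table.
    static boolean isExternalFixed(String value) {
        return "TRUE".equalsIgnoreCase(value);
    }
}
```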
[jira] [Updated] (HIVE-20057) For ALTER TABLE t SET TBLPROPERTIES ('EXTERNAL'='TRUE'); `TBL_TYPE` attribute change not reflecting for non-CAPS
[ https://issues.apache.org/jira/browse/HIVE-20057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-20057: -- Labels: pull-request-available (was: ) > For ALTER TABLE t SET TBLPROPERTIES ('EXTERNAL'='TRUE'); `TBL_TYPE` attribute > change not reflecting for non-CAPS > > > Key: HIVE-20057 > URL: https://issues.apache.org/jira/browse/HIVE-20057 > Project: Hive > Issue Type: Bug > Components: Standalone Metastore >Affects Versions: All Versions >Reporter: Anirudh >Assignee: Anirudh >Priority: Minor > Labels: pull-request-available > > Hive EXTERNAL table shown as MANAGED after conversion using > > ALTER TABLE t SET TBLPROPERTIES ('EXTERNAL'='True') > > The DESCRIBE FORMATTED shows: > Table Type: MANAGED_TABLE > Table Parameters: > EXTERNAL True > > This is actually a External table but shown wrongly as 'True' was used in > place of 'TRUE' in the ALTER statement. > Issue explained here: > [StakOverflow - Hive Table is MANAGED or > EXTERNAL|https://stackoverflow.com/questions/51103317/hive-table-is-managed-or-external/51142873#51142873] > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20001) With doas set to true, running select query as hrt_qa user on external table fails due to permission denied to read /warehouse/tablespace/managed directory.
[ https://issues.apache.org/jira/browse/HIVE-20001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16530360#comment-16530360 ] ASF GitHub Bot commented on HIVE-20001: --- Github user beltran closed the pull request at: https://github.com/apache/hive/pull/387 > With doas set to true, running select query as hrt_qa user on external table > fails due to permission denied to read /warehouse/tablespace/managed > directory. > > > Key: HIVE-20001 > URL: https://issues.apache.org/jira/browse/HIVE-20001 > Project: Hive > Issue Type: Bug >Reporter: Jaume M >Assignee: Jaume M >Priority: Major > Labels: pull-request-available > Attachments: HIVE-20001.1.patch, HIVE-20001.1.patch, > HIVE-20001.2.patch > > > Hive: With doas set to true, running select query as hrt_qa user on external > table fails due to permission denied to read /warehouse/tablespace/managed > directory. > Steps: > 1. Create a external table. > 2. Set doas to true. > 3. run select count(*) using user hrt_qa. > Table creation query. 
> {code} > beeline -n hrt_qa -p pwd -u > "jdbc:hive2://ctr-e138-1518143905142-375925-01-06.hwx.site:2181,ctr-e138-1518143905142-375925-01-05.hwx.site:2181,ctr-e138-1518143905142-375925-01-07.hwx.site:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@example.com;transportMode=http;httpPath=cliservice;ssl=true;sslTrustStore=/etc/security/serverKeys/hivetruststore.jks;trustStorePassword=changeit" > --outputformat=tsv -e "drop table if exists test_table purge; > create external table test_table(id int, age int) row format delimited fields > terminated by '|' stored as textfile; > load data inpath '/tmp/table1.dat' overwrite into table test_table; > {code} > select count(*) query execution fails > {code} > beeline -n hrt_qa -p pwd -u > "jdbc:hive2://ctr-e138-1518143905142-375925-01-06.hwx.site:2181,ctr-e138-1518143905142-375925-01-05.hwx.site:2181,ctr-e138-1518143905142-375925-01-07.hwx.site:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@example.com;transportMode=http;httpPath=cliservice;ssl=true;sslTrustStore=/etc/security/serverKeys/hivetruststore.jks;trustStorePassword=changeit" > --outputformat=tsv -e "select count(*) from test_table where age>30 and > id<10100;" > 2018-06-22 10:22:29,328|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|SLF4J: Class path contains > multiple SLF4J bindings. > 2018-06-22 10:22:29,330|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|SLF4J: See > http://www.slf4j.org/codes.html#multiple_bindings for an explanation. 
> 2018-06-22 10:22:29,335|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|SLF4J: Actual binding is of > type [org.apache.logging.slf4j.Log4jLoggerFactory] > 2018-06-22 10:22:31,408|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|Format tsv is deprecated, > please use tsv2 > 2018-06-22 10:22:31,529|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|Connecting to > jdbc:hive2://ctr-e138-1518143905142-375925-01-06.hwx.site:2181,ctr-e138-1518143905142-375925-01-05.hwx.site:2181,ctr-e138-1518143905142-375925-01-07.hwx.site:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@example.com;transportMode=http;httpPath=cliservice;ssl=true;sslTrustStore=/etc/security/serverKeys/hivetruststore.jks;trustStorePassword=changeit > 2018-06-22 10:22:32,031|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|18/06/22 10:22:32 [main]: > INFO jdbc.HiveConnection: Connected to > ctr-e138-1518143905142-375925-01-04.hwx.site:10001 > 2018-06-22 10:22:34,130|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|18/06/22 10:22:34 [main]: > WARN jdbc.HiveConnection: Failed to connect to > ctr-e138-1518143905142-375925-01-04.hwx.site:10001 > 2018-06-22 10:22:34,244|INFO|Thread-126|machine.py:111 - > tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|18/06/22 10:22:34 [main]: > WARN jdbc.HiveConnection: Could not open client transport with JDBC Uri: > jdbc:hive2://ctr-e138-1518143905142-375925-01-04.hwx.site:10001/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@example.com;transportMode=http;httpPath=cliservice;ssl=true;sslTrustStore=/etc/security/serverKeys/hivetruststore.jks;trustStorePassword=changeit: > Failed to open new session: > org.apache.hadoop.hive.ql.metadata.HiveException: >
[jira] [Updated] (HIVE-20052) Arrow serde should fill ArrowColumnVector(Decimal) with the given schema precision/scale
[ https://issues.apache.org/jira/browse/HIVE-20052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-20052: -- Labels: pull-request-available (was: ) > Arrow serde should fill ArrowColumnVector(Decimal) with the given schema > precision/scale > > > Key: HIVE-20052 > URL: https://issues.apache.org/jira/browse/HIVE-20052 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Reporter: Teddy Choi >Assignee: Teddy Choi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-20052.patch > > > Arrow serde should fill ArrowColumnVector with given precision and scale. > When it serializes negative values into Arrow, it throws exceptions that the > precision of the value is not same with the precision of Arrow decimal vector. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20052) Arrow serde should fill ArrowColumnVector(Decimal) with the given schema precision/scale
[ https://issues.apache.org/jira/browse/HIVE-20052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16530163#comment-16530163 ] ASF GitHub Bot commented on HIVE-20052: --- GitHub user pudidic opened a pull request: https://github.com/apache/hive/pull/386 HIVE-20052: Arrow serde should fill ArrowColumnVector(Decimal) with t… …he given schema precision/scale (Teddy Choi) You can merge this pull request into a Git repository by running: $ git pull https://github.com/pudidic/hive HIVE-20052 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/386.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #386 commit 574485f81609820601bb20557de50143ec56a0d7 Author: Teddy Choi Date: 2018-07-02T16:41:40Z HIVE-20052: Arrow serde should fill ArrowColumnVector(Decimal) with the given schema precision/scale (Teddy Choi) > Arrow serde should fill ArrowColumnVector(Decimal) with the given schema > precision/scale > > > Key: HIVE-20052 > URL: https://issues.apache.org/jira/browse/HIVE-20052 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Reporter: Teddy Choi >Assignee: Teddy Choi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-20052.patch > > > Arrow serde should fill ArrowColumnVector with given precision and scale. > When it serializes negative values into Arrow, it throws exceptions that the > precision of the value is not same with the precision of Arrow decimal vector. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19340) Disable timeout of transactions opened by replication task at target cluster
[ https://issues.apache.org/jira/browse/HIVE-19340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16455740#comment-16455740 ] ASF GitHub Bot commented on HIVE-19340: --- GitHub user maheshk114 opened a pull request: https://github.com/apache/hive/pull/337 HIVE-19340 : Disable timeout of transactions opened by replication ta… The transactions opened by applying EVENT_OPEN_TXN should never be aborted automatically due to time-out. Aborting a transaction started by a replication task may lead to an inconsistent state at the target, which needs additional overhead to clean up. So, it is proposed to mark the transactions opened by a replication task as special ones that shouldn't be aborted if the heartbeat is lost. This helps ensure all ABORT and COMMIT events will always find the corresponding txn at the target to operate on. You can merge this pull request into a Git repository by running: $ git pull https://github.com/maheshk114/hive BUG-92700 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/337.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #337 commit 317d29c8455ad8aaccf1689c66d79f7bab41cde7 Author: Mahesh Kumar Behera Date: 2018-04-27T03:24:08Z HIVE-19340 : Disable timeout of transactions opened by replication task at target cluster > Disable timeout of transactions opened by replication task at target cluster > > > Key: HIVE-19340 > URL: https://issues.apache.org/jira/browse/HIVE-19340 > Project: Hive > Issue Type: Sub-task > Components: repl, Transactions >Affects Versions: 3.0.0 >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: ACID, DR, pull-request-available, replication > Fix For: 3.0.0 > > Attachments: HIVE-19340.01.patch > > > The transactions opened by applying EVENT_OPEN_TXN should never be aborted > automatically due to time-out.
Aborting a transaction started by a replication > task may lead to an inconsistent state at the target, which needs additional > overhead to clean up. So, it is proposed to mark the transactions opened by a > replication task as special ones that shouldn't be aborted if the heartbeat is > lost. This helps ensure all ABORT and COMMIT events will always find the > corresponding txn at the target to operate on. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
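The policy proposed above — exempt replication-opened transactions from the heartbeat reaper — can be sketched with a toy transaction model (illustrative only, not Hive's actual TxnHandler code or API):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: a heartbeat reaper that aborts transactions idle past the
// timeout, but skips transactions flagged as opened by a replication task, so the
// later COMMIT/ABORT replication events always find their txn still open.
public class TxnReaper {
    public static class Txn {
        public final long id;
        public final long lastHeartbeatMs;
        public final boolean openedByReplication;
        public Txn(long id, long lastHeartbeatMs, boolean openedByReplication) {
            this.id = id;
            this.lastHeartbeatMs = lastHeartbeatMs;
            this.openedByReplication = openedByReplication;
        }
    }

    /** Returns the ids of transactions that should be aborted for missing heartbeats. */
    public static List<Long> txnsToAbort(List<Txn> open, long nowMs, long timeoutMs) {
        List<Long> abort = new ArrayList<>();
        for (Txn t : open) {
            if (t.openedByReplication) {
                continue; // repl txns stay open until their COMMIT/ABORT event arrives
            }
            if (nowMs - t.lastHeartbeatMs > timeoutMs) {
                abort.add(t.id); // ordinary txn with a lost heartbeat
            }
        }
        return abort;
    }
}
```

With a 500 ms timeout at time 1000, an ordinary txn last seen at 0 is reaped, a replication txn last seen at 0 is kept, and an ordinary txn last seen at 900 is kept.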
[jira] [Commented] (HIVE-19130) NPE is thrown when REPL LOAD applied drop partition event.
[ https://issues.apache.org/jira/browse/HIVE-19130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463378#comment-16463378 ] ASF GitHub Bot commented on HIVE-19130: --- Github user sankarh closed the pull request at: https://github.com/apache/hive/pull/332 > NPE is thrown when REPL LOAD applied drop partition event. > -- > > Key: HIVE-19130 > URL: https://issues.apache.org/jira/browse/HIVE-19130 > Project: Hive > Issue Type: Bug > Components: HiveServer2, repl >Affects Versions: 3.0.0 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan >Priority: Major > Labels: DR, Replication, pull-request-available > Fix For: 3.0.0 > > Attachments: HIVE-19130.01.patch > > > During incremental replication, if we split the events batch as follows, then > the REPL LOAD on second batch throws NPE. > Batch-1: CREATE_TABLE(t1) -> ADD_PARTITION(t1.p1) -> DROP_PARTITION (t1.p1) > Batch-2: DROP_TABLE(t1) -> CREATE_TABLE(t1) -> ADD_PARTITION(t1.p1) -> > DROP_PARTITION (t1.p1) > {code} > 2018-04-05 16:20:36,531 ERROR [HiveServer2-Background-Pool: Thread-107044]: > metadata.Hive (Hive.java:getTable(1219)) - Table catalog_sales_new not found: > new5_tpcds_real_bin_partitioned_orc_1000.catalog_sales_new table not found > 2018-04-05 16:20:36,538 ERROR [HiveServer2-Background-Pool: Thread-107044]: > exec.DDLTask (DDLTask.java:failed(540)) - > org.apache.hadoop.hive.ql.metadata.HiveException > at > org.apache.hadoop.hive.ql.exec.DDLTask.dropPartitions(DDLTask.java:4016) > at > org.apache.hadoop.hive.ql.exec.DDLTask.dropTableOrPartitions(DDLTask.java:3983) > at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:341) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:162) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1765) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1506) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1303) 
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1170) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1165) > at > org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197) > at > org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76) > at > org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869) > at > org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:266) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.metadata.Hive.getPartitionsByExpr(Hive.java:2613) > at > org.apache.hadoop.hive.ql.exec.DDLTask.dropPartitions(DDLTask.java:4008) > ... 23 more > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-18988) Support bootstrap replication of ACID tables
[ https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463377#comment-16463377 ] ASF GitHub Bot commented on HIVE-18988: --- Github user sankarh closed the pull request at: https://github.com/apache/hive/pull/331 > Support bootstrap replication of ACID tables > > > Key: HIVE-18988 > URL: https://issues.apache.org/jira/browse/HIVE-18988 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, repl >Affects Versions: 3.0.0 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan >Priority: Major > Labels: ACID, DR, pull-request-available, replication > Fix For: 3.0.0, 3.1.0 > > Attachments: HIVE-18988.01-branch-3.patch, HIVE-18988.01.patch, > HIVE-18988.02.patch, HIVE-18988.03.patch, HIVE-18988.04.patch, > HIVE-18988.05.patch, HIVE-18988.06.patch, HIVE-18988.07.patch > > > Bootstrapping of ACID tables needs special handling to replicate a stable > state of data. > - If the ACID feature is enabled, then perform the bootstrap dump for ACID tables > within a read txn. > -> Dump table/partition metadata. > -> Get the list of valid data files for a table using the same logic as a read txn > does. > -> Dump the latest ValidWriteIdList as per the current read txn. > - Set the valid last replication state such that it doesn't miss any open > txn started after triggering the bootstrap dump. > - If any on-going txns were opened before triggering the bootstrap dump, > then it is not guaranteed that an open_txn event was captured for them. > Also, if these txns were opened for a streaming ingest case, then the dumped ACID > table data may include data of open txns, which impacts snapshot isolation at the > target. To avoid that, the bootstrap dump should wait for a timeout (new > configuration: hive.repl.bootstrap.dump.open.txn.timeout). After the timeout, > just force-abort those txns and continue. > - If any force-aborted txns belong to a streaming ingest case, then the dumped > ACID table data may have aborted data too.
So, it is necessary to replicate > the aborted write ids to the target to mark that data invalid for any readers. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
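The wait-then-abort step described above boils down to one decision after the `hive.repl.bootstrap.dump.open.txn.timeout` expires: any transaction that was open before the dump was triggered and is still open gets force-aborted. A minimal sketch of that selection (illustrative names, not the real Hive code):

```java
import java.util.Set;
import java.util.TreeSet;

// Illustrative sketch: after the bootstrap-dump wait times out, force-abort
// exactly those txns that predate the dump trigger and never closed, so the
// dump sees a stable snapshot. Their write ids would then be replicated as
// aborted so readers at the target treat that data as invalid.
public class BootstrapDumpGate {
    public static Set<Long> forceAbortAfterTimeout(Set<Long> openBeforeDump, Set<Long> stillOpen) {
        Set<Long> toAbort = new TreeSet<>(openBeforeDump);
        toAbort.retainAll(stillOpen); // intersection: opened before dump AND still open
        return toAbort;
    }
}
```

Txns opened after the dump trigger are deliberately excluded: their open_txn events are guaranteed to be captured by incremental replication.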
[jira] [Commented] (HIVE-18864) ValidWriteIdList snapshot seems incorrect if obtained after allocating writeId by current transaction.
[ https://issues.apache.org/jira/browse/HIVE-18864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463376#comment-16463376 ] ASF GitHub Bot commented on HIVE-18864: --- Github user sankarh closed the pull request at: https://github.com/apache/hive/pull/316 > ValidWriteIdList snapshot seems incorrect if obtained after allocating > writeId by current transaction. > -- > > Key: HIVE-18864 > URL: https://issues.apache.org/jira/browse/HIVE-18864 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan >Priority: Major > Labels: ACID, pull-request-available > Fix For: 3.0.0 > > Attachments: HIVE-18864.01.patch, HIVE-18864.02.patch > > > For multi-statement txns, it is possible that a write on a table happens after > a read. Let's see the below scenario. > # Committed txn=9 writes on table T1 with writeId=5. > # Open txn=10. ValidTxnList(open:null, txn_HWM=10). > # Read table T1 from txn=10. ValidWriteIdList(open:null, write_HWM=5). > # Open txn=11, which writes on table T1 with writeId=6. > # Read table T1 from txn=10. ValidWriteIdList(open:null, write_HWM=5). > # Write table T1 from txn=10 with writeId=7. > # Read table T1 from txn=10. *ValidWriteIdList(open:null, > write_HWM=7)*. – This read will be able to see rows added by txn=11, which is > still open. > So, the open/aborted list of > ValidWriteIdList needs to be rebuilt based on txn_HWM. Any writeId allocated by txnId > txn_HWM > should be marked as open. In this example, *ValidWriteIdList(open:6, > write_HWM=7)* should be generated. > cc [~ekoifman], [~thejas] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
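The proposed rebuild rule above is simple enough to state in a few lines of code: any write id allocated by a txn above the snapshot's txn high watermark must be marked open. A sketch under an illustrative model (a plain `writeId -> txnId` map, not Hive's actual ValidWriteIdList implementation):

```java
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Illustrative sketch of the rebuild rule: given the txn high watermark of the
// reader's snapshot and the allocating txn of each write id, every write id
// allocated by a txn the snapshot cannot see (txnId > txn_HWM) is open.
public class WriteIdSnapshot {
    public static SortedSet<Long> openWriteIds(long txnHwm, Map<Long, Long> writeIdToTxnId) {
        SortedSet<Long> open = new TreeSet<>();
        for (Map.Entry<Long, Long> e : writeIdToTxnId.entrySet()) {
            if (e.getValue() > txnHwm) {
                open.add(e.getKey()); // allocated by a txn this snapshot cannot see
            }
        }
        return open;
    }
}
```

Replaying the scenario from the ticket: with txn_HWM=10 and write ids 5, 6, 7 allocated by txns 9, 11, 10 respectively, only writeId=6 is marked open — matching the expected ValidWriteIdList(open:6, write_HWM=7).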
[jira] [Updated] (HIVE-16480) ORC file with empty array and array fails to read
[ https://issues.apache.org/jira/browse/HIVE-16480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-16480: -- Labels: pull-request-available (was: ) > ORC file with empty array and array fails to read > > > Key: HIVE-16480 > URL: https://issues.apache.org/jira/browse/HIVE-16480 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.1 >Reporter: David Capwell >Assignee: Owen O'Malley > Labels: pull-request-available > > We have a schema that has a array in it. We were unable to read this > file and digging into ORC it seems that the issue is when the array is empty. > Here is the stack trace > {code:title=EmptyList.log|borderStyle=solid} > ERROR 2017-04-19 09:29:17,075 [main] [EmptyList] [line 56] Failed to work > with type float > java.io.IOException: Error reading file: > /var/folders/t8/t5x1031d7mn17f6xpwnkkv_4gn/T/1492619355819-0/file-float.orc > at > org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1052) > ~[hive-orc-2.1.1.jar:2.1.1] > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.nextBatch(RecordReaderImpl.java:135) > ~[hive-exec-2.1.1.jar:2.1.1] > at EmptyList.emptyList(EmptyList.java:49) ~[test-classes/:na] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ~[na:1.8.0_121] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[na:1.8.0_121] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[na:1.8.0_121] > at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_121] > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > [junit-4.12.jar:4.12] > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > [junit-4.12.jar:4.12] > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > [junit-4.12.jar:4.12] > at > 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > [junit-4.12.jar:4.12] > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > [junit-4.12.jar:4.12] > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > [junit-4.12.jar:4.12] > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > [junit-4.12.jar:4.12] > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > [junit-4.12.jar:4.12] > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > [junit-4.12.jar:4.12] > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > [junit-4.12.jar:4.12] > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > [junit-4.12.jar:4.12] > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > [junit-4.12.jar:4.12] > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > [junit-4.12.jar:4.12] > at org.junit.runner.JUnitCore.run(JUnitCore.java:137) [junit-4.12.jar:4.12] > at > com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68) > [junit-rt.jar:na] > at > com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51) > [junit-rt.jar:na] > at > com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:237) > [junit-rt.jar:na] > at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70) > [junit-rt.jar:na] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ~[na:1.8.0_121] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[na:1.8.0_121] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[na:1.8.0_121] > at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_121] > at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147) > [idea_rt.jar:na] > Caused by: 
java.io.EOFException: Read past EOF for compressed stream Stream > for column 1 kind DATA position: 0 length: 0 range: 0 offset: 0 limit: 0 > at > org.apache.orc.impl.SerializationUtils.readFully(SerializationUtils.java:118) > ~[hive-orc-2.1.1.jar:2.1.1] > at > org.apache.orc.impl.SerializationUtils.readFloat(SerializationUtils.java:78) > ~[hive-orc-2.1.1.jar:2.1.1] > at > org.apache.orc.impl.TreeReaderFactory$FloatTreeReader.nextVector(TreeReaderFactory.java:619) > ~[hive-orc-2.1.1.jar:2.1.1] > at > org.apache.orc.impl.TreeReaderFactory$ListTreeReader.nextVector(TreeReaderFactory.java:1902) > ~[hive-orc-2.1.1.jar:2.1.1] > at > org.apache.orc.impl.TreeReaderFactory$TreeReader.nextBatch(TreeReaderFactory.java:154) >
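The stack trace above shows `FloatTreeReader.nextVector` reading past EOF on a zero-length DATA stream: an empty list still triggers a read of its element column, whose stream contains nothing. The class of fix (as the linked PR's title suggests, empty batches should not touch the stream) can be sketched with an illustrative fixed-width reader — this is not the actual TreeReaderFactory code, and the byte layout here is only an example:

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Illustrative sketch: a fixed-width float column reader that returns early for
// an empty batch. Without the count == 0 guard, reading from a zero-length
// stream would raise EOF even though no values are actually needed.
public class FloatColumnReader {
    public static float[] readFloats(InputStream in, int count) {
        float[] out = new float[count];
        if (count == 0) {
            return out; // empty batch: never touch the (possibly empty) stream
        }
        try {
            byte[] buf = new byte[4];
            for (int i = 0; i < count; i++) {
                int off = 0;
                while (off < 4) {
                    int n = in.read(buf, off, 4 - off);
                    if (n < 0) {
                        throw new EOFException("Read past EOF at value " + i);
                    }
                    off += n;
                }
                // assemble a little-endian 32-bit float (layout chosen for illustration)
                int bits = ((buf[3] & 0xff) << 24) | ((buf[2] & 0xff) << 16)
                         | ((buf[1] & 0xff) << 8) | (buf[0] & 0xff);
                out[i] = Float.intBitsToFloat(bits);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```

Asking this reader for zero floats from an empty stream succeeds, whereas a version without the guard would hit the same `Read past EOF` failure as the trace above.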
[jira] [Commented] (HIVE-16480) ORC file with empty array and array fails to read
[ https://issues.apache.org/jira/browse/HIVE-16480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16304707#comment-16304707 ] ASF GitHub Bot commented on HIVE-16480: --- GitHub user omalley opened a pull request: https://github.com/apache/hive/pull/285 HIVE-16480 (ORC-285) Empty vector batches of floats or doubles gets EOFException. Signed-off-by: Owen O'Malley You can merge this pull request into a Git repository by running: $ git pull https://github.com/omalley/hive hive-16480-2.1 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/285.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #285 commit 43d7fe2f0fc9baeb311814da1f7a65cfd546145b Author: Owen O'Malley Date: 2017-12-27T17:45:25Z HIVE-16480 (ORC-285) Empty vector batches of floats or doubles gets EOFException. Signed-off-by: Owen O'Malley > ORC file with empty array and array fails to read > > > Key: HIVE-16480 > URL: https://issues.apache.org/jira/browse/HIVE-16480 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.1 >Reporter: David Capwell >Assignee: Owen O'Malley > Labels: pull-request-available > > We have a schema that has an array in it. We were unable to read this > file, and digging into ORC it seems that the issue is when the array is empty.
> Here is the stack trace > {code:title=EmptyList.log|borderStyle=solid} > ERROR 2017-04-19 09:29:17,075 [main] [EmptyList] [line 56] Failed to work > with type float > java.io.IOException: Error reading file: > /var/folders/t8/t5x1031d7mn17f6xpwnkkv_4gn/T/1492619355819-0/file-float.orc > at > org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1052) > ~[hive-orc-2.1.1.jar:2.1.1] > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.nextBatch(RecordReaderImpl.java:135) > ~[hive-exec-2.1.1.jar:2.1.1] > at EmptyList.emptyList(EmptyList.java:49) ~[test-classes/:na] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ~[na:1.8.0_121] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[na:1.8.0_121] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[na:1.8.0_121] > at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_121] > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > [junit-4.12.jar:4.12] > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > [junit-4.12.jar:4.12] > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > [junit-4.12.jar:4.12] > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > [junit-4.12.jar:4.12] > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > [junit-4.12.jar:4.12] > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > [junit-4.12.jar:4.12] > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > [junit-4.12.jar:4.12] > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > [junit-4.12.jar:4.12] > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > [junit-4.12.jar:4.12] > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > 
[junit-4.12.jar:4.12] > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > [junit-4.12.jar:4.12] > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > [junit-4.12.jar:4.12] > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > [junit-4.12.jar:4.12] > at org.junit.runner.JUnitCore.run(JUnitCore.java:137) [junit-4.12.jar:4.12] > at > com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68) > [junit-rt.jar:na] > at > com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51) > [junit-rt.jar:na] > at > com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:237) > [junit-rt.jar:na] > at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70) > [junit-rt.jar:na] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ~[na:1.8.0_121] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[na:1.8.0_121] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[na:1.8.0_121] > at
[jira] [Commented] (HIVE-17580) Remove dependency of get_fields_with_environment_context API to serde
[ https://issues.apache.org/jira/browse/HIVE-17580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16314309#comment-16314309 ] ASF GitHub Bot commented on HIVE-17580: --- GitHub user vihangk1 opened a pull request: https://github.com/apache/hive/pull/287 HIVE-17580 : Remove standalone-metastore's dependency with serdes Removing the dependency on serdes for the metastore requires a series of changes. I have created multiple commits which hopefully will be easier to review. Each major commit has a descriptive commit message to give a high-level idea of what the change is doing. There are still some bits which need to be completed, but it would be good to get a review. Overview of all the changes done: 1. Creates a new module called serde-api under storage-api, as discussed. Although I think we can keep it separate as well. 2. Moved List, Map, Struct, Constant, Primitive, Union ObjectInspectors to serde-api. 3. Moved PrimitiveTypeInfo, PrimitiveTypeEntry and TypeInfo to serde-api. 4. Moved TypeInfoParser, TypeInfoFactory to serde-api. 5. Added a new class which reads the avro storage schema by copying the code from AvroSerde and AvroSerdeUtils. The parsing is done such that the String value is first converted into TypeInfos and then into FieldSchemas, bypassing the need for ObjectInspectors. In theory we could get rid of TypeInfos as well, but that path was getting too difficult with a lot of duplicate code between Hive and the metastore. 6. Introduces a default storage schema reader. I noticed that most of the serdes use the same logic to parse the metadata information. This code should be refactored to a common place instead of having many copies (one in standalone hms and another set in multiple serdes). 7. Moved HiveChar, HiveVarchar, HiveCharWritable, HiveVarcharWritable to storage-api. I noticed that HiveDecimal is already in storage-api.
It probably makes sense to move the other primitive types (timestamp, interval etc)to storage-api as well but it requires storage-api to be upgraded to Java 8. 8. Adds a basic test for the schema reader. I plan to add more tests as this code is reviewed. You can merge this pull request into a Git repository by running: $ git pull https://github.com/vihangk1/hive vihangk1_HIVE-17580 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/287.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #287 commit bbfb7dc44904db74a840167c02b07f50a6010b69 Author: Vihang KarajgaonkarDate: 2017-11-09T00:52:39Z HIVE-17580 : Remove dependency of get_fields_with_environment_context API to serde commit d54879845eff10c19bc17bda9e09dda16f6fa295 Author: Vihang Karajgaonkar Date: 2017-11-29T00:54:23Z Moved List, Map, Struct OI to storage-api commit a12d6c7ba3de598c6b6f75da1bd4efcac43036b1 Author: Vihang Karajgaonkar Date: 2017-11-29T04:19:39Z Moved ConstantObjectInspector PrimitiveObjectInspector and UnionObjectInspector commit 13fb832fc2d51958e75d5e609f6781f87449aed8 Author: Vihang Karajgaonkar Date: 2017-12-28T01:25:59Z Moved PrimitiveTypeInfo to serde-api In order to move PrimitiveTypeInfo we need to move the PrimitiveTypeEntry as well. PrimitiveTypeEntry depends on PrimitiveObjectInspectorUtils which cannot be pulled into serde-api. Hence the static final maps are moved to PrimitiveEntry and we provide static access methods to these maps along with the register method to add the key value pairs in the maps commit 9cbc789fd3f4ced7ce66a7313c451b75a154976f Author: Vihang Karajgaonkar Date: 2017-12-28T20:51:39Z Moved the other TypeInfos to serde-api In order to move the other TypeInfo classes to serde-api we need to move the serdeConstants.java as well. This is a thrift generated class. This commit copies the serde.thrift instead of moving. 
The only reason I did not move it is in case of backwards compatibility reasons (in case someone is using the thrift file location to do something). If it is okay to move serde.thrift from serde module to serde-api we can delete it from serde module in a separate change. The other concern is there are some TypeInfo classes which do some validation like VarCharTypeInfo, DecimalTypeInfo. The validating methods use the actual type implementation like HiveChar, HiveDecimal etc to ensure that the params are under the correct limits. This creates a problem since we cannot bring in the type implementations as well to serde-api. Currently, I have marked these as TODO and commented them out. commit 23aa899a90d17648e560d39602a3bd29bf53661e Author: Vihang Karajgaonkar Date:
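The "String value -> TypeInfos -> FieldSchemas, bypassing ObjectInspectors" path described in the PR can be illustrated with a deliberately tiny parser. This is a hypothetical sketch, not Hive's TypeInfoParser (which handles nesting, quoting, and parameterized types): it only splits a flat struct type string into ordered (name, type) field pairs, the shape a FieldSchema list needs.

```java
import java.util.LinkedHashMap;

// Hypothetical sketch of the type-string -> field-list idea: parse a flat
// "struct<name:type,...>" string directly into ordered field pairs without
// going through ObjectInspectors. Real parsing must also handle nested and
// parameterized types (map<...>, decimal(p,s), etc.).
public class FlatStructParser {
    public static LinkedHashMap<String, String> parse(String typeString) {
        if (!typeString.startsWith("struct<") || !typeString.endsWith(">")) {
            throw new IllegalArgumentException("expected struct<...>: " + typeString);
        }
        LinkedHashMap<String, String> fields = new LinkedHashMap<>();
        String body = typeString.substring("struct<".length(), typeString.length() - 1);
        for (String part : body.split(",")) {
            String[] nameAndType = part.split(":", 2); // field name, then its type
            fields.put(nameAndType[0].trim(), nameAndType[1].trim());
        }
        return fields;
    }
}
```

For example, `parse("struct<name:string,age:int>")` yields the ordered fields name→string, age→int.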
[jira] [Updated] (HIVE-17580) Remove dependency of get_fields_with_environment_context API to serde
[ https://issues.apache.org/jira/browse/HIVE-17580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-17580: -- Labels: pull-request-available (was: ) > Remove dependency of get_fields_with_environment_context API to serde > - > > Key: HIVE-17580 > URL: https://issues.apache.org/jira/browse/HIVE-17580 > Project: Hive > Issue Type: Sub-task > Components: Standalone Metastore >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Labels: pull-request-available > > The {{get_fields_with_environment_context}} metastore API uses the {{Deserializer}} > class to access the field metadata for the cases where it is stored along > with the data files (Avro tables). The problem is that the Deserializer class is > defined in the hive-serde module, and in order to make the metastore independent of > Hive we will have to remove this dependency (at least we should change it to a > runtime dependency instead of a compile-time one). > The other option is to investigate whether we can use SearchArgument to provide this > functionality. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18338) [Client, JDBC] Asynchronous interface through hive JDBC.
[ https://issues.apache.org/jira/browse/HIVE-18338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-18338: -- Labels: pull-request-available (was: ) > [Client, JDBC] Asynchronous interface through hive JDBC. > > > Key: HIVE-18338 > URL: https://issues.apache.org/jira/browse/HIVE-18338 > Project: Hive > Issue Type: Improvement > Components: Clients, JDBC >Affects Versions: 2.3.2 >Reporter: Amruth S >Assignee: Amruth S >Priority: Minor > Labels: pull-request-available > > A lot of users are struggling and rewriting a lot of boilerplate over Thrift > to get pure asynchronous capability. > The idea is to expose the operation handle, so that clients can persist it and > later latch on to the same execution. > Let me know your ideas around this. We have already solved this at our org by > tweaking HiveStatement.java. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-18338) [Client, JDBC] Asynchronous interface through hive JDBC.
[ https://issues.apache.org/jira/browse/HIVE-18338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16303270#comment-16303270 ] ASF GitHub Bot commented on HIVE-18338: --- GitHub user amrk7s opened a pull request: https://github.com/apache/hive/pull/284 HIVE-18338 Exposing asynchronous execution through hive-jdbc client **Problem statement** Hive JDBC currently exposes 2 methods related to asynchronous execution **executeAsync()** - to trigger a query execution and return immediately. **waitForOperationToComplete()** - which waits till the current execution is complete **blocking the user thread**. This has one problem - If the client process goes down, there is no way to resume queries although hive server is completely asynchronous. **Proposal** If operation handle could be exposed, we can latch on to an active execution of a query. **Code changes** Operation handle is exposed. So client can keep a copy. latchSync() and latchAsync() methods take an operation handle and try to latch on to the current execution in hive server if present You can merge this pull request into a Git repository by running: $ git pull https://github.com/Flipkart/hive async_jdbc Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/284.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #284 commit 9afa0ae9a9e2c38be3fbdadab230fdd399ab8e5b Author: amrk7sDate: 2017-12-25T13:20:22Z HIVE-18338 Exposing asynchronous execution through hive-jdbc client > [Client, JDBC] Asynchronous interface through hive JDBC. 
> > > Key: HIVE-18338 > URL: https://issues.apache.org/jira/browse/HIVE-18338 > Project: Hive > Issue Type: Improvement > Components: Clients, JDBC >Affects Versions: 2.3.2 >Reporter: Amruth S >Assignee: Amruth S >Priority: Minor > Labels: pull-request-available > > A lot of users are struggling and rewriting a lot of boilerplate over Thrift > to get pure asynchronous capability. > The idea is to expose the operation handle, so that clients can persist it and > later latch on to the same execution. > Let me know your ideas around this. We have already solved this at our org by > tweaking HiveStatement.java. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
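The handle-based resume model proposed in the PR can be sketched with a small in-process registry. This is a hypothetical illustration of the API shape (`executeAsync` returning an opaque handle, `latchSync` re-attaching to the running execution), not the actual HiveStatement change:

```java
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.UUID;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: executeAsync submits work and returns a handle the
// caller can persist; latchSync(handle) re-attaches to the in-flight execution
// and blocks until it completes, even from a different caller thread.
public class AsyncQueryRegistry {
    private final ExecutorService pool = Executors.newCachedThreadPool();
    private final Map<String, Future<String>> running = new ConcurrentHashMap<>();

    /** Submit a query and return an opaque handle the caller can persist. */
    public String executeAsync(Callable<String> query) {
        String handle = UUID.randomUUID().toString();
        running.put(handle, pool.submit(query));
        return handle;
    }

    /** Latch on to an in-flight execution by handle and wait for its result. */
    public String latchSync(String handle) {
        Future<String> f = running.get(handle);
        if (f == null) {
            throw new NoSuchElementException("unknown operation handle: " + handle);
        }
        try {
            return f.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("execution failed for handle " + handle, e);
        }
    }

    public void shutdown() {
        pool.shutdown();
    }
}
```

In the real server-side case the registry would be keyed by the Thrift operation handle, which survives client restarts; the in-memory map here only illustrates the latch semantics.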
[jira] [Commented] (HIVE-18341) Add repl load support for adding "raw" namespace for TDE with same encryption keys
[ https://issues.apache.org/jira/browse/HIVE-18341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16321674#comment-16321674 ] ASF GitHub Bot commented on HIVE-18341: --- GitHub user anishek opened a pull request: https://github.com/apache/hive/pull/289 HIVE-18341: Add repl load support for adding "raw" namespace for TDE with same encryption keys You can merge this pull request into a Git repository by running: $ git pull https://github.com/anishek/hive HIVE-18341 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/289.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #289 commit 14b92575fdc97434ec65ad0ce1c54c5f352a992c Author: Anishek AgarwalDate: 2017-12-26T14:11:39Z HIVE-18341: Add repl load support for adding "raw" namespace for TDE with same encryption keys > Add repl load support for adding "raw" namespace for TDE with same encryption > keys > -- > > Key: HIVE-18341 > URL: https://issues.apache.org/jira/browse/HIVE-18341 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: anishek >Assignee: anishek > Labels: pull-request-available > Fix For: 3.0.0 > > Attachments: HIVE-18341.0.patch, HIVE-18341.1.patch > > > https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/TransparentEncryption.html#Running_as_the_superuser > "a new virtual path prefix, /.reserved/raw/, that gives superusers direct > access to the underlying block data in the filesystem. This allows superusers > to distcp data without needing having access to encryption keys, and also > avoids the overhead of decrypting and re-encrypting data." > We need to introduce a new option in "Repl Load" command that will change the > files being copied in distcp to have this "/.reserved/raw/" namespace before > the file paths. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18341) Add repl load support for adding "raw" namespace for TDE with same encryption keys
[ https://issues.apache.org/jira/browse/HIVE-18341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-18341: -- Labels: pull-request-available (was: ) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
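The change described in HIVE-18341 amounts to prefixing each file path handed to distcp with the HDFS "/.reserved/raw" namespace so the encrypted block data is copied as-is. A minimal, hypothetical sketch of such a path rewrite (not the actual Hive implementation, which works with org.apache.hadoop.fs.Path; this uses java.net.URI to stay self-contained):

```java
import java.net.URI;

public class RawPathUtil {
    private static final String RAW_PREFIX = "/.reserved/raw";

    // Prepend the HDFS raw-namespace prefix to a path so that superusers
    // can distcp the underlying encrypted block data without decryption.
    public static String toRawPath(String path) {
        if (path.startsWith(RAW_PREFIX)) {
            return path; // already in the raw namespace
        }
        URI uri = URI.create(path);
        if (uri.getScheme() == null) {
            return RAW_PREFIX + path; // plain absolute path
        }
        // hdfs://nn:8020/warehouse/t -> hdfs://nn:8020/.reserved/raw/warehouse/t
        String authority = uri.getAuthority() == null ? "" : "//" + uri.getAuthority();
        return uri.getScheme() + ":" + authority + RAW_PREFIX + uri.getPath();
    }
}
```

The prefix must be inserted after the scheme and authority, not blindly prepended, which is why the sketch goes through URI at all.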
[jira] [Commented] (HIVE-18352) introduce a METADATAONLY option while doing REPL DUMP to allow integrations of other tools
[ https://issues.apache.org/jira/browse/HIVE-18352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16312651#comment-16312651 ] ASF GitHub Bot commented on HIVE-18352: --- GitHub user anishek opened a pull request: https://github.com/apache/hive/pull/286 HIVE-18352: introduce a METADATAONLY option while doing REPL DUMP to allow integrations of other tools You can merge this pull request into a Git repository by running: $ git pull https://github.com/anishek/hive HIVE-18352 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/286.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #286 commit c03814bd857cfd70a40aa7a5ec674e73cbfc63f9 Author: Anishek Agarwal Date: 2018-01-03T10:27:04Z HIVE-18352: introduce a METADATAONLY option while doing REPL DUMP to allow integrations of other tools > introduce a METADATAONLY option while doing REPL DUMP to allow integrations > of other tools > --- > > Key: HIVE-18352 > URL: https://issues.apache.org/jira/browse/HIVE-18352 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: anishek >Assignee: anishek > Labels: pull-request-available > Fix For: 3.0.0 > > Attachments: HIVE-18352.0.patch > > > * Introduce a METADATAONLY option as part of the REPL DUMP command which will > only dump events for DDL changes; this will be faster, as we won't > need to scan files on HDFS for DML changes. > * Additionally, since we are only going to dump metadata operations, it might > be useful to include ACID tables via an option as well. 
This option > can be removed when ACID support is complete via HIVE-18320. > It would be good to support the "WITH" clause as part of the REPL DUMP command as > well (REPL DUMP already supports it via HIVE-17757) to achieve the above, as > that will require fewer changes to the syntax of the statement and provide > more flexibility in the future to include additional options. > {code} > REPL DUMP [db_name] {FROM [event_id]} {TO [event_id]} {WITH > (['key'='value'],...)} > {code} > This will enable other tools like security / schema registry / metadata > discovery to use the replication subsystem for their needs as well. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18352) introduce a METADATAONLY option while doing REPL DUMP to allow integrations of other tools
[ https://issues.apache.org/jira/browse/HIVE-18352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-18352: -- Labels: pull-request-available (was: ) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
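The WITH clause in the REPL DUMP syntax above carries a list of quoted 'key'='value' options. A small, hypothetical sketch of parsing that option list into a map (the regex and class name are illustrative; this is not Hive's actual parser, and any config key used with it is an assumption):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplDumpOptions {
    // Matches one 'key'='value' pair inside the WITH clause body.
    private static final Pattern PAIR = Pattern.compile("'([^']+)'\\s*=\\s*'([^']*)'");

    // Extract key/value options from a WITH clause body such as
    // "'key1'='true', 'key2'='x'". Insertion order is preserved.
    public static Map<String, String> parseWith(String withBody) {
        Map<String, String> opts = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(withBody);
        while (m.find()) {
            opts.put(m.group(1), m.group(2));
        }
        return opts;
    }
}
```

A key/value option list like this is what lets new options (such as METADATAONLY behavior) be added later without changing the statement grammar.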
[jira] [Commented] (HIVE-18423) Hive should support usage of external tables using jdbc
[ https://issues.apache.org/jira/browse/HIVE-18423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16320480#comment-16320480 ] ASF GitHub Bot commented on HIVE-18423: --- GitHub user msydoron opened a pull request: https://github.com/apache/hive/pull/288 HIVE-18423 Added full support for jdbc external tables in hive. You can merge this pull request into a Git repository by running: $ git pull https://github.com/shmushkis/hive master_yoni Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/288.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #288 commit 4e5dbb01f8d509bc18a82943c27ec62691baa0f4 Author: msydoronDate: 2017-12-19T09:02:57Z Integrate jethro jdbc updates into master_yoni commit b18008e5bea82528bbd269da129831b3c333ff9c Author: msydoron Date: 2017-12-19T11:21:15Z jethro integration added missing file HiveRelColumnsAlignment.java commit 960f5bbe6bfcbfc9447d6b386a184827edb74a03 Author: msydoron Date: 2017-12-19T13:01:27Z Updated MyRules to use the jdbc convention from the converter. commit 0a290845600a19692f8de8bbb99f344a49633a38 Author: msydoron Date: 2017-12-20T14:21:59Z Added support for hive quering through jdbc for all jethro types. 
Removed dead code and refactor commit 52317dba6ec78f825db789ea976f91de864dbb1e Author: msydoron Date: 2017-12-21T11:57:54Z Fixed count(*) for HiveSqlCountAggFunction Fixed MySortRule Fixed addLimitToQuery() for sorted queries commit b4a6c87cfa3aba0513d11a21f4e00ed55d02b3be Author: msydoron Date: 2017-12-24T16:20:48Z Invoke the jdbc rules after calcite invokes its rules commit 3a2d1e5af73ba1147cf3f697d6e3c27f3f9ce262 Author: msydoron Date: 2017-12-31T15:04:34Z Added proto support for jethro 'show functions' commit cd59a56d03eb816a51057643654cba04036d9289 Author: msydoron Date: 2018-01-02T13:16:18Z Fixed some issues raised by 'show functions' support commit 4637c02f8708a468ac9b93c52e04b24bf478b590 Author: msydoron Date: 2018-01-02T16:20:40Z Initialize the dialect where it generated commit 23ab2538312f414838245130ae8a5d61fe35844a Author: msydoron Date: 2018-01-08T11:08:24Z Code refactor > Hive should support usage of external tables using jdbc > --- > > Key: HIVE-18423 > URL: https://issues.apache.org/jira/browse/HIVE-18423 > Project: Hive > Issue Type: Improvement >Reporter: Jonathan Doron >Assignee: Jonathan Doron > Labels: pull-request-available > Fix For: 3.0.0 > > > Hive should support the usage of external jdbc tables (and not only external > tables that hold queries), so a Hive user would be able to use the external > table as a Hive internal table. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18423) Hive should support usage of external tables using jdbc
[ https://issues.apache.org/jira/browse/HIVE-18423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-18423: -- Labels: pull-request-available (was: ) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17982) Move metastore specific itests
[ https://issues.apache.org/jira/browse/HIVE-17982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16327860#comment-16327860 ] ASF GitHub Bot commented on HIVE-17982: --- Github user asfgit closed the pull request at: https://github.com/apache/hive/pull/279 > Move metastore specific itests > -- > > Key: HIVE-17982 > URL: https://issues.apache.org/jira/browse/HIVE-17982 > Project: Hive > Issue Type: Sub-task > Components: Standalone Metastore >Reporter: Alan Gates >Assignee: Alan Gates >Priority: Major > Labels: pull-request-available > Fix For: 3.0.0 > > Attachments: HIVE-17982.2.patch, HIVE-17982.patch > > > There are a number of tests in itests/hive-unit/.../metastore that are > metastore specific. I suspect they were initially placed in itests only > because the metastore pulls in a few plugins from ql. > Given that we need to be able to release the metastore separately, we need to > be able to test it completely as a standalone entity. So I propose to move a > number of the itests over into standalone-metastore. I will only move tests > that are isolated to the metastore. Anything that tests wider functionality > I plan to leave in itests. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-17983) Make the standalone metastore generate tarballs etc.
[ https://issues.apache.org/jira/browse/HIVE-17983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16327915#comment-16327915 ] ASF GitHub Bot commented on HIVE-17983: --- GitHub user alanfgates opened a pull request: https://github.com/apache/hive/pull/291 HIVE-17983 Make the standalone metastore generate tarballs etc. See JIRA for full comments. You can merge this pull request into a Git repository by running: $ git pull https://github.com/alanfgates/hive hive17983 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/291.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #291 commit 1ba9b62d9ef488355e1a97dbc7237c1472349a24 Author: Alan GatesDate: 2017-10-19T23:49:38Z HIVE-17983 Make the standalone metastore generate tarballs etc. > Make the standalone metastore generate tarballs etc. > > > Key: HIVE-17983 > URL: https://issues.apache.org/jira/browse/HIVE-17983 > Project: Hive > Issue Type: Sub-task > Components: Standalone Metastore >Reporter: Alan Gates >Assignee: Alan Gates >Priority: Major > Labels: pull-request-available > > In order to be separately installable the standalone metastore needs its own > tarballs, startup scripts, etc. All of the SQL installation and upgrade > scripts also need to move from metastore to standalone-metastore. > I also plan to create Dockerfiles for different database types so that > developers can test the SQL installation and upgrade scripts. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-17983) Make the standalone metastore generate tarballs etc.
[ https://issues.apache.org/jira/browse/HIVE-17983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-17983: -- Labels: pull-request-available (was: ) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-17331) Path must be used as key type of the pathToAliases
[ https://issues.apache.org/jira/browse/HIVE-17331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16327999#comment-16327999 ] ASF GitHub Bot commented on HIVE-17331: --- GitHub user dosoft opened a pull request: https://github.com/apache/hive/pull/292 HIVE-17331: Use Path instead of String as key type of the pathToAliases You can merge this pull request into a Git repository by running: $ git pull https://github.com/dosoft/hive HIVE-17331 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/292.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #292 commit 661897e01fe48a0f80c3ee4e9168667b7d926ba9 Author: Oleg Danilov Date: 2017-08-16T10:34:39Z HIVE-17331: Use Path instead of String as key type of the pathToAliases > Path must be used as key type of the pathToAliases > - > > Key: HIVE-17331 > URL: https://issues.apache.org/jira/browse/HIVE-17331 > Project: Hive > Issue Type: Bug >Reporter: Oleg Danilov >Assignee: Oleg Danilov >Priority: Minor > Labels: pull-request-available > Attachments: HIVE-17331.patch > > > This code uses String instead of Path as the key type of the pathToAliases map, > so it seems get(String) always returns null. 
> +*GenMapRedUtils.java*+ > {code:java} > for (int pos = 0; pos < size; pos++) { > String taskTmpDir = taskTmpDirLst.get(pos); > TableDesc tt_desc = tt_descLst.get(pos); > MapWork mWork = plan.getMapWork(); > if (mWork.getPathToAliases().get(taskTmpDir) == null) { > taskTmpDir = taskTmpDir.intern(); > Path taskTmpDirPath = > StringInternUtils.internUriStringsInPath(new Path(taskTmpDir)); > mWork.removePathToAlias(taskTmpDirPath); > mWork.addPathToAlias(taskTmpDirPath, taskTmpDir); > mWork.addPathToPartitionInfo(taskTmpDirPath, new > PartitionDesc(tt_desc, null)); > mWork.getAliasToWork().put(taskTmpDir, topOperators.get(pos)); > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
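The bug HIVE-17331 describes comes down to a heterogeneous map lookup: Map.get accepts any Object, so calling it with a String key on a map keyed by Path compiles fine but can never match, and silently returns null. A small illustration using java.nio.file.Path as a stand-in for Hadoop's org.apache.hadoop.fs.Path (class and method names here are hypothetical, not Hive's):

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PathKeyLookup {
    // pathToAliases keyed by Path, as in MapWork.
    private final Map<Path, List<String>> pathToAliases = new HashMap<>();

    public void addPathToAlias(Path p, String alias) {
        pathToAliases.computeIfAbsent(p, k -> new ArrayList<>()).add(alias);
    }

    // Compiles, because Map.get takes Object, but a String never
    // equals() a Path key, so this always returns null.
    public List<String> lookupWrong(String dir) {
        return pathToAliases.get(dir);
    }

    // Converting to the key type first makes the lookup work.
    public List<String> lookupRight(String dir) {
        return pathToAliases.get(Paths.get(dir));
    }
}
```

This is why the `mWork.getPathToAliases().get(taskTmpDir)` check in the quoted snippet is always null: `taskTmpDir` is a String while the map is keyed by Path.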
[jira] [Commented] (HIVE-18505) Added external hive configuration to prepDb in TxnDbUtil
[ https://issues.apache.org/jira/browse/HIVE-18505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16333565#comment-16333565 ] ASF GitHub Bot commented on HIVE-18505: --- Github user chandulal closed the pull request at: https://github.com/apache/hive/pull/293 > Added external hive configuration to prepDb in TxnDbUtil > > > Key: HIVE-18505 > URL: https://issues.apache.org/jira/browse/HIVE-18505 > Project: Hive > Issue Type: Improvement > Components: Metastore >Reporter: Chandu Kavar >Assignee: Chandu Kavar >Priority: Minor > Labels: pull-request-available > > In the Hive Metastore, we have TxnDbUtil.java, which contains a few utils required > for tests. > Its prepDb() method creates a connection and executes some system queries in order > to prepare the db. While creating the connection it creates a new > HiveConf object and does not take configs from outside. > TxnDbUtil.java should also contain a prepDb method that can accept external > hive configs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-18505) Added external hive configuration to prepDb in TxnDbUtil
[ https://issues.apache.org/jira/browse/HIVE-18505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16333411#comment-16333411 ] ASF GitHub Bot commented on HIVE-18505: --- GitHub user chandulal opened a pull request: https://github.com/apache/hive/pull/293 HIVE-18505 : Adding prepDb method that accept hive configs from outside You can merge this pull request into a Git repository by running: $ git pull https://github.com/chandulal/apache-hive master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/293.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #293 commit 08d8de29831c9bea975107d5264894812337803b Author: Chandu Kavar Date: 2018-01-21T06:20:15Z HIVE-18505 : Adding prepDb method that accept hive configs from outside > Added external hive configuration to prepDb in TxnDbUtil > > > Key: HIVE-18505 > URL: https://issues.apache.org/jira/browse/HIVE-18505 > Project: Hive > Issue Type: Improvement > Components: Metastore >Reporter: Chandu Kavar >Assignee: Chandu Kavar >Priority: Minor > Labels: pull-request-available > > In the Hive Metastore, we have TxnDbUtil.java, which contains a few utils required > for tests. > Its prepDb() method creates a connection and executes some system queries in order > to prepare the db. While creating the connection it creates a new > HiveConf object and does not take configs from outside. > TxnDbUtil.java should also contain a prepDb method that can accept external > hive configs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-18505) Added external hive configuration to prepDb in TxnDbUtil
[ https://issues.apache.org/jira/browse/HIVE-18505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-18505: -- Labels: pull-request-available (was: ) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
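The fix HIVE-18505 asks for is the classic overload pattern: keep the existing no-arg prepDb() for compatibility, but delegate it to a new variant that accepts a caller-supplied configuration. A rough sketch of the pattern, using java.util.Properties as a stand-in for HiveConf (the class name, method bodies, and property key below are illustrative, not Hive's actual code):

```java
import java.util.Properties;

public class TxnDbUtilSketch {
    // Existing style: the no-arg entry point builds its own configuration,
    // so tests cannot inject settings (the problem the ticket describes).
    public static Properties prepDb() {
        return prepDb(new Properties());
    }

    // Requested overload: accept a caller-supplied configuration instead
    // of always constructing a fresh one; defaults only fill real gaps.
    public static Properties prepDb(Properties conf) {
        conf.putIfAbsent("javax.jdo.option.ConnectionURL",
                "jdbc:derby:memory:metastore;create=true");
        return conf;
    }
}
```

Delegating the old signature to the new one keeps every existing caller working while letting tests pass in their own settings.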
[jira] [Commented] (HIVE-15631) Optimize hive client logs, so you can filter the log for each session.
[ https://issues.apache.org/jira/browse/HIVE-15631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16334045#comment-16334045 ] ASF GitHub Bot commented on HIVE-15631: --- GitHub user Tartarus0zm opened a pull request: https://github.com/apache/hive/pull/295 HIVE-15631 When the Hive client is started, the sessionid is printed from the console. You can merge this pull request into a Git repository by running: $ git pull https://github.com/Tartarus0zm/hive console_sessionid Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/295.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #295 commit 1d0c2d606013498b246f5a30b1972ad8b10cb9a2 Author: Tartarus Date: 2018-01-22T08:47:56Z HIVE-15631 When the Hive client is started, the sessionid is printed from the console. > Optimize hive client logs, so you can filter the log for each > session. > --- > > Key: HIVE-15631 > URL: https://issues.apache.org/jira/browse/HIVE-15631 > Project: Hive > Issue Type: Improvement > Components: CLI, Clients, Hive >Reporter: tartarus >Assignee: tartarus >Priority: Major > Labels: pull-request-available > Attachments: HIVE_15631.patch, image-2018-01-22-16-37-26-065.png, > image-2018-01-22-16-38-20-502.png > > Original Estimate: 24h > Remaining Estimate: 24h > > We have several hadoop clusters, about 15 thousand nodes. Every day we use > hive to submit over 100 thousand jobs. > So we have a large hive log file on every client host every day, but I do > not know which lines belong to the session I submitted. > So I hope to print the hive.session.id on every log line, and then I > could use grep to find the logs of the session I submitted. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-17580) Remove dependency of get_fields_with_environment_context API to serde
[ https://issues.apache.org/jira/browse/HIVE-17580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16333955#comment-16333955 ] ASF GitHub Bot commented on HIVE-17580: --- GitHub user vihangk1 opened a pull request: https://github.com/apache/hive/pull/294 HIVE-17580 Remove dependency of get_fields_with_environment_context API to serde This is an alternative approach to solving the serde dependencies of the get_fields HMS API. The earlier attempt for HIVE-17580 was very disruptive since it attempted to move TypeInfo and various Type implementations to storage-api and also created another module called serde-api. This patch is a lot cleaner and less disruptive. Instead of moving TypeInfo, it creates similar classes in standalone-metastore. The PR is broken into multiple commits with descriptive commit messages. You can merge this pull request into a Git repository by running: $ git pull https://github.com/vihangk1/hive vihangk1_HIVE-17580v2 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/294.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #294 commit 708443af3f6356ab73133e271cf00e3418ced8ef Author: Vihang Karajgaonkar Date: 2018-01-21T23:54:04Z Added MetastoreTypeInfo similar to TypeInfo This patch adds classes similar to TypeInfo called MetastoreTypeInfo in standalone-metastore. Ideally, we should move TypeInfo to standalone-metastore since they store the information about types. However, moving TypeInfo to standalone-metastore is non-trivial effort primarily because of the below reasons: 1. TypeInfo is annotated as Public API. This means we can only alter/move these classes in a compatible way. 2. Directly moving these classes is not straightforward because TypeInfo uses the PrimitiveEntry class which internally maps the TypeInfo to Type implementations. 
Ideally metastore should not use Type implementation which makes it harder to move the TypeInfo directly. However, if we are ready to break compatibility, then TypeInfo broken such that it doesn't use PrimitiveEntry directly. In such a world TypeInfo will store just what it needs to store. Metadata of Types i.e the type category, its qualified name, whether its a parameterized type or not and if yes, how do we validate the parameters. I am assuming that breaking TypeInfo is a no-go and hence I am copying the relevant code from TypeInfo to Metastore and calling it MetastoreTypeInfo. MetastoreTypeInfo and its sub-classes are used by TypeInfoParser (also copied) to parse the column type strings into TypeInfos. commit 6ec0efa59408c355cfa9aec7fd9dd59d3545aff2 Author: Vihang Karajgaonkar Date: 2018-01-03T19:45:32Z Add avro storeage schema reader This commit adds a AvroStorageSchemaReader which reads the Avro schema files both for external schema and regular avro tables. Most of the util methods are in AvroSchemaUtils class which has methods copied from AvroSerDeUtils. Some of the needed classes like SchemaResolutionProblem, InstanceCache, SchemaToTypeInfo, TypeInfoToSchema are also copied from Hive. The constants defined in AvroSerde are copied in AvroSerdeConstants. The class AvroFieldSchemaGenerator converts the AvroSchema into List of FieldSchema which is returned by the AvroStorageSchemaReader Avro schema reader uses MetastoreTypeInfo and MetastoreTypeInfoParser introduced earlier commit b0f6d1df1ddb627e0f3c1cff3a164c9397337be0 Author: Vihang Karajgaonkar Date: 2018-01-04T01:02:40Z Introduce default storage schema reader This change introduces a default storage schema reader which copies the common code from serdes initialization method and uses it to parse the column name, type and comments from the table properties. 
For custom storage schema reades like Avro we will have to add more schema readers as and when required commit 5ae977a0bf3fd54389671bed86322d3d4652bc20 Author: Vihang Karajgaonkar Date: 2018-01-04T19:18:03Z Integrates the avro schema reader into the DefaultStorageaSchemaReader commit 2074b16e12c1bdc7ef3781f50e01ab4dd4c71890 Author: Vihang Karajgaonkar Date: 2018-01-05T02:38:28Z Added a test for getFields method in standalone-metastore commit 4159b5ee9852b41a64489274040e79dbddad54f1 Author: Vihang Karajgaonkar Date: 2018-01-22T07:16:13Z HIVE-18508 : Port schema changes from HIVE-14498 to standalone-metastore > Remove dependency of get_fields_with_environment_context API to serde > - > > Key:
[jira] [Commented] (HIVE-14660) ArrayIndexOutOfBoundsException on delete
[ https://issues.apache.org/jira/browse/HIVE-14660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16340937#comment-16340937 ] ASF GitHub Bot commented on HIVE-14660: --- Github user bonnetb closed the pull request at: https://github.com/apache/hive/pull/100 > ArrayIndexOutOfBoundsException on delete > > > Key: HIVE-14660 > URL: https://issues.apache.org/jira/browse/HIVE-14660 > Project: Hive > Issue Type: Bug > Components: Query Processor, Transactions >Affects Versions: 1.2.1 >Reporter: Benjamin BONNET >Assignee: Benjamin BONNET >Priority: Major > Labels: pull-request-available > Attachments: HIVE-14660.1-banch-1.2.patch > > > Hi, > DELETE on an ACID table may fail with an ArrayIndexOutOfBoundsException. > That bug occurs in the Reduce phase when there are fewer reducers than > table buckets. > In order to reproduce, create a simple ACID table : > {code:sql} > CREATE TABLE test (`cle` bigint,`valeur` string) > PARTITIONED BY (`annee` string) > CLUSTERED BY (cle) INTO 5 BUCKETS > TBLPROPERTIES ('transactional'='true'); > {code} > Populate it with lines distributed among all buckets, with random values and > a few partitions. > Force the number of reducers to be less than the number of buckets : > {code:sql} > set mapred.reduce.tasks=1; > {code} > Then execute a delete that will remove many lines from all the buckets. 
> {code:sql} > DELETE FROM test WHERE valeur<'some_value'; > {code} > Then you will get an ArrayIndexOutOfBoundsException : > {code} > 2016-08-22 21:21:02,500 [FATAL] [TezChild] |tez.ReduceRecordSource|: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing row (tag=0) > {"key":{"reducesinkkey0":{"transactionid":119,"bucketid":0,"rowid":0}},"value":{"_col0":"4"}} > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:352) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:274) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:252) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:181) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: 
java.lang.ArrayIndexOutOfBoundsException: 5 > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:769) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:838) > at > org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:343) > ... 17 more > {code} > Adding logs into FileSinkOperator, one sees the operator deals with buckets > 0, 1, 2, 3, 4, then 0 again, and it fails at line 769: each time you > switch buckets, you move forward in a 5-element array (one element per bucket). > So when you get bucket 0 for the second time, you run off the end of the array... -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-14660) ArrayIndexOutOfBoundsException on delete
[ https://issues.apache.org/jira/browse/HIVE-14660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-14660: -- Labels: pull-request-available (was: ) > ArrayIndexOutOfBoundsException on delete > > > Key: HIVE-14660 > URL: https://issues.apache.org/jira/browse/HIVE-14660 > Project: Hive > Issue Type: Bug > Components: Query Processor, Transactions >Affects Versions: 1.2.1 >Reporter: Benjamin BONNET >Assignee: Benjamin BONNET >Priority: Major > Labels: pull-request-available > Attachments: HIVE-14660.1-banch-1.2.patch > > > Hi, > DELETE on an ACID table may fail on an ArrayIndexOutOfBoundsException. > That bug occurs at Reduce phase when there are less reducers than the number > of the table buckets. > In order to reproduce, create a simple ACID table : > {code:sql} > CREATE TABLE test (`cle` bigint,`valeur` string) > PARTITIONED BY (`annee` string) > CLUSTERED BY (cle) INTO 5 BUCKETS > TBLPROPERTIES ('transactional'='true'); > {code} > Populate it with lines distributed among all buckets, with random values and > a few partitions. > Force the Reducers to be less than the buckets : > {code:sql} > set mapred.reduce.tasks=1; > {code} > Then execute a delete that will remove many lines from all the buckets. 
> {code:sql} > DELETE FROM test WHERE valeur<'some_value'; > {code} > Then you will get an ArrayIndexOutOfBoundsException : > {code} > 2016-08-22 21:21:02,500 [FATAL] [TezChild] |tez.ReduceRecordSource|: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing row (tag=0) > {"key":{"reducesinkkey0":{"transactionid":119,"bucketid":0,"rowid":0}},"value":{"_col0":"4"}} > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:352) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:274) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:252) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:181) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: 
java.lang.ArrayIndexOutOfBoundsException: 5 > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:769) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:838) > at > org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:343) > ... 17 more > {code} > Adding logs to FileSinkOperator, one sees the operator handle buckets > 0, 1, 2, 3, 4, then 0 again, and fail at line 769: each time the bucket changes, the operator > moves forward in an array of 5 elements (the number of buckets), so when bucket 0 arrives > a second time, the index runs past the end of the array. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
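The failure mode described above can be reduced to a small sketch (hypothetical, heavily simplified from FileSinkOperator's actual bucket handling): an array with one slot per bucket, indexed by a cursor that advances on every bucket switch instead of by the bucket id itself. With fewer reducers than buckets, a single reducer sees the bucket ids wrap around and the cursor overruns the array.

```java
// Simplified, hypothetical model of the bug: 5 writer slots (one per
// bucket), but the slot index is a cursor that advances on every bucket
// switch. When one reducer handles all buckets and then sees bucket 0
// again, the cursor reaches 5 and indexing throws.
public class BucketCursorDemo {
    static final int NUM_BUCKETS = 5;

    // Returns true if processing the given bucket-id sequence overruns
    // the per-bucket array, i.e. reproduces the AIOOBE.
    static boolean overruns(int[] bucketIds) {
        String[] writers = new String[NUM_BUCKETS];
        int cursor = -1, lastBucket = -1;
        try {
            for (int b : bucketIds) {
                if (b != lastBucket) { // bucket switch -> move to next slot
                    cursor++;
                    lastBucket = b;
                }
                writers[cursor] = "writer-" + b;
            }
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // All 5 buckets in order: fits exactly, no overrun.
        System.out.println(overruns(new int[] {0, 1, 2, 3, 4}));    // false
        // Bucket 0 comes around again: cursor hits index 5 -> overrun.
        System.out.println(overruns(new int[] {0, 1, 2, 3, 4, 0})); // true
    }
}
```

A fix along these lines would index the slots by bucket id (writers[b]) rather than by a running cursor; the actual patch attached to the issue may take a different approach.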
[jira] [Commented] (HIVE-14660) ArrayIndexOutOfBoundsException on delete
[ https://issues.apache.org/jira/browse/HIVE-14660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16340945#comment-16340945 ] ASF GitHub Bot commented on HIVE-14660: --- GitHub user bonnetb opened a pull request: https://github.com/apache/hive/pull/299 HIVE-14660: ArrayIndexOutOfBounds on delete. See https://issues.apache.org/jira/browse/HIVE-14660 You can merge this pull request into a Git repository by running: $ git pull https://github.com/bonnetb/hive HIVE-14660 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/299.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #299 commit 323f4bfa92835921780c057082b440bf54a7f5c8 Author: Benjamin BONNET Date: 2016-08-27T20:20:15Z HIVE-14660: ArrayIndexOutOfBounds on delete -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-17331) Path must be used as key type of the pathToAlises
[ https://issues.apache.org/jira/browse/HIVE-17331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-17331: -- Labels: pull-request-available (was: ) > Path must be used as key type of the pathToAlises > - > > Key: HIVE-17331 > URL: https://issues.apache.org/jira/browse/HIVE-17331 > Project: Hive > Issue Type: Bug >Reporter: Oleg Danilov >Assignee: Oleg Danilov >Priority: Minor > Labels: pull-request-available > Attachments: HIVE-17331.patch > > > This code uses String instead of Path as the key type of the pathToAliases map, > so it seems that get(String) always returns null. > +*GenMapRedUtils.java*+ > {code:java} > for (int pos = 0; pos < size; pos++) { > String taskTmpDir = taskTmpDirLst.get(pos); > TableDesc tt_desc = tt_descLst.get(pos); > MapWork mWork = plan.getMapWork(); > if (mWork.getPathToAliases().get(taskTmpDir) == null) { > taskTmpDir = taskTmpDir.intern(); > Path taskTmpDirPath = > StringInternUtils.internUriStringsInPath(new Path(taskTmpDir)); > mWork.removePathToAlias(taskTmpDirPath); > mWork.addPathToAlias(taskTmpDirPath, taskTmpDir); > mWork.addPathToPartitionInfo(taskTmpDirPath, new > PartitionDesc(tt_desc, null)); > mWork.getAliasToWork().put(taskTmpDir, topOperators.get(pos)); > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
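The mismatch is easy to reproduce outside Hive. The sketch below uses java.nio.file.Path in place of Hadoop's org.apache.hadoop.fs.Path; since Map.get accepts any Object, passing a String key to a Path-keyed map compiles silently but never matches.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Map.get takes Object, so looking up a Path-keyed map with a String
// compiles without warning but always returns null -- the same shape of
// bug as passing taskTmpDir (a String) to pathToAliases (keyed by Path).
public class PathKeyLookup {
    static Map<Path, String> pathToAliases = new HashMap<>();

    static String lookup(Object key) {
        return pathToAliases.get(key);
    }

    public static void main(String[] args) {
        String taskTmpDir = "/tmp/task_1";
        pathToAliases.put(Paths.get(taskTmpDir), "alias_1");

        System.out.println(lookup(taskTmpDir));            // null: String never equals Path
        System.out.println(lookup(Paths.get(taskTmpDir))); // alias_1
    }
}
```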
[jira] [Commented] (HIVE-17331) Path must be used as key type of the pathToAlises
[ https://issues.apache.org/jira/browse/HIVE-17331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16327994#comment-16327994 ] ASF GitHub Bot commented on HIVE-17331: --- Github user dosoft closed the pull request at: https://github.com/apache/hive/pull/233 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-18192) Introduce WriteID per table rather than using global transaction ID
[ https://issues.apache.org/jira/browse/HIVE-18192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325734#comment-16325734 ] ASF GitHub Bot commented on HIVE-18192: --- GitHub user sankarh opened a pull request: https://github.com/apache/hive/pull/290 HIVE-18192: Introduce WriteID per table rather than using global transaction ID You can merge this pull request into a Git repository by running: $ git pull https://github.com/sankarh/hive HIVE-18192 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/290.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #290 commit ced6d749a7e65c42ed31a8af9e38faaf5941251a Author: Sankar Hariappan Date: 2018-01-03T05:47:38Z HIVE-18192: Introduce WriteID per table rather than using global transaction ID > Introduce WriteID per table rather than using global transaction ID > --- > > Key: HIVE-18192 > URL: https://issues.apache.org/jira/browse/HIVE-18192 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, Transactions >Affects Versions: 3.0.0 >Reporter: anishek >Assignee: Sankar Hariappan > Labels: ACID, DR, pull-request-available > Fix For: 3.0.0 > > > To support ACID replication, we will be introducing a per-table write id > which will replace the transaction id in the primary key for each row in an > ACID table. > The current primary key is determined via > > which will move to > > A persistable map of global txn id -> table -> write id for that table has > to be maintained to allow snapshot isolation. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
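The mapping the description calls for can be sketched roughly as follows (the class and method names are illustrative only, not Hive's actual metastore schema): a map from global txn id to, per table, that table's write id, where write ids increase per table and a txn that already touched a table reuses its allocated id.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative in-memory shape of the mapping: global txn id ->
// (fully qualified table name -> per-table write id). Write ids
// increase monotonically per table; a txn that already touched a
// table gets back the write id it was already allocated.
public class TxnToWriteId {
    private final Map<Long, Map<String, Long>> txnToTableWriteId = new HashMap<>();
    private final Map<String, Long> lastWriteId = new HashMap<>();

    public long allocate(long txnId, String table) {
        return txnToTableWriteId
            .computeIfAbsent(txnId, t -> new HashMap<>())
            .computeIfAbsent(table, tbl -> lastWriteId.merge(tbl, 1L, Long::sum));
    }
}
```

For example, two different transactions writing to the same table get write ids 1 and 2, while repeated writes within one transaction keep the same write id; in the real system this map would also have to be persisted in the metastore, as the issue notes.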
[jira] [Updated] (HIVE-18192) Introduce WriteID per table rather than using global transaction ID
[ https://issues.apache.org/jira/browse/HIVE-18192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-18192: -- Labels: ACID DR pull-request-available (was: ACID DR) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18478) Drop of temp table creating recycle files at CM path
[ https://issues.apache.org/jira/browse/HIVE-18478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-18478: -- Labels: pull-request-available (was: ) > Drop of temp table creating recycle files at CM path > > > Key: HIVE-18478 > URL: https://issues.apache.org/jira/browse/HIVE-18478 > Project: Hive > Issue Type: Sub-task > Components: Hive, HiveServer2 >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Minor > Labels: pull-request-available > Fix For: 3.0.0 > > > The drop TEMP table operation invokes deleteDir, which moves the files to $CMROOT; > this is unnecessary, as temp tables need not be replicated. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-18478) Drop of temp table creating recycle files at CM path
[ https://issues.apache.org/jira/browse/HIVE-18478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16339437#comment-16339437 ] ASF GitHub Bot commented on HIVE-18478: --- GitHub user maheshk114 opened a pull request: https://github.com/apache/hive/pull/298 HIVE-18478: Avoiding creation of CM recycle file in case of temp table In case of drop table, truncate table, load table, etc., the table's files are deleted, which can cause issues during replication. To solve this, the old files are stored in the CM directory for replication to use later. But temporary tables are not replicated, so creating these files is not required. Extra checks are therefore added to avoid creating recycle files for temp tables. You can merge this pull request into a Git repository by running: $ git pull https://github.com/maheshk114/hive HIVE-18478 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/298.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #298 commit 2a5dd87618c1a323a230eafe1906a7ad8b9e7af7 Author: Mahesh Kumar Behera Date: 2018-01-25T16:06:27Z HIVE-18478: Avoiding creation of CM recycle file in case of temp table -- This message was sent by Atlassian JIRA (v7.6.3#76005)
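The check described in this pull request amounts to a guard of roughly the following shape (the method and parameter names here are hypothetical, not the actual Hive code): before moving deleted files to the CM root, verify that change management applies and the table is not temporary.

```java
// Hypothetical guard: files are recycled to $CMROOT only when change
// management is enabled AND the table is not a temporary table, since
// temp tables are never replicated and need no CM copy.
public class CmRecycleGuard {
    public static boolean shouldRecycleToCm(boolean cmEnabled, boolean isTempTable) {
        return cmEnabled && !isTempTable;
    }

    public static void main(String[] args) {
        System.out.println(shouldRecycleToCm(true, false)); // true: normal table, recycle
        System.out.println(shouldRecycleToCm(true, true));  // false: temp table, skip
    }
}
```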
[jira] [Commented] (HIVE-15631) Optimize for hive client logs , you can filter the log for each session itself.
[ https://issues.apache.org/jira/browse/HIVE-15631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16339270#comment-16339270 ] ASF GitHub Bot commented on HIVE-15631: --- Github user Tartarus0zm closed the pull request at: https://github.com/apache/hive/pull/295 > Optimize for hive client logs , you can filter the log for each session > itself. > --- > > Key: HIVE-15631 > URL: https://issues.apache.org/jira/browse/HIVE-15631 > Project: Hive > Issue Type: Improvement > Components: CLI, Clients, Hive >Reporter: tartarus >Assignee: tartarus >Priority: Major > Labels: pull-request-available > Attachments: HIVE-15631.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > We have several Hadoop clusters, about 15 thousand nodes in total, and every day we > submit more than 100 thousand Hive jobs. > So every client host accumulates a large Hive log file each day, but I cannot tell > which lines were logged by the session I submitted. > I would like to print hive.session.id on every line of the logs, so that I > could use grep to find the logs of my session. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
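What the reporter asks for can be illustrated with a tiny sketch (hypothetical, not Hive's logging code): tag every log line with the session id, so that a single grep isolates one session's output in a shared client log file.

```java
// Illustrative only: prefixing each line with the session id makes
// `grep 'sess-42' hive.log` sufficient to pull out one session.
public class SessionTaggedLog {
    private final String sessionId;

    public SessionTaggedLog(String sessionId) {
        this.sessionId = sessionId;
    }

    public String line(String msg) {
        return "[" + sessionId + "] " + msg;
    }

    public static void main(String[] args) {
        SessionTaggedLog log = new SessionTaggedLog("sess-42");
        System.out.println(log.line("Compiling query")); // [sess-42] Compiling query
    }
}
```

In a real Log4j 2 setup this is usually done without touching call sites, by putting the session id into the ThreadContext map and adding %X{sessionId} (or a similar key) to the layout pattern.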
[jira] [Updated] (HIVE-18543) Add print sessionid in console
[ https://issues.apache.org/jira/browse/HIVE-18543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-18543: -- Labels: pull-request-available (was: ) > Add print sessionid in console > -- > > Key: HIVE-18543 > URL: https://issues.apache.org/jira/browse/HIVE-18543 > Project: Hive > Issue Type: Improvement > Components: CLI, Clients >Affects Versions: 2.3.2 > Environment: CentOS6.5 > Hive-1.2.1 > Hive-2.3.2 >Reporter: tartarus >Assignee: tartarus >Priority: Major > Labels: pull-request-available > Fix For: 3.0.0 > > Original Estimate: 24h > Remaining Estimate: 24h > > The Hive client log file already contains the session id, but the console does not, > so the user cannot easily correlate console output with the log. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-18543) Add print sessionid in console
[ https://issues.apache.org/jira/browse/HIVE-18543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16339276#comment-16339276 ] ASF GitHub Bot commented on HIVE-18543: --- GitHub user Tartarus0zm opened a pull request: https://github.com/apache/hive/pull/297 HIVE-18543 Add print sessionid in console You can merge this pull request into a Git repository by running: $ git pull https://github.com/Tartarus0zm/hive patch-1 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/297.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #297 commit 2af02a96de32ede92921a086b04e68e209a27b39 Author: zhangmang Date: 2018-01-25T14:23:51Z Add print sessionid in console -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-18467) support whole warehouse dump / load + create/drop database events
[ https://issues.apache.org/jira/browse/HIVE-18467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-18467: -- Labels: pull-request-available (was: ) > support whole warehouse dump / load + create/drop database events > - > > Key: HIVE-18467 > URL: https://issues.apache.org/jira/browse/HIVE-18467 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: anishek >Assignee: anishek >Priority: Major > Labels: pull-request-available > Fix For: 3.0.0 > > Attachments: HIVE-18467.0.patch > > > For certain use cases, a complete Hive warehouse might need to be replicated to a > DR site. Rather than allowing only a database name in the REPL > DUMP command, we should allow dumping all databases using the "*" option, > as in > _REPL DUMP *_ > On the repl load side there will be no option to specify the database > name when loading from a location used to dump multiple databases; hence only > _REPL LOAD FROM [location]_ would be supported when dumping via _REPL DUMP *_ > Additionally, incremental dumps will go through all events across databases > in a warehouse, so CREATE / DROP Database events have to be serialized > correctly to allow repl load to apply them correctly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-18467) support whole warehouse dump / load + create/drop database events
[ https://issues.apache.org/jira/browse/HIVE-18467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344709#comment-16344709 ] ASF GitHub Bot commented on HIVE-18467: --- GitHub user anishek opened a pull request: https://github.com/apache/hive/pull/300 HIVE-18467: support whole warehouse dump / load + create/drop database events You can merge this pull request into a Git repository by running: $ git pull https://github.com/anishek/hive HIVE-18467 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/300.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #300 commit 59e3afde676cb7666bc202858c2bcdc6958b0861 Author: Anishek Agarwal Date: 2018-01-19T08:01:28Z HIVE-18467: support whole warehouse dump / load + create/drop database events 
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-18031) Support replication for Alter Database operation.
[ https://issues.apache.org/jira/browse/HIVE-18031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346321#comment-16346321 ] ASF GitHub Bot commented on HIVE-18031: --- Github user sankarh closed the pull request at: https://github.com/apache/hive/pull/280 > Support replication for Alter Database operation. > - > > Key: HIVE-18031 > URL: https://issues.apache.org/jira/browse/HIVE-18031 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, repl >Affects Versions: 3.0.0 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan >Priority: Major > Labels: DR, pull-request-available, replication > Fix For: 3.0.0 > > Attachments: HIVE-18031.01.patch, HIVE-18031.02.patch > > > Currently, alter database operations that change the database properties or owner > info do not generate any events, so they are not replicated. > An event needs to be added for this. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-18581) Replication events should use lower case db object names
[ https://issues.apache.org/jira/browse/HIVE-18581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349878#comment-16349878 ] ASF GitHub Bot commented on HIVE-18581: --- Github user anishek closed the pull request at: https://github.com/apache/hive/pull/304 > Replication events should use lower case db object names > > > Key: HIVE-18581 > URL: https://issues.apache.org/jira/browse/HIVE-18581 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: anishek >Assignee: anishek >Priority: Minor > Labels: pull-request-available > Attachments: HIVE-18581.0.patch, HIVE-18581.1.patch > > > Events generated by replication should include database / table / > partition / function names in lower case. This will save other > applications from having to do explicit case-insensitive matching of objects by name. > In Hive, all db object names as specified above are explicitly converted to > lower case when comparing objects of the same type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-18478) Data files deleted from temp table should not be recycled to CM path
[ https://issues.apache.org/jira/browse/HIVE-18478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349949#comment-16349949 ] ASF GitHub Bot commented on HIVE-18478: --- Github user maheshk114 closed the pull request at: https://github.com/apache/hive/pull/298 > Data files deleted from temp table should not be recycled to CM path > > > Key: HIVE-18478 > URL: https://issues.apache.org/jira/browse/HIVE-18478 > Project: Hive > Issue Type: Sub-task > Components: Hive, HiveServer2 >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Minor > Labels: pull-request-available > Fix For: 3.0.0 > > Attachments: HIVE-18478.01.patch, HIVE-18478.02.patch, > HIVE-18478.03.patch -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-18543) Add print sessionid in console
[ https://issues.apache.org/jira/browse/HIVE-18543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346607#comment-16346607 ] ASF GitHub Bot commented on HIVE-18543: --- GitHub user Tartarus0zm opened a pull request: https://github.com/apache/hive/pull/303 HIVE-18543 Add print sessionid in console You can merge this pull request into a Git repository by running: $ git pull https://github.com/Tartarus0zm/hive console_sessionid Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/303.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #303 commit 9a89f7774a113ba11e42ed1183d60c4ea86bb67d Author: Tartarus Date: 2018-01-31T11:00:44Z HIVE-18543 Add print sessionid in console > Add print sessionid in console > -- > > Key: HIVE-18543 > URL: https://issues.apache.org/jira/browse/HIVE-18543 > Project: Hive > Issue Type: Improvement > Components: CLI, Clients >Affects Versions: 2.3.2 > Environment: CentOS6.5 > Hive-1.2.1 > Hive-2.3.2 >Reporter: tartarus >Assignee: tartarus >Priority: Major > Labels: pull-request-available > Fix For: 3.0.0 > > Attachments: HIVE_18543.patch -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-18581) Replication events should use lower case db object names
[ https://issues.apache.org/jira/browse/HIVE-18581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-18581: -- Labels: pull-request-available (was: ) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-18581) Replication events should use lower case db object names
[ https://issues.apache.org/jira/browse/HIVE-18581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348164#comment-16348164 ] ASF GitHub Bot commented on HIVE-18581: --- GitHub user anishek opened a pull request: https://github.com/apache/hive/pull/304 HIVE-18581: Replication events should use lower case db object names You can merge this pull request into a Git repository by running: $ git pull https://github.com/anishek/hive HIVE-18581 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/304.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #304 commit 553f0de490d49b77294a70875264819b387e2a45 Author: Anishek Agarwal Date: 2018-01-31T10:12:31Z HIVE-18581: Replication events should use lower case db object names -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-18679) create/replicate open transaction event
[ https://issues.apache.org/jira/browse/HIVE-18679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16360819#comment-16360819 ] ASF GitHub Bot commented on HIVE-18679: --- GitHub user maheshk114 opened a pull request: https://github.com/apache/hive/pull/305 HIVE-18679: create/replicate open transaction event EVENT_OPEN_TXN: Source Warehouse: Create new event type EVENT_OPEN_TXN with related message format etc. When any transaction is opened either by auto-commit mode or multi-statement mode, need to capture this event. Repl dump should read this event from EventNotificationTable and dump the message. Target Warehouse: Repl load should read the event from the dump and get the message. Open a txn in the target warehouse. Create a map of source txn ID against target txn ID and persist the same in the metastore. There should be one map per replication policy (DBName.* in case of DB level replication, DBName.TableName in case of table level replication) You can merge this pull request into a Git repository by running: $ git pull https://github.com/maheshk114/hive BUG-95520 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/305.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #305 commit 4a855de860ccec6e37e4c16ebbddee575e9ae2f2 Author: Mahesh Kumar Behera Date: 2018-02-12T05:07:01Z HIVE-18679: create/replicate open transaction event commit c745a4066b31075004b96200da079b4dd4fd2743 Author: Mahesh Kumar Behera Date: 2018-02-12T14:29:54Z HIVE-18679: create/replicate open transaction event: rebased with Alan's change > create/replicate open transaction event > --- > > Key: HIVE-18679 > URL: https://issues.apache.org/jira/browse/HIVE-18679 > Project: Hive > Issue Type: Bug > Components: repl, Transactions >Affects Versions: 3.0.0 >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: 
pull-request-available > Fix For: 3.0.0 > > > *EVENT_OPEN_TXN:* > *Source Warehouse:* > - Create new event type EVENT_OPEN_TXN with related message format etc. > - When any transaction is opened either by auto-commit mode or > multi-statement mode, need to capture this event. > - Repl dump should read this event from EventNotificationTable and dump the > message. > *Target Warehouse:* > - Repl load should read the event from the dump and get the message. > - Open a txn in the target warehouse. > - Create a map of source txn ID against target txn ID and persist the same > in the metastore. There should be one map per replication policy (DBName.* in case > of DB level replication, DBName.TableName in case of table level replication) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
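The source-to-target txn mapping the event description calls for could look roughly like this (an illustrative in-memory shape only, not the metastore persistence the issue requires): one map per replication policy, keyed "DBName.*" for DB-level and "DBName.TableName" for table-level replication.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative: one (source txn id -> target txn id) map per
// replication policy, as the EVENT_OPEN_TXN description requires.
public class ReplTxnMapping {
    private final Map<String, Map<Long, Long>> perPolicy = new HashMap<>();

    // Record that a source-warehouse txn was replayed as a target txn
    // under the given replication policy.
    public void record(String policy, long sourceTxnId, long targetTxnId) {
        perPolicy.computeIfAbsent(policy, p -> new HashMap<>())
                 .put(sourceTxnId, targetTxnId);
    }

    // Returns the target txn id, or null if this policy never saw the txn.
    public Long targetTxn(String policy, long sourceTxnId) {
        Map<Long, Long> m = perPolicy.get(policy);
        return m == null ? null : m.get(sourceTxnId);
    }
}
```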
[jira] [Updated] (HIVE-18679) create/replicate open transaction event
[ https://issues.apache.org/jira/browse/HIVE-18679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-18679: -- Labels: pull-request-available (was: ) -- This message was sent by Atlassian JIRA (v7.6.3#76005)