[jira] [Created] (HIVE-12865) Exchange partition does not show inputs field for post/pre execute hooks
Paul Yang created HIVE-12865: Summary: Exchange partition does not show inputs field for post/pre execute hooks Key: HIVE-12865 URL: https://issues.apache.org/jira/browse/HIVE-12865 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0 Reporter: Paul Yang The pre/post execute hook interface has fields that indicate which Hive objects were read / written to as a result of running the query. For the exchange partition operation, the read entity field is empty. This is an important issue as the hook interface may be configured to perform critical warehouse operations. See ql/src/test/results/clientpositive/exchange_partition3.q.out
{code}
--- a/ql/src/test/results/clientpositive/exchange_partition3.q.out
+++ b/ql/src/test/results/clientpositive/exchange_partition3.q.out
@@ -65,9 +65,17 @@
 ds=2013-04-05/hr=2
 PREHOOK: query: -- This will exchange both partitions hr=1 and hr=2
 ALTER TABLE exchange_part_test1 EXCHANGE PARTITION (ds='2013-04-05') WITH TABLE exchange_part_test2
 PREHOOK: type: ALTERTABLE_EXCHANGEPARTITION
+PREHOOK: Output: default@exchange_part_test1
+PREHOOK: Output: default@exchange_part_test2
 POSTHOOK: query: -- This will exchange both partitions hr=1 and hr=2
 ALTER TABLE exchange_part_test1 EXCHANGE PARTITION (ds='2013-04-05') WITH TABLE exchange_part_test2
 POSTHOOK: type: ALTERTABLE_EXCHANGEPARTITION
+POSTHOOK: Output: default@exchange_part_test1
+POSTHOOK: Output: default@exchange_part_test1@ds=2013-04-05/hr=1
+POSTHOOK: Output: default@exchange_part_test1@ds=2013-04-05/hr=2
+POSTHOOK: Output: default@exchange_part_test2
+POSTHOOK: Output: default@exchange_part_test2@ds=2013-04-05/hr=1
+POSTHOOK: Output: default@exchange_part_test2@ds=2013-04-05/hr=2
 PREHOOK: query: SHOW PARTITIONS exchange_part_test1
 PREHOOK: type: SHOWPARTITIONS
 PREHOOK: Input: default@exchange_part_test1
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11554) Exchange partition outputs missing from post execute hooks
Paul Yang created HIVE-11554: Summary: Exchange partition outputs missing from post execute hooks Key: HIVE-11554 URL: https://issues.apache.org/jira/browse/HIVE-11554 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 1.2.0, 1.0.0, 0.14.0, 0.13.0, 0.12.0 Reporter: Paul Yang The pre/post execute hook interface has fields that indicate which Hive objects were read / written to as a result of running the query. For the exchange partition operation, these fields (ReadEntity and WriteEntity) are empty. This is an important issue as the hook interface may be configured to perform critical warehouse operations. See {noformat} ql/src/test/results/clientpositive/exchange_partition3.q.out {noformat} {noformat} POSTHOOK: query: -- This will exchange both partitions hr=1 and hr=2 ALTER TABLE exchange_part_test1 EXCHANGE PARTITION (ds='2013-04-05') WITH TABLE exchange_part_test2 POSTHOOK: type: null {noformat} The post hook should not say null. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-3042) thrift jars do not need to be passed to the mappers and reducers
[ https://issues.apache.org/jira/browse/HIVE-3042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-3042: Attachment: HIVE-3042.1.patch thrift jars do not need to be passed to the mappers and reducers Key: HIVE-3042 URL: https://issues.apache.org/jira/browse/HIVE-3042 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Namit Jain Assignee: Paul Yang Fix For: 0.10.0 Attachments: HIVE-3042.1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3042) Thrift classes do not need to be passed to the mappers and reducers
[ https://issues.apache.org/jira/browse/HIVE-3042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-3042: Summary: Thrift classes do not need to be passed to the mappers and reducers (was: thrift jars do not need to be passed to the mappers and reducers) Thrift classes do not need to be passed to the mappers and reducers --- Key: HIVE-3042 URL: https://issues.apache.org/jira/browse/HIVE-3042 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Namit Jain Assignee: Paul Yang Fix For: 0.10.0 Attachments: HIVE-3042.1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3042) Thrift classes do not need to be passed to the mappers and reducers
[ https://issues.apache.org/jira/browse/HIVE-3042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-3042: Fix Version/s: 0.10.0 Affects Version/s: 0.10.0 Status: Patch Available (was: Open) Thrift classes do not need to be passed to the mappers and reducers --- Key: HIVE-3042 URL: https://issues.apache.org/jira/browse/HIVE-3042 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Namit Jain Assignee: Paul Yang Fix For: 0.10.0 Attachments: HIVE-3042.1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3042) Thrift classes do not need to be passed to the mappers and reducers
[ https://issues.apache.org/jira/browse/HIVE-3042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-3042: Resolution: Duplicate Status: Resolved (was: Patch Available) Thrift classes do not need to be passed to the mappers and reducers --- Key: HIVE-3042 URL: https://issues.apache.org/jira/browse/HIVE-3042 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Namit Jain Assignee: Paul Yang Fix For: 0.10.0 Attachments: HIVE-3042.1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3042) Thrift classes do not need to be passed to the mappers and reducers
[ https://issues.apache.org/jira/browse/HIVE-3042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280584#comment-13280584 ] Paul Yang commented on HIVE-3042: - Duplicate of HIVE-3040 Thrift classes do not need to be passed to the mappers and reducers --- Key: HIVE-3042 URL: https://issues.apache.org/jira/browse/HIVE-3042 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Namit Jain Assignee: Paul Yang Fix For: 0.10.0 Attachments: HIVE-3042.1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-3000) Potential infinite loop / log spew in ZookeeperHiveLockManager
Paul Yang created HIVE-3000: Summary: Potential infinite loop / log spew in ZookeeperHiveLockManager Key: HIVE-3000 URL: https://issues.apache.org/jira/browse/HIVE-3000 Project: Hive Issue Type: Bug Components: Locking Affects Versions: 0.9.0 Reporter: Paul Yang See ZookeeperHiveLockManager.lock(). If ZooKeeper is in a bad state, the call to lockPrimitive() can throw an exception (e.g. org.apache.zookeeper.KeeperException$SessionExpiredException). There is a bug in the exception handler: the break in the switch statement exits only the switch, not the enclosing do..while loop, so the loop does not terminate. Because tryNum is not incremented when the exception is thrown, lockPrimitive() is called in an infinite loop, as fast as possible. Since the exception is printed on every call, Hive produces significant log spew.
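The control-flow problem described above can be reproduced in isolation. The following is a minimal sketch, not the actual lock manager code: the class LockRetrySketch, the always-failing lockPrimitive() stand-in, and the safetyCap parameter (present only so the demo terminates) are all hypothetical.

```java
class LockRetrySketch {
    static int calls; // number of times the failing primitive was invoked

    // Hypothetical stand-in for lockPrimitive(): always fails, as if the session had expired.
    static void lockPrimitive() {
        calls++;
        throw new IllegalStateException("session expired");
    }

    // Buggy shape from the report. safetyCap is demo-only; without it this never returns.
    static int buggy(int maxTries, int safetyCap) {
        calls = 0;
        int tryNum = 0;
        do {
            try {
                lockPrimitive();
                tryNum++; // skipped whenever lockPrimitive() throws
            } catch (IllegalStateException e) {
                switch (e.getMessage()) {
                    case "session expired":
                        break; // leaves the switch only; the do..while keeps going
                    default:
                        throw e;
                }
            }
        } while (tryNum < maxTries && calls < safetyCap);
        return calls;
    }

    // One possible repair (an assumption, not the committed fix): count the attempt in
    // finally, so a failed call still consumes a retry.
    static int fixed(int maxTries) {
        calls = 0;
        int tryNum = 0;
        do {
            try {
                lockPrimitive();
            } catch (IllegalStateException e) {
                // a real handler would log once per attempt here
            } finally {
                tryNum++;
            }
        } while (tryNum < maxTries);
        return calls;
    }

    public static void main(String[] args) {
        System.out.println("buggy calls (capped at 50): " + buggy(3, 50));
        System.out.println("fixed calls: " + fixed(3));
    }
}
```

With three retries allowed, the buggy variant only stops at the demo cap, while the repaired variant stops after three calls. A labeled break or rethrowing from the handler would end the loop as well.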
[jira] [Commented] (HIVE-2246) Dedupe tables' column schemas from partitions in the metastore db
[ https://issues.apache.org/jira/browse/HIVE-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083751#comment-13083751 ] Paul Yang commented on HIVE-2246: Some issues have been identified with this patch. We will be doing some additional testing, but we might roll it back so that we don't leave trunk in an unstable state. Dedupe tables' column schemas from partitions in the metastore db Key: HIVE-2246 URL: https://issues.apache.org/jira/browse/HIVE-2246 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sohan Jain Assignee: Sohan Jain Fix For: 0.8.0 Attachments: HIVE-2246.2.patch, HIVE-2246.3.patch, HIVE-2246.4.patch, HIVE-2246.8.patch Note: this patch proposes a schema change, and is therefore incompatible with the current metastore. We can re-organize the JDO models to reduce space usage to keep the metastore scalable for the future. Currently, partitions are the fastest growing objects in the metastore, and the metastore keeps a separate copy of the columns list for each partition. We can normalize the metastore db by decoupling Columns from Storage Descriptors and not storing duplicate lists of the columns for each partition. An idea is to create an additional level of indirection with a Column Descriptor that has a list of columns. A table has a reference to its latest Column Descriptor (note: a table may have more than one Column Descriptor in the case of schema evolution). Partitions and Indexes can reference the same Column Descriptors as their parent table. Currently, the COLUMNS table in the metastore has roughly (number of partitions + number of tables) * (average number of columns per table) rows. We can reduce this to (number of tables) * (average number of columns per table) rows, while incurring a small cost proportional to the number of tables to store the Column Descriptors. Please see the latest review board for additional implementation details.
[jira] [Commented] (HIVE-2322) Add ColumnarSerDe to the list of native SerDes
[ https://issues.apache.org/jira/browse/HIVE-2322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13083854#comment-13083854 ] Paul Yang commented on HIVE-2322: - +1. Tested and will commit. Add ColumnarSerDe to the list of native SerDes -- Key: HIVE-2322 URL: https://issues.apache.org/jira/browse/HIVE-2322 Project: Hive Issue Type: Bug Components: Metastore, Serializers/Deserializers Reporter: Sohan Jain Assignee: Sohan Jain Attachments: HIVE-2322.1.patch, HIVE-2322.2.patch, HIVE-2322.3.patch, HIVE-2322.4.patch, HIVE-2322.5.patch We store metadata about ColumnarSerDes in the metastore, so it should be considered a native SerDe. Then, column information can be retrieved from the metastore instead of from deserialization. Currently, for non-native SerDes, column comments are only shown as from deserializer. Adding ColumnarSerDe to the list of native SerDes will persist column comments. See HIVE-2171 for persisting the column comments of custom SerDes. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2322) Add ColumnarSerDe to the list of native SerDes
[ https://issues.apache.org/jira/browse/HIVE-2322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13083865#comment-13083865 ] Paul Yang commented on HIVE-2322: - Committed. Thanks Sohan! Add ColumnarSerDe to the list of native SerDes -- Key: HIVE-2322 URL: https://issues.apache.org/jira/browse/HIVE-2322 Project: Hive Issue Type: Bug Components: Metastore, Serializers/Deserializers Reporter: Sohan Jain Assignee: Sohan Jain Attachments: HIVE-2322.1.patch, HIVE-2322.2.patch, HIVE-2322.3.patch, HIVE-2322.4.patch, HIVE-2322.5.patch We store metadata about ColumnarSerDes in the metastore, so it should be considered a native SerDe. Then, column information can be retrieved from the metastore instead of from deserialization. Currently, for non-native SerDes, column comments are only shown as from deserializer. Adding ColumnarSerDe to the list of native SerDes will persist column comments. See HIVE-2171 for persisting the column comments of custom SerDes. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2246) Dedupe tables' column schemas from partitions in the metastore db
[ https://issues.apache.org/jira/browse/HIVE-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081956#comment-13081956 ] Paul Yang commented on HIVE-2246: +1 - tests passed. Will commit. Dedupe tables' column schemas from partitions in the metastore db Key: HIVE-2246 URL: https://issues.apache.org/jira/browse/HIVE-2246 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sohan Jain Assignee: Sohan Jain Attachments: HIVE-2246.2.patch, HIVE-2246.3.patch, HIVE-2246.4.patch, HIVE-2246.8.patch Note: this patch proposes a schema change, and is therefore incompatible with the current metastore. We can re-organize the JDO models to reduce space usage to keep the metastore scalable for the future. Currently, partitions are the fastest growing objects in the metastore, and the metastore keeps a separate copy of the columns list for each partition. We can normalize the metastore db by decoupling Columns from Storage Descriptors and not storing duplicate lists of the columns for each partition. An idea is to create an additional level of indirection with a Column Descriptor that has a list of columns. A table has a reference to its latest Column Descriptor (note: a table may have more than one Column Descriptor in the case of schema evolution). Partitions and Indexes can reference the same Column Descriptors as their parent table. Currently, the COLUMNS table in the metastore has roughly (number of partitions + number of tables) * (average number of columns per table) rows. We can reduce this to (number of tables) * (average number of columns per table) rows, while incurring a small cost proportional to the number of tables to store the Column Descriptors. Please see the latest review board for additional implementation details.
[jira] [Resolved] (HIVE-2246) Dedupe tables' column schemas from partitions in the metastore db
[ https://issues.apache.org/jira/browse/HIVE-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang resolved HIVE-2246. Resolution: Fixed Fix Version/s: 0.8.0 Release Note: This makes an incompatible change in the metastore DB table schema from previous versions (0.8). Older metastores created with previous versions of Hive will need to be upgraded with the supplied scripts. Committed. Thanks Sohan! Dedupe tables' column schemas from partitions in the metastore db Key: HIVE-2246 URL: https://issues.apache.org/jira/browse/HIVE-2246 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sohan Jain Assignee: Sohan Jain Fix For: 0.8.0 Attachments: HIVE-2246.2.patch, HIVE-2246.3.patch, HIVE-2246.4.patch, HIVE-2246.8.patch Note: this patch proposes a schema change, and is therefore incompatible with the current metastore. We can re-organize the JDO models to reduce space usage to keep the metastore scalable for the future. Currently, partitions are the fastest growing objects in the metastore, and the metastore keeps a separate copy of the columns list for each partition. We can normalize the metastore db by decoupling Columns from Storage Descriptors and not storing duplicate lists of the columns for each partition. An idea is to create an additional level of indirection with a Column Descriptor that has a list of columns. A table has a reference to its latest Column Descriptor (note: a table may have more than one Column Descriptor in the case of schema evolution). Partitions and Indexes can reference the same Column Descriptors as their parent table. Currently, the COLUMNS table in the metastore has roughly (number of partitions + number of tables) * (average number of columns per table) rows. We can reduce this to (number of tables) * (average number of columns per table) rows, while incurring a small cost proportional to the number of tables to store the Column Descriptors.
Please see the latest review board for additional implementation details.
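To give a feel for the row-count estimate in the HIVE-2246 description, here is a small worked sketch. The class name ColumnRowEstimate and the warehouse sizes in main are made-up illustrations, not measurements, and the "after" figure ignores the small per-table cost of the Column Descriptors as well as extra descriptors created by schema evolution.

```java
class ColumnRowEstimate {
    // Rows in COLUMNS when every partition keeps its own copy of the column list:
    // (number of partitions + number of tables) * (average number of columns per table)
    static long before(long tables, long partitions, long avgColumns) {
        return (partitions + tables) * avgColumns;
    }

    // Rows in COLUMNS when partitions share their table's Column Descriptor:
    // (number of tables) * (average number of columns per table)
    static long after(long tables, long avgColumns) {
        return tables * avgColumns;
    }

    public static void main(String[] args) {
        long tables = 10_000;         // hypothetical warehouse size
        long partitions = 2_000_000;  // hypothetical
        long avgColumns = 20;         // hypothetical
        System.out.println("before: " + before(tables, partitions, avgColumns));
        System.out.println("after:  " + after(tables, avgColumns));
    }
}
```

For these assumed sizes the estimate drops from 40,200,000 rows to 200,000 rows, which is why partitions dominate the current cost.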
[jira] [Updated] (HIVE-2319) Calling alter_table after changing partition comment throws an exception
[ https://issues.apache.org/jira/browse/HIVE-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-2319: Resolution: Fixed Fix Version/s: 0.8.0 Status: Resolved (was: Patch Available) Committed. Thanks Sohan! Calling alter_table after changing partition comment throws an exception Key: HIVE-2319 URL: https://issues.apache.org/jira/browse/HIVE-2319 Project: Hive Issue Type: Bug Components: Metastore Reporter: Sohan Jain Assignee: Sohan Jain Fix For: 0.8.0 Attachments: HIVE-2319.2.patch, HIVE-2319.3.patch, HIVE-2319.4.patch Altering a table's partition key comments raises an InvalidOperationException. The partition key name and type should not be mutable, but the comment should be able to get changed. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2322) Add ColumnarSerDe to the list of native SerDes
[ https://issues.apache.org/jira/browse/HIVE-2322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13079500#comment-13079500 ] Paul Yang commented on HIVE-2322: - Can you regenerate this patch? I'm getting some patch failures. Add ColumnarSerDe to the list of native SerDes -- Key: HIVE-2322 URL: https://issues.apache.org/jira/browse/HIVE-2322 Project: Hive Issue Type: Bug Components: Metastore, Serializers/Deserializers Reporter: Sohan Jain Assignee: Sohan Jain Attachments: HIVE-2322.1.patch, HIVE-2322.2.patch We store metadata about ColumnarSerDes in the metastore, so it should be considered a native SerDe. Then, column information can be retrieved from the metastore instead of from deserialization. Currently, for non-native SerDes, column comments are only shown as from deserializer. Adding ColumnarSerDe to the list of native SerDes will persist column comments. See HIVE-2171 for persisting the column comments of custom SerDes. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2319) Calling alter_table after changing partition comment throws an exception
[ https://issues.apache.org/jira/browse/HIVE-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13079505#comment-13079505 ] Paul Yang commented on HIVE-2319: - +1 Will test and commit Calling alter_table after changing partition comment throws an exception Key: HIVE-2319 URL: https://issues.apache.org/jira/browse/HIVE-2319 Project: Hive Issue Type: Bug Components: Metastore Reporter: Sohan Jain Assignee: Sohan Jain Attachments: HIVE-2319.2.patch, HIVE-2319.3.patch, HIVE-2319.4.patch Altering a table's partition key comments raises an InvalidOperationException. The partition key name and type should not be mutable, but the comment should be able to get changed. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2322) Add ColumnarSerDe to the list of native SerDes
[ https://issues.apache.org/jira/browse/HIVE-2322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13072565#comment-13072565 ] Paul Yang commented on HIVE-2322: - +1 Will test and commit Add ColumnarSerDe to the list of native SerDes -- Key: HIVE-2322 URL: https://issues.apache.org/jira/browse/HIVE-2322 Project: Hive Issue Type: Bug Components: Metastore, Serializers/Deserializers Reporter: Sohan Jain Assignee: Sohan Jain Attachments: HIVE-2322.1.patch We store metadata about ColumnarSerDes in the metastore, so it should be considered a native SerDe. Then, column information can be retrieved from the metastore instead of from deserialization. Currently, for non-native SerDes, column comments are only shown as from deserializer. Adding ColumnarSerDe to the list of native SerDes will persist column comments. See HIVE-2171 for persisting the column comments of custom SerDes. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2226) Add API to retrieve table names by an arbitrary filter, e.g., by owner, retention, parameters, etc.
[ https://issues.apache.org/jira/browse/HIVE-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071291#comment-13071291 ] Paul Yang commented on HIVE-2226: Committed. Thanks Sohan! Add API to retrieve table names by an arbitrary filter, e.g., by owner, retention, parameters, etc. Key: HIVE-2226 URL: https://issues.apache.org/jira/browse/HIVE-2226 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sohan Jain Assignee: Sohan Jain Fix For: 0.8.0 Attachments: HIVE-2226.1.patch, HIVE-2226.3.patch, HIVE-2226.4.patch Create a function called get_table_names_by_filter that returns a list of table names in a database that match a certain filter. The filter should operate similarly to the one in HIVE-1609. Initially, you should be able to prune the table list based on owner, retention, or table parameter key/values. The filtering should take place at the JDO level for efficiency/speed.
[jira] [Updated] (HIVE-2226) Add API to retrieve table names by an arbitrary filter, e.g., by owner, retention, parameters, etc.
[ https://issues.apache.org/jira/browse/HIVE-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-2226: Resolution: Fixed Fix Version/s: 0.8.0 Status: Resolved (was: Patch Available) Add API to retrieve table names by an arbitrary filter, e.g., by owner, retention, parameters, etc. Key: HIVE-2226 URL: https://issues.apache.org/jira/browse/HIVE-2226 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sohan Jain Assignee: Sohan Jain Fix For: 0.8.0 Attachments: HIVE-2226.1.patch, HIVE-2226.3.patch, HIVE-2226.4.patch Create a function called get_table_names_by_filter that returns a list of table names in a database that match a certain filter. The filter should operate similarly to the one in HIVE-1609. Initially, you should be able to prune the table list based on owner, retention, or table parameter key/values. The filtering should take place at the JDO level for efficiency/speed.
[jira] [Updated] (HIVE-2309) Incorrect regular expression for extracting task id from filename
[ https://issues.apache.org/jira/browse/HIVE-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-2309: Attachment: HIVE-2309.1.patch Incorrect regular expression for extracting task id from filename Key: HIVE-2309 URL: https://issues.apache.org/jira/browse/HIVE-2309 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.7.1 Reporter: Paul Yang Priority: Minor Attachments: HIVE-2309.1.patch For producing the correct filenames for bucketed tables, there is a method in Utilities.java that extracts the task id from the filename and replaces it with the bucket number. There is a bug in the regex that is used to extract this value for attempt numbers >= 10 (shown here with Python's re module):
{code}
import re

# Java-style pattern; the doubled backslash yields a literal "\." in the regex.
pattern = "^.*?([0-9]+)(_[0-9])?(\\..*)?$"
re.match(pattern, 'attempt_201107090429_64965_m_001210_10').group(1)  # '10' (the attempt number, not the task id)
re.match(pattern, 'attempt_201107090429_64965_m_001210_9').group(1)   # '001210' (the task id, as intended)
{code}
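Since the method in question lives in Utilities.java, the same behaviour can be checked with java.util.regex. This is a sketch only: the class TaskIdRegexDemo and its helper are hypothetical, and the ADJUSTED pattern is one plausible change (allowing a multi-digit attempt suffix), not necessarily what the attached patches do.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TaskIdRegexDemo {
    // Pattern quoted in the report: (_[0-9])? accepts only a single-digit attempt suffix.
    static final Pattern REPORTED = Pattern.compile("^.*?([0-9]+)(_[0-9])?(\\..*)?$");

    // Hypothetical adjustment (an assumption): accept one or more digits in the suffix.
    static final Pattern ADJUSTED = Pattern.compile("^.*?([0-9]+)(_[0-9]+)?(\\..*)?$");

    // Returns capture group 1, which is meant to be the task id.
    static String taskId(Pattern pattern, String filename) {
        Matcher m = pattern.matcher(filename);
        if (!m.matches()) {
            throw new IllegalArgumentException("no task id found in " + filename);
        }
        return m.group(1);
    }

    public static void main(String[] args) {
        String attempt10 = "attempt_201107090429_64965_m_001210_10";
        String attempt9 = "attempt_201107090429_64965_m_001210_9";
        System.out.println(taskId(REPORTED, attempt10)); // 10
        System.out.println(taskId(REPORTED, attempt9));  // 001210
        System.out.println(taskId(ADJUSTED, attempt10)); // 001210
    }
}
```

The reported pattern fails for two-digit attempts because the optional `_[0-9]` group cannot consume `_10`, so the lazy prefix grows until the first capture group lands on the attempt number instead.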
[jira] [Updated] (HIVE-2309) Incorrect regular expression for extracting task id from filename
[ https://issues.apache.org/jira/browse/HIVE-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-2309: Attachment: HIVE-2309.2.patch Incorrect regular expression for extracting task id from filename Key: HIVE-2309 URL: https://issues.apache.org/jira/browse/HIVE-2309 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.7.1 Reporter: Paul Yang Assignee: Paul Yang Priority: Minor Attachments: HIVE-2309.1.patch, HIVE-2309.2.patch For producing the correct filenames for bucketed tables, there is a method in Utilities.java that extracts the task id from the filename and replaces it with the bucket number. There is a bug in the regex that is used to extract this value for attempt numbers >= 10 (shown here with Python's re module):
{code}
import re

# Java-style pattern; the doubled backslash yields a literal "\." in the regex.
pattern = "^.*?([0-9]+)(_[0-9])?(\\..*)?$"
re.match(pattern, 'attempt_201107090429_64965_m_001210_10').group(1)  # '10' (the attempt number, not the task id)
re.match(pattern, 'attempt_201107090429_64965_m_001210_9').group(1)   # '001210' (the task id, as intended)
{code}
[jira] [Commented] (HIVE-2226) Add API to retrieve table names by an arbitrary filter, e.g., by owner, retention, parameters, etc.
[ https://issues.apache.org/jira/browse/HIVE-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070875#comment-13070875 ] Paul Yang commented on HIVE-2226: +1 Will test and commit Add API to retrieve table names by an arbitrary filter, e.g., by owner, retention, parameters, etc. Key: HIVE-2226 URL: https://issues.apache.org/jira/browse/HIVE-2226 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sohan Jain Assignee: Sohan Jain Attachments: HIVE-2226.1.patch, HIVE-2226.3.patch Create a function called get_table_names_by_filter that returns a list of table names in a database that match a certain filter. The filter should operate similarly to the one in HIVE-1609. Initially, you should be able to prune the table list based on owner, retention, or table parameter key/values. The filtering should take place at the JDO level for efficiency/speed.
[jira] [Created] (HIVE-2301) Throw error when attempting to create a column with the same name as a partition column
Throw error when attempting to create a column with the same name as a partition column Key: HIVE-2301 URL: https://issues.apache.org/jira/browse/HIVE-2301 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.8.0 Reporter: Paul Yang Priority: Minor If an ALTER TABLE gives a regular column the same name as a partition column (for example via REPLACE COLUMNS), the alter succeeds. However, subsequent operations on that table fail.
{code}
hive> create table tmp_pyang_test (key string) partitioned by (ds string);
OK
Time taken: 4.773 seconds
hive> alter table tmp_pyang_test replace columns (ds string);
OK
Time taken: 1.254 seconds
hive> describe tmp_pyang_test;
FAILED: Error in metadata: Partition column name ds conflicts with table columns.
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
hive>
{code}
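The check requested in HIVE-2301 amounts to comparing the new column list against the partition columns before the alter is applied. Below is a rough sketch under stated assumptions: PartitionColumnCheck and firstConflict are hypothetical names that do not come from the Hive code base, and the comparison is case-insensitive on the assumption that Hive treats column names that way.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class PartitionColumnCheck {
    // Returns the first new column whose name collides with a partition column,
    // or null when there is no conflict.
    static String firstConflict(List<String> newColumns, List<String> partitionColumns) {
        Set<String> partitionNames = new HashSet<>();
        for (String p : partitionColumns) {
            partitionNames.add(p.toLowerCase(Locale.ROOT));
        }
        for (String c : newColumns) {
            if (partitionNames.contains(c.toLowerCase(Locale.ROOT))) {
                return c;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Mirrors the session above: replacing the columns with (ds string) on a table
        // partitioned by ds should be rejected up front.
        String conflict = firstConflict(List.of("ds"), List.of("ds"));
        System.out.println(conflict == null ? "ok" : "conflict on column " + conflict);
    }
}
```

Running such a check at alter time would surface the error immediately, rather than on a later describe.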
[jira] [Updated] (HIVE-2224) Ability to add partitions atomically
[ https://issues.apache.org/jira/browse/HIVE-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-2224: Summary: Ability to add partitions atomically (was: Ability to add_partitions, and atomically) Ability to add partitions atomically Key: HIVE-2224 URL: https://issues.apache.org/jira/browse/HIVE-2224 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-2224.patch I'd like to see an atomic version of the add_partitions() call. Whether this is to be done by config to affect add_partitions() behaviour (not my preference) or just changing add_partitions() default behaviour (my preference, but likely to affect current behaviour, so will need others' input) or by making a new add_partitions_atomic() call depends on discussion. This looks relatively doable to implement (will need a dependent add_partition_core to not do a ms.commit_partition() early, and to cache list of directories created to remove on rollback, and a list of AddPartitionEvent to trigger in one shot later) Thoughts? This also seems like something to implement for allowing HIVE-1805. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
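The all-or-nothing behaviour discussed in HIVE-2224 (defer the commit, remember created directories for rollback, fire the events in one shot) can be illustrated with in-memory stand-ins. This is a sketch only: AtomicAddSketch, its fields, and the rule that names containing "bad" fail are all hypothetical, and nothing here is the metastore's real API.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class AtomicAddSketch {
    // In-memory stand-ins for metastore rows, filesystem directories and listener events.
    final List<String> committed = new ArrayList<>();
    final Set<String> directories = new LinkedHashSet<>();
    final List<String> firedEvents = new ArrayList<>();

    // Adds every partition or none. Names containing "bad" simulate a failing add.
    boolean addPartitionsAtomic(List<String> partitions) {
        List<String> createdDirs = new ArrayList<>();   // only directories this call created
        List<String> staged = new ArrayList<>();        // partitions not yet committed
        List<String> pendingEvents = new ArrayList<>(); // events held back until commit
        try {
            for (String p : partitions) {
                if (p.contains("bad")) {
                    throw new IllegalArgumentException("cannot add partition " + p);
                }
                if (directories.add(p)) {
                    createdDirs.add(p);
                }
                staged.add(p);
                pendingEvents.add("AddPartitionEvent(" + p + ")");
            }
        } catch (IllegalArgumentException e) {
            directories.removeAll(createdDirs); // roll back filesystem side effects
            return false;
        }
        committed.addAll(staged);          // single commit at the end
        firedEvents.addAll(pendingEvents); // events triggered in one shot
        return true;
    }

    public static void main(String[] args) {
        AtomicAddSketch store = new AtomicAddSketch();
        System.out.println(store.addPartitionsAtomic(List.of("ds=1", "ds=2")));
        System.out.println(store.addPartitionsAtomic(List.of("ds=3", "bad")));
        System.out.println(store.committed);
    }
}
```

Tracking only the directories created by the current call keeps the rollback from deleting directories that existed beforehand.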
[jira] [Commented] (HIVE-2224) Ability to add partitions atomically
[ https://issues.apache.org/jira/browse/HIVE-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13068536#comment-13068536 ] Paul Yang commented on HIVE-2224: - Seems like it was an issue with the machine. But it has been committed - thanks Sushanth! Ability to add partitions atomically Key: HIVE-2224 URL: https://issues.apache.org/jira/browse/HIVE-2224 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-2224.patch I'd like to see an atomic version of the add_partitions() call. Whether this is to be done by config to affect add_partitions() behaviour (not my preference) or just changing add_partitions() default behaviour (my preference, but likely to affect current behaviour, so will need others' input) or by making a new add_partitions_atomic() call depends on discussion. This looks relatively doable to implement (will need a dependent add_partition_core to not do a ms.commit_partition() early, and to cache list of directories created to remove on rollback, and a list of AddPartitionEvent to trigger in one shot later) Thoughts? This also seems like something to implement for allowing HIVE-1805. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2224) Ability to add partitions atomically
[ https://issues.apache.org/jira/browse/HIVE-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-2224: Resolution: Fixed Fix Version/s: 0.8.0 Status: Resolved (was: Patch Available) Ability to add partitions atomically Key: HIVE-2224 URL: https://issues.apache.org/jira/browse/HIVE-2224 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Fix For: 0.8.0 Attachments: HIVE-2224.patch I'd like to see an atomic version of the add_partitions() call. Whether this is to be done by config to affect add_partitions() behaviour (not my preference) or just changing add_partitions() default behaviour (my preference, but likely to affect current behaviour, so will need others' input) or by making a new add_partitions_atomic() call depends on discussion. This looks relatively doable to implement (will need a dependent add_partition_core to not do a ms.commit_partition() early, and to cache list of directories created to remove on rollback, and a list of AddPartitionEvent to trigger in one shot later) Thoughts? This also seems like something to implement for allowing HIVE-1805. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2224) Ability to add_partitions, and atomically
[ https://issues.apache.org/jira/browse/HIVE-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067209#comment-13067209 ] Paul Yang commented on HIVE-2224: Sorry for the delay, but I've been running into some test issues that are likely not caused by your patch. Ability to add_partitions, and atomically Key: HIVE-2224 URL: https://issues.apache.org/jira/browse/HIVE-2224 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-2224.patch I'd like to see an atomic version of the add_partitions() call. Whether this is to be done by config to affect add_partitions() behaviour (not my preference) or just changing add_partitions() default behaviour (my preference, but likely to affect current behaviour, so will need others' input) or by making a new add_partitions_atomic() call depends on discussion. This looks relatively doable to implement (will need a dependent add_partition_core to not do a ms.commit_partition() early, and to cache list of directories created to remove on rollback, and a list of AddPartitionEvent to trigger in one shot later) Thoughts? This also seems like something to implement for allowing HIVE-1805.
[jira] [Commented] (HIVE-2224) Ability to add_partitions, and atomically
[ https://issues.apache.org/jira/browse/HIVE-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065625#comment-13065625 ] Paul Yang commented on HIVE-2224: +1 Will test and commit. Ability to add_partitions, and atomically Key: HIVE-2224 URL: https://issues.apache.org/jira/browse/HIVE-2224 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-2224.patch I'd like to see an atomic version of the add_partitions() call. Whether this is to be done by config to affect add_partitions() behaviour (not my preference) or just changing add_partitions() default behaviour (my preference, but likely to affect current behaviour, so will need others' input) or by making a new add_partitions_atomic() call depends on discussion. This looks relatively doable to implement (will need a dependent add_partition_core to not do a ms.commit_partition() early, and to cache list of directories created to remove on rollback, and a list of AddPartitionEvent to trigger in one shot later) Thoughts? This also seems like something to implement for allowing HIVE-1805.
[jira] [Resolved] (HIVE-2194) Add actions for alter table and alter partition events for metastore event listeners
[ https://issues.apache.org/jira/browse/HIVE-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang resolved HIVE-2194. Resolution: Fixed Fix Version/s: 0.8.0 Committed. Thanks Sohan! Add actions for alter table and alter partition events for metastore event listeners Key: HIVE-2194 URL: https://issues.apache.org/jira/browse/HIVE-2194 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sohan Jain Assignee: Sohan Jain Fix For: 0.8.0 Attachments: HIVE-2194.1.patch, HIVE-2194.3.patch HIVE-2038 introduced the MetaStoreEventListener abstract class that defines actions to be performed after particular events on a metastore. Improve upon that class by adding events to be performed on alter table and alter partition actions.
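The listener idea in the HIVE-2194 description above can be pictured with a small Python analogue. This is a sketch only: Hive's MetaStoreEventListener is a Java abstract class, and the method names `on_alter_table` / `on_alter_partition`, the dictionary-shaped tables, and the `AuditListener` subclass here are hypothetical stand-ins chosen for the example.

```python
class MetaStoreEventListener:
    """Base class with no-op hooks; subclasses override what they need."""

    def on_alter_table(self, old_table, new_table):
        pass

    def on_alter_partition(self, old_partition, new_partition):
        pass


class AuditListener(MetaStoreEventListener):
    """Hypothetical listener that records every alter event it sees."""

    def __init__(self):
        self.log = []

    def on_alter_table(self, old_table, new_table):
        self.log.append(("alter_table", old_table["name"], new_table["name"]))

    def on_alter_partition(self, old_partition, new_partition):
        self.log.append(
            ("alter_partition", old_partition["location"], new_partition["location"])
        )


def alter_table(tables, name, new_table, listeners):
    """Apply the change first, then notify listeners (events fire after the action)."""
    old_table = tables.pop(name)
    tables[new_table["name"]] = new_table
    for listener in listeners:
        listener.on_alter_table(old_table, new_table)
```

Passing both the old and the new object lets a listener work out what changed without querying the metastore again.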
[jira] [Commented] (HIVE-2219) Make alter table drop partition more efficient
[ https://issues.apache.org/jira/browse/HIVE-2219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062179#comment-13062179 ] Paul Yang commented on HIVE-2219: I likely mixed up the RB and JIRA versions - looking at HIVE-2275 now. Make alter table drop partition more efficient Key: HIVE-2219 URL: https://issues.apache.org/jira/browse/HIVE-2219 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Sohan Jain Assignee: Sohan Jain Fix For: 0.8.0 Attachments: HIVE-2219.1.patch, HIVE-2219.2.patch The current function dropTable() that handles dropping multiple partitions is somewhat inefficient. For each partition you want to drop, it loops through each partition in the table to see if the partition exists. This is an _O(mn)_ operation, where _m_ is the number of partitions to drop, and _n_ is the number of partitions in the table. The running time of this function can be improved, which is useful for tables with many partitions.
[jira] [Commented] (HIVE-2194) Add actions for alter table and alter partition events for metastore event listeners
[ https://issues.apache.org/jira/browse/HIVE-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062239#comment-13062239 ] Paul Yang commented on HIVE-2194: Sohan just mentioned that there was a mismatch between the RB and JIRA versions for this one too. This will require another patch. Add actions for alter table and alter partition events for metastore event listeners Key: HIVE-2194 URL: https://issues.apache.org/jira/browse/HIVE-2194 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sohan Jain Assignee: Sohan Jain Fix For: 0.8.0 Attachments: HIVE-2194.1.patch, HIVE-2194.3.patch, HIVE-2194.4.patch HIVE-2038 introduced the MetaStoreEventListener abstract class that defines actions to be performed after particular events on a metastore. Improve upon that class by adding events to be performed on alter table and alter partition actions.
[jira] [Commented] (HIVE-2194) Add actions for alter table and alter partition events for metastore event listeners
[ https://issues.apache.org/jira/browse/HIVE-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060720#comment-13060720 ] Paul Yang commented on HIVE-2194: +1 Will test and commit. Add actions for alter table and alter partition events for metastore event listeners Key: HIVE-2194 URL: https://issues.apache.org/jira/browse/HIVE-2194 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sohan Jain Assignee: Sohan Jain Attachments: HIVE-2194.1.patch, HIVE-2194.3.patch HIVE-2038 introduced the MetaStoreEventListener abstract class that defines actions to be performed after particular events on a metastore. Improve upon that class by adding events to be performed on alter table and alter partition actions.
[jira] [Commented] (HIVE-2219) Make alter table drop partition more efficient
[ https://issues.apache.org/jira/browse/HIVE-2219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13056180#comment-13056180 ] Paul Yang commented on HIVE-2219: +1 Will test and commit. Make alter table drop partition more efficient Key: HIVE-2219 URL: https://issues.apache.org/jira/browse/HIVE-2219 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Sohan Jain Assignee: Sohan Jain Attachments: HIVE-2219.1.patch The current function dropTable() that handles dropping multiple partitions is somewhat inefficient. For each partition you want to drop, it loops through each partition in the table to see if the partition exists. This is an _O(mn)_ operation, where _m_ is the number of partitions to drop, and _n_ is the number of partitions in the table. The running time of this function can be improved, which is useful for tables with many partitions.
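To make the _O(mn)_ cost in the HIVE-2219 description above concrete, here is a small Python comparison of the nested scan with a set-based lookup. It is only an illustration of the complexity argument; the function names are hypothetical, and the attached patches may well take a different route (for example, fetching each partition directly from the metastore).

```python
def partitions_to_drop_quadratic(existing, requested):
    """O(m*n): scan the whole partition list once per requested partition."""
    found = []
    for wanted in requested:
        for part in existing:
            if part == wanted:
                found.append(wanted)
                break
    return found


def partitions_to_drop_hashed(existing, requested):
    """O(m+n) on average: build a set once, then do constant-time lookups."""
    existing_set = set(existing)
    return [wanted for wanted in requested if wanted in existing_set]
```

Both functions return the same list; only the number of comparisons differs, which matters once a table has many thousands of partitions.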
[jira] [Updated] (HIVE-2213) Optimize partial specification metastore functions
[ https://issues.apache.org/jira/browse/HIVE-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-2213: Summary: Optimize partial specification metastore functions (was: Optimize get_partition_names_ps()) Optimize partial specification metastore functions Key: HIVE-2213 URL: https://issues.apache.org/jira/browse/HIVE-2213 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sohan Jain Assignee: Sohan Jain Attachments: HIVE-2213.1.patch, HIVE-2213.3.patch If a table has a large number of partitions, get_partition_names_ps() may take a long time to execute, because we get all of the partition names from the database. This is not very memory efficient, and the operation can be pushed down to the JDO layer without getting all of the names first.
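One way to picture the push-down mentioned in the HIVE-2213 description above is to turn the partial specification into a single name pattern, so that the filter becomes one predicate a storage layer could evaluate instead of fetching every name first. The Python sketch below only builds and applies such a pattern in memory; `part_name_pattern`, the `key=value/key=value` name layout, and the use of an empty string for an unspecified value are assumptions for illustration, not a description of the patch.

```python
import re


def part_name_pattern(keys, partial_values):
    """Build a regex for partition names; empty or missing values match anything."""
    parts = []
    for index, key in enumerate(keys):
        value = partial_values[index] if index < len(partial_values) else ""
        # A specified value must match literally; an unspecified one is a wildcard
        # that stops at the next path separator.
        matcher = re.escape(value) if value else "[^/]*"
        parts.append(re.escape(key) + "=" + matcher)
    return "/".join(parts)
```

For a table partitioned by `ds` and `hr`, the partial specification `["2011-06-01"]` yields a pattern that accepts every `hr` under that `ds` and nothing else.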
[jira] [Updated] (HIVE-2213) Optimize partial specification metastore functions
[ https://issues.apache.org/jira/browse/HIVE-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-2213: Resolution: Fixed Fix Version/s: 0.8.0 Status: Resolved (was: Patch Available) Committed. Thanks Sohan! Optimize partial specification metastore functions Key: HIVE-2213 URL: https://issues.apache.org/jira/browse/HIVE-2213 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sohan Jain Assignee: Sohan Jain Fix For: 0.8.0 Attachments: HIVE-2213.1.patch, HIVE-2213.3.patch If a table has a large number of partitions, get_partition_names_ps() may take a long time to execute, because we get all of the partition names from the database. This is not very memory efficient, and the operation can be pushed down to the JDO layer without getting all of the names first.
[jira] [Commented] (HIVE-2213) Optimize get_partition_names_ps()
[ https://issues.apache.org/jira/browse/HIVE-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051419#comment-13051419 ] Paul Yang commented on HIVE-2213: If get_partitions_ps_with_auth() was not correct before, then we should fix the method to produce the correct behavior. Ideally, it should have been done in a separate JIRA, but it should be okay to include in this one. +1 looks good though, will test and commit. Optimize get_partition_names_ps() Key: HIVE-2213 URL: https://issues.apache.org/jira/browse/HIVE-2213 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sohan Jain Assignee: Sohan Jain Attachments: HIVE-2213.1.patch, HIVE-2213.3.patch If a table has a large number of partitions, get_partition_names_ps() may take a long time to execute, because we get all of the partition names from the database. This is not very memory efficient, and the operation can be pushed down to the JDO layer without getting all of the names first.
[jira] [Commented] (HIVE-2213) Optimize get_partition_names_ps()
[ https://issues.apache.org/jira/browse/HIVE-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050586#comment-13050586 ] Paul Yang commented on HIVE-2213: Looks good, but can you do a minor update to fix lines longer than 100 chars? Optimize get_partition_names_ps() Key: HIVE-2213 URL: https://issues.apache.org/jira/browse/HIVE-2213 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sohan Jain Assignee: Sohan Jain Attachments: HIVE-2213.1.patch If a table has a large number of partitions, get_partition_names_ps() may take a long time to execute, because we get all of the partition names from the database. This is not very memory efficient, and the operation can be pushed down to the JDO layer without getting all of the names first.
[jira] [Commented] (HIVE-2219) Make alter table drop partition more efficient
[ https://issues.apache.org/jira/browse/HIVE-2219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050847#comment-13050847 ] Paul Yang commented on HIVE-2219: Can you make a reviewboard instance? Make alter table drop partition more efficient Key: HIVE-2219 URL: https://issues.apache.org/jira/browse/HIVE-2219 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Sohan Jain Assignee: Sohan Jain Attachments: HIVE-2219.1.patch The current function dropTable() that handles dropping multiple partitions is somewhat inefficient. For each partition you want to drop, it loops through each partition in the table to see if the partition exists. This is an _O(mn)_ operation, where _m_ is the number of partitions to drop, and _n_ is the number of partitions in the table. The running time of this function can be improved, which is useful for tables with many partitions.
[jira] [Commented] (HIVE-2147) Add api to send / receive message to metastore
[ https://issues.apache.org/jira/browse/HIVE-2147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13047468#comment-13047468 ] Paul Yang commented on HIVE-2147: I agree with John's suggestion for PARTITION_EVENTS. For this event table, when will rows be dropped? Also, when partitions are represented as a string, we've followed the convention of calling them partition names. Can we use that for MPartitionSet? Since MPartitionSet.partVals is a string, we should make it indexed, much like partitionName for the PARTITION table. Add api to send / receive message to metastore Key: HIVE-2147 URL: https://issues.apache.org/jira/browse/HIVE-2147 Project: Hive Issue Type: Improvement Components: Metastore Affects Versions: 0.8.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 0.8.0 Attachments: api-without-thrift.patch, hive_2147-2.patch This is follow-up work on HIVE-2038.
[jira] [Updated] (HIVE-1595) job name for alter table T archive partition P is not correct
[ https://issues.apache.org/jira/browse/HIVE-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-1595: Resolution: Fixed Status: Resolved (was: Patch Available) Committed. Thanks Sohan! job name for alter table T archive partition P is not correct Key: HIVE-1595 URL: https://issues.apache.org/jira/browse/HIVE-1595 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Namit Jain Assignee: Sohan Jain Attachments: Hive-1595.1.patch, Hive-1595.2.patch For some internal runs, I saw the job name as hadoop-0.20.1-tools.jar, which makes it difficult to identify
[jira] [Commented] (HIVE-1595) job name for alter table T archive partition P is not correct
[ https://issues.apache.org/jira/browse/HIVE-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045134#comment-13045134 ] Paul Yang commented on HIVE-1595: +1 Will test and commit. job name for alter table T archive partition P is not correct Key: HIVE-1595 URL: https://issues.apache.org/jira/browse/HIVE-1595 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Namit Jain Assignee: Sohan Jain Attachments: Hive-1595.1.patch, Hive-1595.2.patch For some internal runs, I saw the job name as hadoop-0.20.1-tools.jar, which makes it difficult to identify
[jira] [Commented] (HIVE-2029) MetaStore ConnectionURL updates need to trigger creation of Default DB if it doesn't exist
[ https://issues.apache.org/jira/browse/HIVE-2029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044032#comment-13044032 ] Paul Yang commented on HIVE-2029: Can you elaborate on how this retry feature works in datanucleus 3.0? The case that could be handled with the URL hook is as follows - a db host goes down. A failover is performed and a replica on a different host is promoted to be the new master. Using the hook, the client is able to re-execute the query on the new host and the Hive query succeeds without failure. Would it be possible to implement something similar in datanucleus 3.0? MetaStore ConnectionURL updates need to trigger creation of Default DB if it doesn't exist Key: HIVE-2029 URL: https://issues.apache.org/jira/browse/HIVE-2029 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.7.0 Reporter: Carl Steinbach Attachments: hive_2029.patch HIVE-1219 defined the JDOConnectionURLHook plugin, and integrated this feature into HiveMetaStore. On MetaStore operation failures, this plugin is used to update the metastore ConnectionURL configuration property. Currently this update triggers the reinitialization of the underlying JDO PersistenceManager, but it does not trigger checks to see if the default database exists, nor will it create the default database if it does not exist. It needs to do both. This ticket also covers removing the 'hive.metastore.force.reload.conf' property from HiveConf and HiveMetaStore. This property should not have been added in the first place since its sole purpose is to facilitate testing of the JDOConnectionURLHook mechanism by unnaturally forcing reinitialization of the PersistenceManager.
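The failover case described in the HIVE-2029 comment above (a host goes down, a replica is promoted, and the client re-executes against the new host) amounts to a retry loop that asks a hook for a fresh connection URL. The Python sketch below shows that shape only; `run_with_url_hook`, `next_url`, and the URLs are hypothetical and are not the JDOConnectionURLHook interface itself.

```python
def run_with_url_hook(operation, url, next_url, max_retries=1):
    """Run operation(url); on a connection failure, ask the hook for a new URL and retry."""
    attempts = 0
    while True:
        try:
            return operation(url)
        except ConnectionError:
            if attempts >= max_retries:
                raise  # out of retries: surface the original failure
            attempts += 1
            url = next_url(url)  # hook returns the URL of the promoted replica
```

From the caller's point of view the operation simply succeeds, provided the hook can name a healthy host within the retry budget.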
[jira] [Updated] (HIVE-1595) job name for alter table T archive partition P is not correct
[ https://issues.apache.org/jira/browse/HIVE-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-1595: Status: Open (was: Patch Available) Looks good, but can you remove the changes to readme? job name for alter table T archive partition P is not correct Key: HIVE-1595 URL: https://issues.apache.org/jira/browse/HIVE-1595 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Namit Jain Assignee: Sohan Jain Attachments: Hive-1595.1.patch For some internal runs, I saw the job name as hadoop-0.20.1-tools.jar, which makes it difficult to identify
[jira] [Assigned] (HIVE-1595) job name for alter table T archive partition P is not correct
[ https://issues.apache.org/jira/browse/HIVE-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang reassigned HIVE-1595: Assignee: Sohan Jain (was: Paul Yang) job name for alter table T archive partition P is not correct Key: HIVE-1595 URL: https://issues.apache.org/jira/browse/HIVE-1595 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Namit Jain Assignee: Sohan Jain For some internal runs, I saw the job name as hadoop-0.20.1-tools.jar, which makes it difficult to identify
[jira] [Updated] (HIVE-2153) Stats JDBC LIKE queries should escape '_' and '%'
[ https://issues.apache.org/jira/browse/HIVE-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-2153: Resolution: Fixed Fix Version/s: 0.8.0 Status: Resolved (was: Patch Available) Committed. Thanks Ning! Stats JDBC LIKE queries should escape '_' and '%' Key: HIVE-2153 URL: https://issues.apache.org/jira/browse/HIVE-2153 Project: Hive Issue Type: Bug Reporter: Ning Zhang Assignee: Ning Zhang Fix For: 0.8.0 Attachments: HIVE-2153.2.patch, HIVE-2153.patch DELETE /* Hive stats aggregation: org.apache.hadoop.hive.ql.stats.jdbc.JDBCStatsAggregator */ FROM PARTITION_STAT_TBL WHERE ID LIKE 'hdfs://dfsnode:9000/tmp/hive-root/hive_2011-05-09_04-30-28_586_4184342157898880918/-ext-1/ds=2011-05-08/table_name=dim_page_to_user_suggest_assoc/%' It is a prefix query but the '_' in the ID column should be escaped. The same applies to '%' if they appear in ID.
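The problem in the HIVE-2153 description above is that `_` and `%` are wildcards in a LIKE pattern, so an unescaped prefix can match, and delete, rows for other IDs. A minimal Python sketch of the escaping follows. It uses SQLite purely so the example is runnable; the backslash escape character, the helper names, and the one-column table are assumptions, and the databases used for Hive stats may need a different escape syntax.

```python
import sqlite3


def escape_like(value, escape_char="\\"):
    """Escape LIKE wildcards (and the escape character itself) so they match literally."""
    return (
        value.replace(escape_char, escape_char * 2)
        .replace("%", escape_char + "%")
        .replace("_", escape_char + "_")
    )


def delete_by_prefix(conn, prefix):
    """Delete rows whose ID starts with prefix, treating prefix as literal text."""
    pattern = escape_like(prefix) + "%"  # only the trailing % is a wildcard
    cursor = conn.execute(
        "DELETE FROM PARTITION_STAT_TBL WHERE ID LIKE ? ESCAPE '\\'", (pattern,)
    )
    return cursor.rowcount
```

With the escape in place, a prefix containing `hive_1` no longer matches an ID containing `hiveX1`.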
[jira] [Commented] (HIVE-2153) Stats JDBC LIKE queries should escape '_' and '%'
[ https://issues.apache.org/jira/browse/HIVE-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030981#comment-13030981 ] Paul Yang commented on HIVE-2153: +1 Will test and commit. Stats JDBC LIKE queries should escape '_' and '%' Key: HIVE-2153 URL: https://issues.apache.org/jira/browse/HIVE-2153 Project: Hive Issue Type: Bug Reporter: Ning Zhang Assignee: Ning Zhang Attachments: HIVE-2153.2.patch, HIVE-2153.patch DELETE /* Hive stats aggregation: org.apache.hadoop.hive.ql.stats.jdbc.JDBCStatsAggregator */ FROM PARTITION_STAT_TBL WHERE ID LIKE 'hdfs://dfsnode:9000/tmp/hive-root/hive_2011-05-09_04-30-28_586_4184342157898880918/-ext-1/ds=2011-05-08/table_name=dim_page_to_user_suggest_assoc/%' It is a prefix query but the '_' in the ID column should be escaped. The same applies to '%' if they appear in ID.
[jira] Updated: (HIVE-2028) Performance instruments for client side execution
[ https://issues.apache.org/jira/browse/HIVE-2028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-2028: Resolution: Fixed Fix Version/s: 0.8.0 Status: Resolved (was: Patch Available) Committed. Thanks Ning! Performance instruments for client side execution Key: HIVE-2028 URL: https://issues.apache.org/jira/browse/HIVE-2028 Project: Hive Issue Type: Improvement Reporter: Ning Zhang Assignee: Ning Zhang Fix For: 0.8.0 Attachments: HIVE-2028.2.patch, HIVE-2028.3.patch, HIVE-2028.patch Hive client side execution could sometimes take a long time. This task is to instrument the client side code to measure the time spent in the most likely expensive components.
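The instrumentation described in HIVE-2028 above boils down to recording a start time when a component begins and logging the elapsed time when it ends. A rough Python sketch of that begin/end pattern follows; the class and method names are hypothetical and it does not reproduce the log format of the actual patch.

```python
import time


class PerfLogger:
    """Hypothetical begin/end timer for named client-side phases."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so the timer can be tested
        self._starts = {}        # phase name -> start timestamp
        self.durations = {}      # phase name -> elapsed seconds

    def begin(self, method):
        self._starts[method] = self._clock()

    def end(self, method):
        start = self._starts.pop(method)  # KeyError if begin() was never called
        elapsed = self._clock() - start
        self.durations[method] = elapsed
        return elapsed
```

Wrapping phases such as compilation or plan generation in `begin`/`end` pairs gives a per-phase breakdown of where client-side time goes.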
[jira] Commented: (HIVE-2061) Create a hive_contrib.jar symlink to hive-contrib-{version}.jar for backward compatibility
[ https://issues.apache.org/jira/browse/HIVE-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008363#comment-13008363 ] Paul Yang commented on HIVE-2061: Looks good, will test and commit. Create a hive_contrib.jar symlink to hive-contrib-{version}.jar for backward compatibility Key: HIVE-2061 URL: https://issues.apache.org/jira/browse/HIVE-2061 Project: Hive Issue Type: Bug Reporter: Ning Zhang Assignee: Ning Zhang Priority: Minor Attachments: HIVE-2061.patch We have seen a use case where the user's script runs 'add jar hive_contrib.jar'. Since Hive has moved the jar file to hive-contrib-{version}.jar, this introduced a backward incompatibility. If we ask the user to change the script, the user will need to change it again whenever Hive is upgraded to a new version. Creating a symlink seems to be the best solution.
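The symlink approach from the HIVE-2061 description above can be illustrated with a few lines of Python. This is a sketch under assumptions: the real change presumably lives in Hive's build or packaging scripts, `link_unversioned_jar` and the version numbers in the usage below are made up for the example, and it assumes a POSIX filesystem where symlinks are available.

```python
import os
import tempfile


def link_unversioned_jar(lib_dir, versioned_name, link_name="hive_contrib.jar"):
    """Point a stable, unversioned name at the current versioned jar."""
    link_path = os.path.join(lib_dir, link_name)
    if os.path.islink(link_path):
        os.remove(link_path)  # replace the link left by a previous version
    # A relative target keeps the link valid if lib_dir itself is moved.
    os.symlink(versioned_name, link_path)
    return link_path
```

User scripts keep running 'add jar hive_contrib.jar' unchanged; only the link target moves on an upgrade.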
[jira] Commented: (HIVE-2061) Create a hive_contrib.jar symlink to hive-contrib-{version}.jar for backward compatibility
[ https://issues.apache.org/jira/browse/HIVE-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008585#comment-13008585 ] Paul Yang commented on HIVE-2061: Committed. Thanks Ning! Create a hive_contrib.jar symlink to hive-contrib-{version}.jar for backward compatibility Key: HIVE-2061 URL: https://issues.apache.org/jira/browse/HIVE-2061 Project: Hive Issue Type: Bug Reporter: Ning Zhang Assignee: Ning Zhang Priority: Minor Attachments: HIVE-2061.patch We have seen a use case where the user's script runs 'add jar hive_contrib.jar'. Since Hive has moved the jar file to hive-contrib-{version}.jar, this introduced a backward incompatibility. If we ask the user to change the script, the user will need to change it again whenever Hive is upgraded to a new version. Creating a symlink seems to be the best solution.
[jira] Commented: (HIVE-2028) Performance instruments for client side execution
[ https://issues.apache.org/jira/browse/HIVE-2028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007625#comment-13007625 ] Paul Yang commented on HIVE-2028: In PerfLogEnd(): {code} sb.append(/); {code} Shouldn't this be a since this is a close tag? Performance instruments for client side execution Key: HIVE-2028 URL: https://issues.apache.org/jira/browse/HIVE-2028 Project: Hive Issue Type: Improvement Reporter: Ning Zhang Assignee: Ning Zhang Attachments: HIVE-2028.2.patch, HIVE-2028.patch Hive client side execution could sometimes take a long time. This task is to instrument the client side code to measure the time spent in the most likely expensive components.
[jira] Commented: (HIVE-2028) Performance instruments for client side execution
[ https://issues.apache.org/jira/browse/HIVE-2028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007759#comment-13007759 ] Paul Yang commented on HIVE-2028: +1 Will test and commit. Performance instruments for client side execution Key: HIVE-2028 URL: https://issues.apache.org/jira/browse/HIVE-2028 Project: Hive Issue Type: Improvement Reporter: Ning Zhang Assignee: Ning Zhang Attachments: HIVE-2028.2.patch, HIVE-2028.3.patch, HIVE-2028.patch Hive client side execution could sometimes take a long time. This task is to instrument the client side code to measure the time spent in the most likely expensive components.
[jira] Updated: (HIVE-1918) Add export/import facilities to the hive system
[ https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-1918: Resolution: Fixed Fix Version/s: 0.8.0 Status: Resolved (was: Patch Available) Committed. Thanks Krishna! Add export/import facilities to the hive system Key: HIVE-1918 URL: https://issues.apache.org/jira/browse/HIVE-1918 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Krishna Kumar Assignee: Krishna Kumar Fix For: 0.8.0 Attachments: HIVE-1918.patch.1.txt, HIVE-1918.patch.2.txt, HIVE-1918.patch.3.txt, HIVE-1918.patch.4.txt, HIVE-1918.patch.5.txt, HIVE-1918.patch.txt, hive-metastore-er.pdf This is an enhancement request to add export/import features to hive. With this language extension, the user can export the data of the table - which may be located in different hdfs locations in case of a partitioned table - as well as the metadata of the table into a specified output location. This output location can then be moved over to another different hadoop/hive instance and imported there. This should work independent of the source and target metastore dbms used; for instance, between derby and mysql. For partitioned tables, the ability to export/import a subset of the partition must be supported. Howl will add more features on top of this: The ability to create/use the exported data even in the absence of hive, using MR or Pig. Please see http://wiki.apache.org/pig/Howl/HowlImportExport for these details.
[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system
[ https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006582#comment-13006582 ] Paul Yang commented on HIVE-1918: +1 Looks good, will test and commit. Add export/import facilities to the hive system Key: HIVE-1918 URL: https://issues.apache.org/jira/browse/HIVE-1918 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Krishna Kumar Assignee: Krishna Kumar Attachments: HIVE-1918.patch.1.txt, HIVE-1918.patch.2.txt, HIVE-1918.patch.3.txt, HIVE-1918.patch.4.txt, HIVE-1918.patch.5.txt, HIVE-1918.patch.txt, hive-metastore-er.pdf This is an enhancement request to add export/import features to hive. With this language extension, the user can export the data of the table - which may be located in different hdfs locations in case of a partitioned table - as well as the metadata of the table into a specified output location. This output location can then be moved over to another different hadoop/hive instance and imported there. This should work independent of the source and target metastore dbms used; for instance, between derby and mysql. For partitioned tables, the ability to export/import a subset of the partition must be supported. Howl will add more features on top of this: The ability to create/use the exported data even in the absence of hive, using MR or Pig. Please see http://wiki.apache.org/pig/Howl/HowlImportExport for these details.
[jira] Commented: (HIVE-2022) Making JDO thread-safe by default
[ https://issues.apache.org/jira/browse/HIVE-2022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002184#comment-13002184 ] Paul Yang commented on HIVE-2022: Apologies for the build break - Ning and I are looking into fixing some issues with my build environment. Making JDO thread-safe by default Key: HIVE-2022 URL: https://issues.apache.org/jira/browse/HIVE-2022 Project: Hive Issue Type: Bug Components: Configuration, Metastore Reporter: Ning Zhang Assignee: Ning Zhang Fix For: 0.8.0 Attachments: HIVE-2022.patch If multiple threads access the metastore concurrently, there are cases where JDO throws exceptions because of concurrent access to a HashMap inside JDO. Setting javax.jdo.option.Multithreaded to true solves this issue.
[jira] Updated: (HIVE-2022) Making JDO thread-safe by default
[ https://issues.apache.org/jira/browse/HIVE-2022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-2022: Resolution: Fixed Fix Version/s: 0.8.0 Status: Resolved (was: Patch Available) Committed. Thanks Ning! Making JDO thread-safe by default Key: HIVE-2022 URL: https://issues.apache.org/jira/browse/HIVE-2022 Project: Hive Issue Type: Bug Reporter: Ning Zhang Assignee: Ning Zhang Fix For: 0.8.0 Attachments: HIVE-2022.patch If multiple threads access the metastore concurrently, there are cases where JDO throws exceptions because of concurrent access to a HashMap inside JDO. Setting javax.jdo.option.Multithreaded to true solves this issue.
[jira] Commented: (HIVE-2022) Making JDO thread-safe by default
[ https://issues.apache.org/jira/browse/HIVE-2022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13001666#comment-13001666 ] Paul Yang commented on HIVE-2022: - @Mac - sounds like a good idea. I'll backport to 0.7. Making JDO thread-safe by default - Key: HIVE-2022 URL: https://issues.apache.org/jira/browse/HIVE-2022 Project: Hive Issue Type: Bug Reporter: Ning Zhang Assignee: Ning Zhang Fix For: 0.8.0 Attachments: HIVE-2022.patch If there are multiple thread accessing metastore concurrently, there are cases that JDO threw exceptions because of concurrent access of HashMap inside JDO. Setting javax.jdo.option.Multithreaded to true solves this issue. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (HIVE-1941) support explicit view partitioning
[ https://issues.apache.org/jira/browse/HIVE-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13001281#comment-13001281 ] Paul Yang commented on HIVE-1941: - +1 tests passed support explicit view partitioning -- Key: HIVE-1941 URL: https://issues.apache.org/jira/browse/HIVE-1941 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.6.0 Reporter: John Sichi Assignee: John Sichi Fix For: 0.8.0 Attachments: HIVE-1941.1.patch, HIVE-1941.2.patch, HIVE-1941.3.patch, HIVE-1941.4.patch, HIVE-1941.5.patch Allow creation of a view with an explicit partitioning definition, and support ALTER VIEW ADD/DROP PARTITION for instantiating partitions. For more information, see http://wiki.apache.org/hadoop/Hive/PartitionedViews -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (HIVE-1941) support explicit view partitioning
[ https://issues.apache.org/jira/browse/HIVE-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-1941: Resolution: Fixed Status: Resolved (was: Patch Available) Committed. Thanks John! support explicit view partitioning -- Key: HIVE-1941 URL: https://issues.apache.org/jira/browse/HIVE-1941 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.6.0 Reporter: John Sichi Assignee: John Sichi Fix For: 0.8.0 Attachments: HIVE-1941.1.patch, HIVE-1941.2.patch, HIVE-1941.3.patch, HIVE-1941.4.patch, HIVE-1941.5.patch Allow creation of a view with an explicit partitioning definition, and support ALTER VIEW ADD/DROP PARTITION for instantiating partitions. For more information, see http://wiki.apache.org/hadoop/Hive/PartitionedViews -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (HIVE-2022) Making JDO thread-safe by default
[ https://issues.apache.org/jira/browse/HIVE-2022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13001283#comment-13001283 ] Paul Yang commented on HIVE-2022: - +1 Will commit once tests pass. Making JDO thread-safe by default - Key: HIVE-2022 URL: https://issues.apache.org/jira/browse/HIVE-2022 Project: Hive Issue Type: Bug Reporter: Ning Zhang Assignee: Ning Zhang Attachments: HIVE-2022.patch If there are multiple thread accessing metastore concurrently, there are cases that JDO threw exceptions because of concurrent access of HashMap inside JDO. Setting javax.jdo.option.Multithreaded to true solves this issue. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Assigned: (HIVE-1920) DESCRIBE with comments is difficult to read
[ https://issues.apache.org/jira/browse/HIVE-1920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang reassigned HIVE-1920: --- Assignee: (was: Paul Yang) DESCRIBE with comments is difficult to read --- Key: HIVE-1920 URL: https://issues.apache.org/jira/browse/HIVE-1920 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 0.7.0 Reporter: Paul Yang Priority: Minor Attachments: HIVE-1920.1.nocomment.patch When DESCRIBE is run, comments for columns are displayed next to the column type. A problem with this is that if the comment contains line breaks, it is difficult to differentiate the columns from the comments and is difficult to read. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (HIVE-2002) Expand exceptions caught for metastore operations
[ https://issues.apache.org/jira/browse/HIVE-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Yang updated HIVE-2002:
---------------------------

Assignee: Paul Yang
Status: Patch Available (was: Open)

Expand exceptions caught for metastore operations
-------------------------------------------------

Key: HIVE-2002
URL: https://issues.apache.org/jira/browse/HIVE-2002
Project: Hive
Issue Type: Improvement
Components: Metastore
Affects Versions: 0.8.0
Reporter: Paul Yang
Assignee: Paul Yang
Priority: Minor
Attachments: HIVE-2002.1.patch

Currently, HiveMetaStore.executeWithRetry() catches two classes of exceptions and retries the metastore call when such exceptions occur. However, it does not catch some exceptions that could benefit from a retry:

{code}
Failed with exception javax.jdo.JDOException: Couldnt obtain a new sequence (unique id) : The MySQL server is running with the --read-only option so it cannot execute this statement
NestedThrowables:
java.sql.SQLException: The MySQL server is running with the --read-only option so it cannot execute this statement
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
{code}

In this case, the MySQL server could be temporarily in a read-only mode, and a later DB call may succeed. To handle these situations, this JIRA proposes to expand the class of exceptions caught for retries.
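The proposal above amounts to widening the set of exception types that trigger a retry. The sketch below illustrates that idea in Python; it is not the code of HiveMetaStore.executeWithRetry() or of the attached patch. The names TransientStoreError, execute_with_retry and flaky_call, the attempt count and the delay are all assumptions made for illustration.

```python
import time


class TransientStoreError(Exception):
    """Hypothetical stand-in for a failure worth retrying (e.g. DB briefly read-only)."""


def execute_with_retry(call, retryable=(TransientStoreError,), attempts=3, delay_s=0.0):
    """Run call(); retry only when it raises one of the retryable exception types."""
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except retryable:
            if attempt == attempts:
                raise  # out of attempts: surface the last failure
            time.sleep(delay_s)


# Simulated metastore call that fails twice, then succeeds.
state = {"calls": 0}


def flaky_call():
    state["calls"] += 1
    if state["calls"] < 3:
        raise TransientStoreError("server is running with the --read-only option")
    return "ok"


result = execute_with_retry(flaky_call)
print(result, state["calls"])
```

Exception types that are not listed in retryable propagate immediately, so expanding the retried class is a matter of adding types to that tuple.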
[jira] Updated: (HIVE-2002) Expand exceptions caught for metastore operations
[ https://issues.apache.org/jira/browse/HIVE-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-2002: Attachment: HIVE-2002.1.patch Expand exceptions caught for metastore operations - Key: HIVE-2002 URL: https://issues.apache.org/jira/browse/HIVE-2002 Project: Hive Issue Type: Improvement Components: Metastore Affects Versions: 0.8.0 Reporter: Paul Yang Priority: Minor Attachments: HIVE-2002.1.patch Currently, HiveMetaStore.executeWithRetry() catches two classes of exceptions and retries the metastore call when such exceptions occur. However, it does not catch some exceptions that could benefit from a retry: {code} Failed with exception javax.jdo.JDOException: Couldnt obtain a new sequence (unique id) : The MySQL server is running with the --read-only option so it cannot execute this statement NestedThrowables: java.sql.SQLException: The MySQL server is running with the --read-only option so it cannot execute this statement FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask {code} In this case, the MySQL server could be temporarily in a read-only mode, and a later DB call may succeed. To handle these situations, this JIRA proposes to expand the class of exceptions caught for retries. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (HIVE-1941) support explicit view partitioning
[ https://issues.apache.org/jira/browse/HIVE-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12998832#comment-12998832 ] Paul Yang commented on HIVE-1941: - Patch looks good once we have the aforementioned changes. support explicit view partitioning -- Key: HIVE-1941 URL: https://issues.apache.org/jira/browse/HIVE-1941 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.6.0 Reporter: John Sichi Assignee: John Sichi Attachments: HIVE-1941.1.patch, HIVE-1941.2.patch, HIVE-1941.3.patch, HIVE-1941.4.patch Allow creation of a view with an explicit partitioning definition, and support ALTER VIEW ADD/DROP PARTITION for instantiating partitions. For more information, see http://wiki.apache.org/hadoop/Hive/PartitionedViews -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (HIVE-1941) support explicit view partitioning
[ https://issues.apache.org/jira/browse/HIVE-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-1941: Status: Open (was: Patch Available) support explicit view partitioning -- Key: HIVE-1941 URL: https://issues.apache.org/jira/browse/HIVE-1941 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.6.0 Reporter: John Sichi Assignee: John Sichi Attachments: HIVE-1941.1.patch, HIVE-1941.2.patch, HIVE-1941.3.patch, HIVE-1941.4.patch Allow creation of a view with an explicit partitioning definition, and support ALTER VIEW ADD/DROP PARTITION for instantiating partitions. For more information, see http://wiki.apache.org/hadoop/Hive/PartitionedViews -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system
[ https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12998129#comment-12998129 ] Paul Yang commented on HIVE-1918: - Made a couple of comments on reviewboard. Add export/import facilities to the hive system --- Key: HIVE-1918 URL: https://issues.apache.org/jira/browse/HIVE-1918 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Krishna Kumar Assignee: Krishna Kumar Attachments: HIVE-1918.patch.1.txt, HIVE-1918.patch.2.txt, HIVE-1918.patch.3.txt, HIVE-1918.patch.4.txt, HIVE-1918.patch.txt, hive-metastore-er.pdf This is an enhancement request to add export/import features to hive. With this language extension, the user can export the data of the table - which may be located in different hdfs locations in case of a partitioned table - as well as the metadata of the table into a specified output location. This output location can then be moved over to another different hadoop/hive instance and imported there. This should work independent of the source and target metastore dbms used; for instance, between derby and mysql. For partitioned tables, the ability to export/import a subset of the partition must be supported. Howl will add more features on top of this: The ability to create/use the exported data even in the absence of hive, using MR or Pig. Please see http://wiki.apache.org/pig/Howl/HowlImportExport for these details. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (HIVE-2001) Add inputs and outputs to authorization ddls.
[ https://issues.apache.org/jira/browse/HIVE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-2001: Description: When permissions are changed for a table/partition, the respective object should be present in the read/write entities for hooks to act on. Add inputs and outputs to authorization ddls. - Key: HIVE-2001 URL: https://issues.apache.org/jira/browse/HIVE-2001 Project: Hive Issue Type: Bug Reporter: He Yongqiang Assignee: He Yongqiang Attachments: hive-2001.patch When permissions are changed for a table/partition, the respective object should be present in the read/write entities for hooks to act on. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (HIVE-2001) Add inputs and outputs to authorization ddls.
[ https://issues.apache.org/jira/browse/HIVE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12998135#comment-12998135 ] Paul Yang commented on HIVE-2001: - +1 will test and commit Add inputs and outputs to authorization ddls. - Key: HIVE-2001 URL: https://issues.apache.org/jira/browse/HIVE-2001 Project: Hive Issue Type: Bug Reporter: He Yongqiang Assignee: He Yongqiang Attachments: hive-2001.patch When permissions are changed for a table/partition, the respective object should be present in the read/write entities for hooks to act on. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (HIVE-2001) Add inputs and outputs to authorization DDL commands
[ https://issues.apache.org/jira/browse/HIVE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-2001: Component/s: Query Processor Affects Version/s: 0.8.0 Summary: Add inputs and outputs to authorization DDL commands (was: Add inputs and outputs to authorization ddls.) Add inputs and outputs to authorization DDL commands Key: HIVE-2001 URL: https://issues.apache.org/jira/browse/HIVE-2001 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.8.0 Reporter: He Yongqiang Assignee: He Yongqiang Attachments: hive-2001.patch When permissions are changed for a table/partition, the respective object should be present in the read/write entities for hooks to act on. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Created: (HIVE-2002) Expand exceptions caught for metastore operations
Expand exceptions caught for metastore operations
-------------------------------------------------

Key: HIVE-2002
URL: https://issues.apache.org/jira/browse/HIVE-2002
Project: Hive
Issue Type: Improvement
Components: Metastore
Affects Versions: 0.8.0
Reporter: Paul Yang
Priority: Minor

Currently, HiveMetaStore.executeWithRetry() catches two classes of exceptions and retries the metastore call when such exceptions occur. However, it does not catch some exceptions that could benefit from a retry:

{code}
Failed with exception javax.jdo.JDOException: Couldnt obtain a new sequence (unique id) : The MySQL server is running with the --read-only option so it cannot execute this statement
NestedThrowables:
java.sql.SQLException: The MySQL server is running with the --read-only option so it cannot execute this statement
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
{code}

In this case, the MySQL server could be temporarily in a read-only mode, and a later DB call may succeed. To handle these situations, this JIRA proposes to expand the class of exceptions caught for retries.
[jira] Commented: (HIVE-1941) support explicit view partitioning
[ https://issues.apache.org/jira/browse/HIVE-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12996593#comment-12996593 ] Paul Yang commented on HIVE-1941: - @John - Yes, that's what I meant. I'll take a look at the whole patch as well. support explicit view partitioning -- Key: HIVE-1941 URL: https://issues.apache.org/jira/browse/HIVE-1941 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.6.0 Reporter: John Sichi Assignee: John Sichi Attachments: HIVE-1941.1.patch, HIVE-1941.2.patch, HIVE-1941.3.patch, HIVE-1941.4.patch Allow creation of a view with an explicit partitioning definition, and support ALTER VIEW ADD/DROP PARTITION for instantiating partitions. For more information, see http://wiki.apache.org/hadoop/Hive/PartitionedViews -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (HIVE-1941) support explicit view partitioning
[ https://issues.apache.org/jira/browse/HIVE-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12995666#comment-12995666 ] Paul Yang commented on HIVE-1941: - It looks like it's possible with the current thrift add_partition() method to create a partition for a view with a non-null SD/location. Can we put in a check to guard against this case? Other than that, it looks good from the metastore/replication side. support explicit view partitioning -- Key: HIVE-1941 URL: https://issues.apache.org/jira/browse/HIVE-1941 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.6.0 Reporter: John Sichi Assignee: John Sichi Attachments: HIVE-1941.1.patch, HIVE-1941.2.patch, HIVE-1941.3.patch, HIVE-1941.4.patch Allow creation of a view with an explicit partitioning definition, and support ALTER VIEW ADD/DROP PARTITION for instantiating partitions. For more information, see http://wiki.apache.org/hadoop/Hive/PartitionedViews -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (HIVE-1941) support explicit view partitioning
[ https://issues.apache.org/jira/browse/HIVE-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12995668#comment-12995668 ] Paul Yang commented on HIVE-1941: - Similarly, we should handle the case when calling append_partition() on a view. support explicit view partitioning -- Key: HIVE-1941 URL: https://issues.apache.org/jira/browse/HIVE-1941 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.6.0 Reporter: John Sichi Assignee: John Sichi Attachments: HIVE-1941.1.patch, HIVE-1941.2.patch, HIVE-1941.3.patch, HIVE-1941.4.patch Allow creation of a view with an explicit partitioning definition, and support ALTER VIEW ADD/DROP PARTITION for instantiating partitions. For more information, see http://wiki.apache.org/hadoop/Hive/PartitionedViews -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
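The guard requested in the two review comments above could look roughly like the following. This is a sketch, not the metastore's implementation: the Partition class, the check_partition_for_table function and the error text are hypothetical, the location field stands in for the storage descriptor (SD), and the use of the string "VIRTUAL_VIEW" to mark a view is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Partition:
    values: List[str]
    location: Optional[str] = None  # stands in for the storage descriptor (SD) location


def check_partition_for_table(table_type: str, part: Partition) -> None:
    """Reject a partition that carries a storage location when the target is a view."""
    if table_type == "VIRTUAL_VIEW" and part.location is not None:
        raise ValueError("partitions of a view must not have a storage location")


# A view partition without a location is accepted.
check_partition_for_table("VIRTUAL_VIEW", Partition(["2011-02-17"]))
# A table partition with a location is accepted.
check_partition_for_table("MANAGED_TABLE", Partition(["2011-02-17"], "/warehouse/t/ds=2011-02-17"))
```

The same check would be called from both the add and the append code paths, so that neither entry point can create a view partition with a location.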
[jira] Created: (HIVE-1995) Mismatched open/commit transaction calls when using get_partition()
Mismatched open/commit transaction calls when using get_partition()
-------------------------------------------------------------------

Key: HIVE-1995
URL: https://issues.apache.org/jira/browse/HIVE-1995
Project: Hive
Issue Type: Bug
Components: Metastore
Affects Versions: 0.7.0
Reporter: Paul Yang
Priority: Minor

Nested executeWithRetry() calls caused by using HiveMetaStore.get_partition() can result in mismatched open/commit calls. This fixes the same issue as described in HIVE-1760.
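To make "mismatched open/commit calls" concrete, here is a toy model of counter-based transaction bookkeeping in which a retried inner call leaves the count unbalanced. It is an illustration under stated assumptions, not Hive's code and not a claim about the exact failure path in HIVE-1995: TxnCounter, inner_call and with_retry are hypothetical names, and the simulated failure between open and commit is an assumed trigger.

```python
class TxnCounter:
    """Toy bookkeeping: each open must be matched by exactly one commit."""

    def __init__(self):
        self.depth = 0

    def open(self):
        self.depth += 1

    def commit(self):
        if self.depth == 0:
            raise RuntimeError("commit without matching open")
        self.depth -= 1


def inner_call(txn, state):
    txn.open()
    if not state["failed"]:
        state["failed"] = True
        # Simulated transient failure: the open above is never committed.
        raise IOError("simulated transient failure")
    txn.commit()


def with_retry(fn, attempts=2):
    """Retry fn() on IOError, up to the given number of attempts."""
    for attempt in range(attempts):
        try:
            return fn()
        except IOError:
            if attempt == attempts - 1:
                raise


txn = TxnCounter()
state = {"failed": False}
txn.open()                                   # outer call opens
with_retry(lambda: inner_call(txn, state))   # nested call is retried once
txn.commit()                                 # outer call commits
print(txn.depth)                             # non-zero: opens and commits no longer match
```

In this model the outer caller believes its transaction is finished while the counter still records one open, which is the kind of imbalance the issue title describes.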
[jira] Updated: (HIVE-1995) Mismatched open/commit transaction calls when using get_partition()
[ https://issues.apache.org/jira/browse/HIVE-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-1995: Attachment: HIVE-1995.1.patch Mismatched open/commit transaction calls when using get_partition() --- Key: HIVE-1995 URL: https://issues.apache.org/jira/browse/HIVE-1995 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.7.0 Reporter: Paul Yang Priority: Minor Attachments: HIVE-1995.1.patch Nested executeWithRetry() calls caused by using HiveMetaStore.get_partition() can result in mis-matched open/commit calls. Fixes the same issue as described in HIVE-1760. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (HIVE-1788) Add more calls to the metastore thrift interface
[ https://issues.apache.org/jira/browse/HIVE-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992848#comment-12992848 ]

Paul Yang commented on HIVE-1788:
---------------------------------

Instead of returning a list of Table objects, could we return a list of the matching table names? Then, the user would be responsible for getting the necessary table objects.

Also, have you tried measuring the speed of the call when there are many (1000+) tables? It might be very slow, similar to how get_partitions() performs poorly compared to get_partition_names().

Also, with the current approach, won't the offsets be inconsistent if new tables are created between calls?

Add more calls to the metastore thrift interface
------------------------------------------------

Key: HIVE-1788
URL: https://issues.apache.org/jira/browse/HIVE-1788
Project: Hive
Issue Type: New Feature
Reporter: Ashish Thusoo
Assignee: Ashish Thusoo
Attachments: HIVE-1788.txt

For administrative purposes the following calls to the metastore thrift interface would be very useful:

1. Get the table metadata for all the tables owned by a particular user
2. Ability to iterate over this set of tables
3. Ability to change a particular key-value property of the table
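The offset concern raised in the comment can be reproduced with a few lines. This sketch assumes that pages are cut from a name-ordered listing; the function get_table_names and the table names are hypothetical and do not reflect the API proposed in the patch.

```python
def get_table_names(tables, offset, limit):
    """Return one page of table names, ordered by name."""
    return sorted(tables)[offset:offset + limit]


tables = {"b_tbl", "c_tbl", "d_tbl", "e_tbl"}
page1 = get_table_names(tables, 0, 2)   # ['b_tbl', 'c_tbl']
tables.add("a_tbl")                      # a table is created between the two calls
page2 = get_table_names(tables, 2, 2)   # ['c_tbl', 'd_tbl']: 'c_tbl' is returned twice
print(page1, page2)
```

A common alternative is to page by "names greater than the last one seen" instead of by numeric offset, which stays stable when tables are added. Whether that fits the thrift interface is a separate design question.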
[jira] Commented: (HIVE-1818) Call frequency and duration metrics for HiveMetaStore via jmx
[ https://issues.apache.org/jira/browse/HIVE-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992247#comment-12992247 ]

Paul Yang commented on HIVE-1818:
---------------------------------

+1 tests passed

Call frequency and duration metrics for HiveMetaStore via jmx
-------------------------------------------------------------

Key: HIVE-1818
URL: https://issues.apache.org/jira/browse/HIVE-1818
Project: Hive
Issue Type: New Feature
Components: Metastore
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
Priority: Minor
Fix For: 0.7.0
Attachments: HIVE-1818-vs-1054860.patch, HIVE-1818-vs-1063088.patch, HIVE-1818.patch

As recently brought up in the hive-dev mailing list, it'd be useful if the HiveMetaStore had some sort of instrumentation capability to measure the frequency of the various calls on the HiveMetaStore and the time spent in them. There are already incrementCounter() and logStartFunction() / logStartTableFunction(), etc. calls in HiveMetaStore, and they could be refactored/repurposed to make calls that expose JMX MBeans as well. Or, a Metrics subsystem could be introduced which made calls to incrementCounter()/etc. as a refactor. It might also be possible to specify a -D parameter that the Metrics subsystem could use to determine whether or not to be enabled, and if so, on what port. And once we have the capability to instrument and expose MBeans, it might also be possible for other subsystems to adopt and use this system.
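The two quantities the issue asks for, call frequency and time spent per call, can be collected with a thin wrapper around each entry point. The sketch below shows only that bookkeeping; it does not expose JMX MBeans and is not the patch's Metrics subsystem. The names instrumented, call_counts, total_seconds and get_table are hypothetical.

```python
import time
from collections import defaultdict
from functools import wraps

call_counts = defaultdict(int)       # function name -> number of calls
total_seconds = defaultdict(float)   # function name -> cumulative duration


def instrumented(fn):
    """Record a call count and cumulative wall-clock time for fn."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            # Runs on success and on failure, so failed calls are counted too.
            call_counts[fn.__name__] += 1
            total_seconds[fn.__name__] += time.perf_counter() - start
    return wrapper


@instrumented
def get_table(db, name):  # hypothetical metastore call
    return f"{db}.{name}"


get_table("default", "t1")
get_table("default", "t2")
print(call_counts["get_table"])
```

Keeping the counters in one place is what would later allow a single component to publish them, for example through a management interface, and to be switched on or off by one setting.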
[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system
[ https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12991282#comment-12991282 ] Paul Yang commented on HIVE-1918: - Looking at it as well.. Add export/import facilities to the hive system --- Key: HIVE-1918 URL: https://issues.apache.org/jira/browse/HIVE-1918 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Krishna Kumar Assignee: Krishna Kumar Attachments: HIVE-1918.patch.1.txt, HIVE-1918.patch.2.txt, HIVE-1918.patch.3.txt, HIVE-1918.patch.txt, hive-metastore-er.pdf This is an enhancement request to add export/import features to hive. With this language extension, the user can export the data of the table - which may be located in different hdfs locations in case of a partitioned table - as well as the metadata of the table into a specified output location. This output location can then be moved over to another different hadoop/hive instance and imported there. This should work independent of the source and target metastore dbms used; for instance, between derby and mysql. For partitioned tables, the ability to export/import a subset of the partition must be supported. Howl will add more features on top of this: The ability to create/use the exported data even in the absence of hive, using MR or Pig. Please see http://wiki.apache.org/pig/Howl/HowlImportExport for these details. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (HIVE-1818) Call frequency and duration metrics for HiveMetaStore via jmx
[ https://issues.apache.org/jira/browse/HIVE-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12990749#comment-12990749 ] Paul Yang commented on HIVE-1818: - In case it was missed - posted 1 additional suggestion on reviewboard. Call frequency and duration metrics for HiveMetaStore via jmx - Key: HIVE-1818 URL: https://issues.apache.org/jira/browse/HIVE-1818 Project: Hive Issue Type: New Feature Components: Metastore Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Priority: Minor Fix For: 0.7.0 Attachments: HIVE-1818-vs-1054860.patch, HIVE-1818-vs-1063088.patch, HIVE-1818.patch As recently brought up in the hive-dev mailing list, it'd be useful if the HiveMetaStore had some sort of instrumentation capability so as to measure frequency of calls to various calls on the HiveMetaStore and the duration of time spent in these calls. There are already incrementCounter() and logStartFunction() / logStartTableFunction() ,etc calls in HiveMetaStore, and they could be refactored/repurposed to make calls that expose JMX MBeans as well. Or, a Metrics subsystem could be introduced which made calls to incrementCounter()/etc as a refactor. It might also be possible to specify a -D parameter that the Metrics subsystem could use to determine whether or not to be enabled, and if so, on to what port. And once we have the capability to instrument and expose MBeans, it might also be possible for other subsystems to also adopt and use this system. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (HIVE-1934) alter table rename messes the location
[ https://issues.apache.org/jira/browse/HIVE-1934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Yang updated HIVE-1934:
---------------------------

Status: Patch Available (was: Open)

alter table rename messes the location
--------------------------------------

Key: HIVE-1934
URL: https://issues.apache.org/jira/browse/HIVE-1934
Project: Hive
Issue Type: Bug
Reporter: Namit Jain
Assignee: Paul Yang
Priority: Blocker
Fix For: 0.7.0
Attachments: HIVE-1934.1.patch

create table tmptmp(a string) partitioned by (b string);
alter table tmptmp add partition (b='1:2:3');
alter table tmptmp rename to tmptmp_test;

After the rename, the location recorded for the tmptmp_test partition (b=1:2:3) is unescaped, and hence the partition cannot be dropped.
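The report turns on the difference between an escaped and an unescaped partition directory name. The sketch below shows what escaping means for the value 1:2:3, assuming percent-encoding of special characters. The set of escaped characters, the function names and the /warehouse path are assumptions for illustration and are not taken from Hive's source or from the attached patch.

```python
UNSAFE = set(":/=%#?*")  # assumed set of characters to escape; illustrative only


def escape_path_name(value: str) -> str:
    """Percent-encode characters that are unsafe in a partition directory name."""
    return "".join(f"%{ord(c):02X}" if c in UNSAFE else c for c in value)


def partition_location(table_dir: str, key: str, value: str) -> str:
    """Build the directory for one partition under the table directory."""
    return f"{table_dir}/{key}={escape_path_name(value)}"


escaped = partition_location("/warehouse/tmptmp_test", "b", "1:2:3")
unescaped = "/warehouse/tmptmp_test/b=1:2:3"
print(escaped)               # /warehouse/tmptmp_test/b=1%3A2%3A3
print(escaped == unescaped)  # False
```

If a rename rewrites the stored location from the raw partition values, the recorded path no longer matches the escaped directory, which is consistent with the symptom that the partition cannot be dropped.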
[jira] Updated: (HIVE-1934) alter table rename messes the location
[ https://issues.apache.org/jira/browse/HIVE-1934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-1934: Attachment: HIVE-1934.1.patch alter table rename messes the location -- Key: HIVE-1934 URL: https://issues.apache.org/jira/browse/HIVE-1934 Project: Hive Issue Type: Bug Reporter: Namit Jain Assignee: Paul Yang Priority: Blocker Fix For: 0.7.0 Attachments: HIVE-1934.1.patch create table tmptmp(a string) partitioned by (b string); alter table tmptmp add partition (b=1:2:3); alter table tmptmp rename to tmptmp_test; The location for tmptmp_test partition (b=1:2:3) is unescaped due to rename, and hence it cannot be dropped. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Assigned: (HIVE-1862) Revive partition filtering in the Hive MetaStore
[ https://issues.apache.org/jira/browse/HIVE-1862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang reassigned HIVE-1862: --- Assignee: Mac Yang Revive partition filtering in the Hive MetaStore Key: HIVE-1862 URL: https://issues.apache.org/jira/browse/HIVE-1862 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.7.0 Reporter: Devaraj Das Assignee: Mac Yang Fix For: 0.7.0 Attachments: HIVE-1862.1.patch.txt, HIVE-1862.2.patch.txt, invoke_runqry.sh, qry, qry-sch.Z, runqry HIVE-1853 downgraded the JDO version. This makes the feature of partition filtering in the metastore unusable. This jira is to keep track of the lost feature and discussing approaches to bring it back. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1920) DESCRIBE with comments is difficult to read
[ https://issues.apache.org/jira/browse/HIVE-1920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-1920: Component/s: CLI DESCRIBE with comments is difficult to read --- Key: HIVE-1920 URL: https://issues.apache.org/jira/browse/HIVE-1920 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.7.0 Reporter: Paul Yang Assignee: Paul Yang When DESCRIBE is run, comments for columns are displayed next to the column type. A problem with this is that if the comment contains line breaks, it is difficult to differentiate the columns from the comments and is difficult to read. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-1920) DESCRIBE with comments is difficult to read
DESCRIBE with comments is difficult to read --- Key: HIVE-1920 URL: https://issues.apache.org/jira/browse/HIVE-1920 Project: Hive Issue Type: Bug Affects Versions: 0.7.0 Reporter: Paul Yang Assignee: Paul Yang When DESCRIBE is run, comments for columns are displayed next to the column type. A problem with this is that if the comment contains line breaks, it is difficult to differentiate the columns from the comments and is difficult to read. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1862) Revive partition filtering in the Hive MetaStore
[ https://issues.apache.org/jira/browse/HIVE-1862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983908#action_12983908 ] Paul Yang commented on HIVE-1862: - +1 looks good. Will commit if tests pass. Revive partition filtering in the Hive MetaStore Key: HIVE-1862 URL: https://issues.apache.org/jira/browse/HIVE-1862 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.7.0 Reporter: Devaraj Das Assignee: Mac Yang Fix For: 0.7.0 Attachments: HIVE-1862.1.patch.txt, HIVE-1862.2.patch.txt, HIVE-1862.3.patch.txt, invoke_runqry.sh, qry, qry-sch.Z, runqry HIVE-1853 downgraded the JDO version. This makes the feature of partition filtering in the metastore unusable. This jira is to keep track of the lost feature and discussing approaches to bring it back. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-1921) Better error message when a non-essential job fails
Better error message when a non-essential job fails
---------------------------------------------------

Key: HIVE-1921
URL: https://issues.apache.org/jira/browse/HIVE-1921
Project: Hive
Issue Type: Improvement
Components: CLI
Affects Versions: 0.7.0
Reporter: Paul Yang
Assignee: Paul Yang
Priority: Minor

To determine whether a join can be converted into a map-join, a task is launched to determine memory requirements. If the task fails, then a normal join must be performed. This is not an error but the user sees a message like:

{code}
...
2011-01-19 02:48:51 Processing rows:180 Hashtable size: 179 Memory usage: 818546352 rate: 0.789
2011-01-19 02:48:57 Processing rows:190 Hashtable size: 189 Memory usage: 861746352 rate: 0.831
2011-01-19 02:49:05 Processing rows:200 Hashtable size: 199 Memory usage: 904921384 rate: 0.873
2011-01-19 02:49:12 Processing rows:210 Hashtable size: 209 Memory usage: 952382416 rate: 0.918
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapredLocalTask
ATTEMPT: Execute BackupTask: org.apache.hadoop.hive.ql.exec.MapRedTask
Launching Job 2 out of 2
...
{code}

The wording makes it seem as if something went wrong, which is not necessarily the case.
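The behaviour described above is a try-then-fall-back pattern, and the complaint is only about how the fallback is reported. A minimal sketch of the pattern with a neutral message follows. The function run_with_backup, the task functions and the message text are hypothetical and are not the wording adopted by Hive.

```python
def run_with_backup(primary, backup, log):
    """Try the optional primary task; on failure, log a neutral note and run the backup."""
    try:
        return primary()
    except Exception as exc:
        log(f"Optional local task did not complete ({exc}); continuing with the regular join.")
        return backup()


messages = []


def failing_local_task():
    # Simulates the memory-estimation task giving up.
    raise MemoryError("hashtable exceeded memory limit")


result = run_with_backup(failing_local_task, lambda: "common join", messages.append)
print(result)
print(messages[0])
```

The message states what happens next and avoids the word FAILED, since the query as a whole is still expected to succeed.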
[jira] Updated: (HIVE-1862) Revive partition filtering in the Hive MetaStore
[ https://issues.apache.org/jira/browse/HIVE-1862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-1862: Resolution: Fixed Status: Resolved (was: Patch Available) Committed. Thanks Mac! Cool use of string manipulation. Hopefully, we'll find a workaround for those escaped partition names soon. Revive partition filtering in the Hive MetaStore Key: HIVE-1862 URL: https://issues.apache.org/jira/browse/HIVE-1862 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.7.0 Reporter: Devaraj Das Assignee: Mac Yang Fix For: 0.7.0 Attachments: HIVE-1862.1.patch.txt, HIVE-1862.2.patch.txt, HIVE-1862.3.patch.txt, invoke_runqry.sh, qry, qry-sch.Z, runqry HIVE-1853 downgraded the JDO version, which makes partition filtering in the metastore unusable. This JIRA is to keep track of the lost feature and to discuss approaches to bring it back.
[jira] Updated: (HIVE-1862) Revive partition filtering in the Hive MetaStore
[ https://issues.apache.org/jira/browse/HIVE-1862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-1862: Status: Open (was: Patch Available) Revive partition filtering in the Hive MetaStore Key: HIVE-1862 URL: https://issues.apache.org/jira/browse/HIVE-1862 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.7.0 Reporter: Devaraj Das Fix For: 0.7.0 Attachments: HIVE-1862.1.patch.txt, HIVE-1862.2.patch.txt, invoke_runqry.sh, qry, qry-sch.Z, runqry HIVE-1853 downgraded the JDO version, which makes partition filtering in the metastore unusable. This JIRA is to keep track of the lost feature and to discuss approaches to bring it back.
[jira] Updated: (HIVE-1818) Call frequency and duration metrics for HiveMetaStore via jmx
[ https://issues.apache.org/jira/browse/HIVE-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-1818: Status: Open (was: Patch Available) Call frequency and duration metrics for HiveMetaStore via jmx Key: HIVE-1818 URL: https://issues.apache.org/jira/browse/HIVE-1818 Project: Hive Issue Type: New Feature Components: Metastore Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Priority: Minor Fix For: 0.7.0 Attachments: HIVE-1818-vs-1054860.patch, HIVE-1818.patch As recently brought up on the hive-dev mailing list, it would be useful if the HiveMetaStore had some instrumentation capability to measure the frequency of the various calls on the HiveMetaStore and the time spent in those calls. There are already incrementCounter() and logStartFunction() / logStartTableFunction(), etc. calls in HiveMetaStore, and they could be refactored or repurposed to make calls that expose JMX MBeans as well. Or a Metrics subsystem could be introduced that makes the calls to incrementCounter(), etc. as a refactor. It might also be possible to specify a -D parameter that the Metrics subsystem could use to determine whether to be enabled and, if so, on what port. Once we have the capability to instrument and expose MBeans, other subsystems might also adopt and use this system.
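The instrumentation idea in HIVE-1818 (count calls, accumulate the time spent in them, expose the numbers over JMX, switch it on with a -D property) could be sketched as follows. Only the standard `java.lang.management` and `javax.management` APIs are used. `MetastoreCallMetrics`, the `example.metastore` domain and the `metastore.metrics.enabled` property are made-up names, and nothing here reflects the attached patch.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.StandardMBean;

// Hypothetical sketch; none of these names come from the Hive code base.
final class MetastoreCallMetrics {

    // Management interface: each getter becomes a read-only JMX attribute.
    public interface CallStatsMBean {
        long getCallCount();
        long getTotalTimeNanos();
    }

    // Thread-safe counters for one metastore function.
    public static final class CallStats implements CallStatsMBean {
        private final AtomicLong calls = new AtomicLong();
        private final AtomicLong nanos = new AtomicLong();

        void record(long elapsedNanos) {
            calls.incrementAndGet();
            nanos.addAndGet(elapsedNanos);
        }

        @Override public long getCallCount() { return calls.get(); }
        @Override public long getTotalTimeNanos() { return nanos.get(); }
    }

    // Assumed switch, e.g. -Dmetastore.metrics.enabled=true (hypothetical property name).
    static boolean enabled() {
        return Boolean.getBoolean("metastore.metrics.enabled");
    }

    // Registers the stats object on the platform MBean server under an assumed name.
    static ObjectName register(String function, CallStats stats) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("example.metastore:type=CallStats,name=" + function);
        if (!server.isRegistered(name)) {
            server.registerMBean(new StandardMBean(stats, CallStatsMBean.class), name);
        }
        return name;
    }

    // Runs the call and records its duration whether it returns or throws.
    static <T> T timed(CallStats stats, Callable<T> call) throws Exception {
        long start = System.nanoTime();
        try {
            return call.call();
        } finally {
            stats.record(System.nanoTime() - start);
        }
    }

    public static void main(String[] args) throws Exception {
        if (!enabled()) {
            System.out.println("metrics disabled; run with -Dmetastore.metrics.enabled=true");
            return;
        }
        CallStats stats = new CallStats();
        ObjectName name = register("get_table", stats);
        timed(stats, () -> "ok");
        Object count = ManagementFactory.getPlatformMBeanServer().getAttribute(name, "CallCount");
        System.out.println(name + " CallCount=" + count);
    }
}
```

Choosing the port for remote access is left to the JVM's own JMX agent settings in this sketch rather than to the metrics code.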
[jira] Commented: (HIVE-1818) Call frequency and duration metrics for HiveMetaStore via jmx
[ https://issues.apache.org/jira/browse/HIVE-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12974785#action_12974785 ] Paul Yang commented on HIVE-1818: Taking a look. Call frequency and duration metrics for HiveMetaStore via jmx Key: HIVE-1818 URL: https://issues.apache.org/jira/browse/HIVE-1818 Project: Hive Issue Type: New Feature Components: Metastore Reporter: Sushanth Sowmyan Priority: Minor Attachments: HIVE-1818.patch As recently brought up on the hive-dev mailing list, it would be useful if the HiveMetaStore had some instrumentation capability to measure the frequency of the various calls on the HiveMetaStore and the time spent in those calls. There are already incrementCounter() and logStartFunction() / logStartTableFunction(), etc. calls in HiveMetaStore, and they could be refactored or repurposed to make calls that expose JMX MBeans as well. Or a Metrics subsystem could be introduced that makes the calls to incrementCounter(), etc. as a refactor. It might also be possible to specify a -D parameter that the Metrics subsystem could use to determine whether to be enabled and, if so, on what port. Once we have the capability to instrument and expose MBeans, other subsystems might also adopt and use this system.
[jira] Assigned: (HIVE-1818) Call frequency and duration metrics for HiveMetaStore via jmx
[ https://issues.apache.org/jira/browse/HIVE-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang reassigned HIVE-1818: Assignee: Sushanth Sowmyan Call frequency and duration metrics for HiveMetaStore via jmx Key: HIVE-1818 URL: https://issues.apache.org/jira/browse/HIVE-1818 Project: Hive Issue Type: New Feature Components: Metastore Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Priority: Minor Attachments: HIVE-1818.patch As recently brought up on the hive-dev mailing list, it would be useful if the HiveMetaStore had some instrumentation capability to measure the frequency of the various calls on the HiveMetaStore and the time spent in those calls. There are already incrementCounter() and logStartFunction() / logStartTableFunction(), etc. calls in HiveMetaStore, and they could be refactored or repurposed to make calls that expose JMX MBeans as well. Or a Metrics subsystem could be introduced that makes the calls to incrementCounter(), etc. as a refactor. It might also be possible to specify a -D parameter that the Metrics subsystem could use to determine whether to be enabled and, if so, on what port. Once we have the capability to instrument and expose MBeans, other subsystems might also adopt and use this system.
[jira] Commented: (HIVE-1818) Call frequency and duration metrics for HiveMetaStore via jmx
[ https://issues.apache.org/jira/browse/HIVE-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12974786#action_12974786 ] Paul Yang commented on HIVE-1818: Sushanth, can you regenerate this patch against the current trunk? Call frequency and duration metrics for HiveMetaStore via jmx Key: HIVE-1818 URL: https://issues.apache.org/jira/browse/HIVE-1818 Project: Hive Issue Type: New Feature Components: Metastore Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Priority: Minor Attachments: HIVE-1818.patch As recently brought up on the hive-dev mailing list, it would be useful if the HiveMetaStore had some instrumentation capability to measure the frequency of the various calls on the HiveMetaStore and the time spent in those calls. There are already incrementCounter() and logStartFunction() / logStartTableFunction(), etc. calls in HiveMetaStore, and they could be refactored or repurposed to make calls that expose JMX MBeans as well. Or a Metrics subsystem could be introduced that makes the calls to incrementCounter(), etc. as a refactor. It might also be possible to specify a -D parameter that the Metrics subsystem could use to determine whether to be enabled and, if so, on what port. Once we have the capability to instrument and expose MBeans, other subsystems might also adopt and use this system.
[jira] Updated: (HIVE-1818) Call frequency and duration metrics for HiveMetaStore via jmx
[ https://issues.apache.org/jira/browse/HIVE-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-1818: Status: Open (was: Patch Available) Call frequency and duration metrics for HiveMetaStore via jmx Key: HIVE-1818 URL: https://issues.apache.org/jira/browse/HIVE-1818 Project: Hive Issue Type: New Feature Components: Metastore Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Priority: Minor Attachments: HIVE-1818.patch As recently brought up on the hive-dev mailing list, it would be useful if the HiveMetaStore had some instrumentation capability to measure the frequency of the various calls on the HiveMetaStore and the time spent in those calls. There are already incrementCounter() and logStartFunction() / logStartTableFunction(), etc. calls in HiveMetaStore, and they could be refactored or repurposed to make calls that expose JMX MBeans as well. Or a Metrics subsystem could be introduced that makes the calls to incrementCounter(), etc. as a refactor. It might also be possible to specify a -D parameter that the Metrics subsystem could use to determine whether to be enabled and, if so, on what port. Once we have the capability to instrument and expose MBeans, other subsystems might also adopt and use this system.
[jira] Updated: (HIVE-1857) mixed case tablename on lefthand side of LATERAL VIEW results in query failing with confusing error message
[ https://issues.apache.org/jira/browse/HIVE-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-1857: Resolution: Fixed Status: Resolved (was: Patch Available) Committed. Thanks John! mixed case tablename on lefthand side of LATERAL VIEW results in query failing with confusing error message Key: HIVE-1857 URL: https://issues.apache.org/jira/browse/HIVE-1857 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.6.0 Reporter: John Sichi Assignee: John Sichi Fix For: 0.7.0 Attachments: HIVE-1857.1.patch For the modified query below in lateral_view.q, the exception org.apache.hadoop.hive.ql.parse.SemanticException: line 3:7 Invalid Table Alias or Column Reference myCol is thrown. The query should succeed. SELECT myCol from tmp_PYANG_lv LATERAL VIEW explode(array(1,2,3)) myTab as myCol limit 3;
[jira] Assigned: (HIVE-1854) Temporarily disable metastore tests for listPartitionsByFilter()
[ https://issues.apache.org/jira/browse/HIVE-1854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang reassigned HIVE-1854: Assignee: Paul Yang Temporarily disable metastore tests for listPartitionsByFilter() Key: HIVE-1854 URL: https://issues.apache.org/jira/browse/HIVE-1854 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.7.0 Reporter: Paul Yang Assignee: Paul Yang Priority: Minor Attachments: HIVE-1854.1.patch After the JDO downgrade in HIVE-1853, the tests for the disabled function listPartitionsByFilter() should be disabled as well.
[jira] Updated: (HIVE-1854) Temporarily disable metastore tests for listPartitionsByFilter()
[ https://issues.apache.org/jira/browse/HIVE-1854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-1854: Attachment: HIVE-1854.1.patch Temporarily disable metastore tests for listPartitionsByFilter() Key: HIVE-1854 URL: https://issues.apache.org/jira/browse/HIVE-1854 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.7.0 Reporter: Paul Yang Priority: Minor Attachments: HIVE-1854.1.patch After the JDO downgrade in HIVE-1853, the tests for the disabled function listPartitionsByFilter() should be disabled as well.
[jira] Updated: (HIVE-1854) Temporarily disable metastore tests for listPartitionsByFilter()
[ https://issues.apache.org/jira/browse/HIVE-1854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-1854: Status: Patch Available (was: Open) Temporarily disable metastore tests for listPartitionsByFilter() Key: HIVE-1854 URL: https://issues.apache.org/jira/browse/HIVE-1854 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.7.0 Reporter: Paul Yang Assignee: Paul Yang Priority: Minor Attachments: HIVE-1854.1.patch After the JDO downgrade in HIVE-1853, the tests for the disabled function listPartitionsByFilter() should be disabled as well.
[jira] Updated: (HIVE-1853) downgrade JDO version
[ https://issues.apache.org/jira/browse/HIVE-1853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-1853: Attachment: HIVE-1853.1.patch downgrade JDO version Key: HIVE-1853 URL: https://issues.apache.org/jira/browse/HIVE-1853 Project: Hive Issue Type: Bug Affects Versions: 0.7.0 Reporter: Namit Jain Assignee: Paul Yang Attachments: HIVE-1853.1.patch After HIVE-1609, we are seeing some table-not-found errors intermittently. We have a test case where 5 processes are concurrently issuing the same query (explain extended insert .. select from T), and once in a while we get an error that T is not found. When we revert the JDO version, the error is gone. We can investigate later to find the JDO bug, but for now this is a show-stopper for Facebook and needs to be reverted immediately. This also means that the filters will not be pushed to MySQL.
[jira] Updated: (HIVE-1853) downgrade JDO version
[ https://issues.apache.org/jira/browse/HIVE-1853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-1853: Affects Version/s: 0.7.0 Status: Patch Available (was: Open) downgrade JDO version Key: HIVE-1853 URL: https://issues.apache.org/jira/browse/HIVE-1853 Project: Hive Issue Type: Bug Affects Versions: 0.7.0 Reporter: Namit Jain Assignee: Paul Yang Attachments: HIVE-1853.1.patch After HIVE-1609, we are seeing some table-not-found errors intermittently. We have a test case where 5 processes are concurrently issuing the same query (explain extended insert .. select from T), and once in a while we get an error that T is not found. When we revert the JDO version, the error is gone. We can investigate later to find the JDO bug, but for now this is a show-stopper for Facebook and needs to be reverted immediately. This also means that the filters will not be pushed to MySQL.
[jira] Updated: (HIVE-1853) downgrade JDO version
[ https://issues.apache.org/jira/browse/HIVE-1853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-1853: Attachment: HIVE-1853.2.patch Regenerated downgrade JDO version Key: HIVE-1853 URL: https://issues.apache.org/jira/browse/HIVE-1853 Project: Hive Issue Type: Bug Affects Versions: 0.7.0 Reporter: Namit Jain Assignee: Paul Yang Attachments: HIVE-1853.1.patch, HIVE-1853.2.patch After HIVE-1609, we are seeing some table-not-found errors intermittently. We have a test case where 5 processes are concurrently issuing the same query (explain extended insert .. select from T), and once in a while we get an error that T is not found. When we revert the JDO version, the error is gone. We can investigate later to find the JDO bug, but for now this is a show-stopper for Facebook and needs to be reverted immediately. This also means that the filters will not be pushed to MySQL.
[jira] Commented: (HIVE-1853) downgrade JDO version
[ https://issues.apache.org/jira/browse/HIVE-1853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12972230#action_12972230 ] Paul Yang commented on HIVE-1853: Actually, I made listPartitionsByFilter() throw a runtime exception until this issue is resolved, because the feature is supposed to require a newer version of JDOQL to function properly, as stated in HIVE-1609. I'll confirm this again with Ajay. downgrade JDO version Key: HIVE-1853 URL: https://issues.apache.org/jira/browse/HIVE-1853 Project: Hive Issue Type: Bug Affects Versions: 0.7.0 Reporter: Namit Jain Assignee: Paul Yang Fix For: 0.7.0 Attachments: HIVE-1853.1.patch, HIVE-1853.2.patch After HIVE-1609, we are seeing some table-not-found errors intermittently. We have a test case where 5 processes are concurrently issuing the same query (explain extended insert .. select from T), and once in a while we get an error that T is not found. When we revert the JDO version, the error is gone. We can investigate later to find the JDO bug, but for now this is a show-stopper for Facebook and needs to be reverted immediately. This also means that the filters will not be pushed to MySQL.
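The stop-gap described in the comment above, where a disabled metastore call fails fast with a runtime exception, might look roughly like this. `PartitionFilterGuard`, its constructor flag and the exception message are hypothetical, and the arguments in `main` are arbitrary placeholder strings. This is not the code from the HIVE-1853 patch.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; not the code from the HIVE-1853 patch.
final class PartitionFilterGuard {

    // Assumed flag: false while the downgraded JDO version is in use.
    private final boolean filterSupported;

    PartitionFilterGuard(boolean filterSupported) {
        this.filterSupported = filterSupported;
    }

    // Fails fast with an unchecked exception while the feature is disabled.
    List<String> listPartitionsByFilter(String table, String filter) {
        if (!filterSupported) {
            throw new UnsupportedOperationException(
                "listPartitionsByFilter() is disabled until partition filtering is restored");
        }
        // Placeholder for the real lookup, which this sketch does not implement.
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        try {
            new PartitionFilterGuard(false).listPartitionsByFilter("T", "some-filter");
        } catch (UnsupportedOperationException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

Throwing instead of returning an empty list keeps callers from mistaking "feature unavailable" for "no matching partitions", which is also why the related tests in HIVE-1854 have to be disabled rather than left to fail.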