[jira] Commented: (HIVE-1779) Implement GenericUDF str_to_map
[ https://issues.apache.org/jira/browse/HIVE-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930668#action_12930668 ] Namit Jain commented on HIVE-1779: -- +1 Looks good. There were only very minor changes: I updated show_functions.q.out and added desc function to the test you added. I made the changes and am committing them. Will upload the new patch. Implement GenericUDF str_to_map --- Key: HIVE-1779 URL: https://issues.apache.org/jira/browse/HIVE-1779 Project: Hive Issue Type: New Feature Reporter: Siying Dong Assignee: Siying Dong Priority: Minor Attachments: HIVE-1779.1.patch People need a way to load their data into a map. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1779) Implement GenericUDF str_to_map
[ https://issues.apache.org/jira/browse/HIVE-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-1779: - Resolution: Fixed Fix Version/s: 0.7.0 Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) Committed. Thanks Siying Implement GenericUDF str_to_map --- Key: HIVE-1779 URL: https://issues.apache.org/jira/browse/HIVE-1779 Project: Hive Issue Type: New Feature Reporter: Siying Dong Assignee: Siying Dong Priority: Minor Fix For: 0.7.0 Attachments: HIVE-1779.1.patch People need a way to load their data into a map. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
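As a rough illustration of what a str_to_map-style function does, the plain-Java sketch below splits a string into key/value pairs. It is not the code from HIVE-1779.1.patch and does not use Hive's GenericUDF API; the class name StrToMapSketch, the method signature, and the two delimiter arguments are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper, not Hive code: turns "k1:v1,k2:v2" into a map.
final class StrToMapSketch {
    static Map<String, String> strToMap(String text, String pairDelim, String kvDelim) {
        Map<String, String> result = new LinkedHashMap<>();
        if (text == null || text.isEmpty()) {
            return result;
        }
        // Pattern.quote treats the delimiters as literal text, not as regexes.
        for (String pair : text.split(Pattern.quote(pairDelim))) {
            // Limit 2: only the first key/value delimiter splits the pair.
            String[] kv = pair.split(Pattern.quote(kvDelim), 2);
            // A pair without a key/value delimiter maps its key to null.
            result.put(kv[0], kv.length == 2 ? kv[1] : null);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(strToMap("a:1,b:2,c", ",", ":"));
    }
}
```

Mapping a key without a value to null is a design choice of this sketch only; the behaviour of the committed UDF should be checked against its tests.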
[jira] Commented: (HIVE-1743) Group-by to determine equals of Keys in reverse order
[ https://issues.apache.org/jira/browse/HIVE-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930671#action_12930671 ] Namit Jain commented on HIVE-1743: -- +1 running tests Group-by to determine equals of Keys in reverse order - Key: HIVE-1743 URL: https://issues.apache.org/jira/browse/HIVE-1743 Project: Hive Issue Type: Improvement Reporter: Siying Dong Assignee: Siying Dong Priority: Minor Attachments: HIVE-1743.1.patch When processing group-by, in reduce side, keys are ordered. Comparing equality of two keys can be more efficient in reverse order. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
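The idea in the description can be sketched in a few lines of Java. The assumption here (not stated in the issue) is that, because reduce-side keys arrive sorted, consecutive keys tend to share their leading fields and differ in the trailing ones, so scanning from the last field finds a mismatch sooner. The class and method names are hypothetical and this is not the code from HIVE-1743.1.patch.

```java
import java.util.Objects;

// Hypothetical sketch: equality check over multi-field keys, last field first.
final class ReverseKeyEquals {
    static boolean equalsReverse(Object[] a, Object[] b) {
        if (a.length != b.length) {
            return false;
        }
        // Walk from the last field to the first; for sorted input the
        // trailing fields are the ones most likely to differ.
        for (int i = a.length - 1; i >= 0; i--) {
            if (!Objects.equals(a[i], b[i])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Object[] previous = {"2008-04-08", "11", "key_1"};
        Object[] current = {"2008-04-08", "11", "key_2"};
        System.out.println(equalsReverse(previous, current));
    }
}
```

The result is the same as a forward comparison; only the number of field comparisons before the first mismatch changes.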
[jira] Commented: (HIVE-1501) when generating reentrant INSERT for index rebuild, quote identifiers using backticks
[ https://issues.apache.org/jira/browse/HIVE-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930677#action_12930677 ] John Sichi commented on HIVE-1501: -- +1 on the latest. Will commit when tests pass. when generating reentrant INSERT for index rebuild, quote identifiers using backticks - Key: HIVE-1501 URL: https://issues.apache.org/jira/browse/HIVE-1501 Project: Hive Issue Type: Bug Components: Indexing Affects Versions: 0.7.0 Reporter: John Sichi Assignee: Skye Berghel Fix For: 0.7.0 Attachments: 1501.patch, 1501_new_tests.patch, 1501_with_tests.patch, HIVE-1501.4.patch, HIVE-1501.5.patch, HIVE-1501.6.patch Yongqiang, you mentioned that you weren't able to do this due to SORT BY not accepting them. The SORT BY is gone now as of HIVE-1494 (and SORT BY needs to be fixed anyway). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
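For readers unfamiliar with the change, quoting identifiers with backticks when assembling a statement string might look like the sketch below. The helper names and the sample statement are hypothetical; this is not the index-rebuild code from the attached patches, and it assumes identifiers do not themselves contain backticks.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch of backtick-quoting identifiers in generated HiveQL.
final class BacktickQuoting {
    // Assumes the identifier contains no backtick characters.
    static String quote(String identifier) {
        return "`" + identifier + "`";
    }

    static String quotedColumnList(List<String> columns) {
        return columns.stream()
                .map(BacktickQuoting::quote)
                .collect(Collectors.joining(", "));
    }

    public static void main(String[] args) {
        // "order" would clash with a keyword if left unquoted.
        List<String> columns = Arrays.asList("key", "order");
        System.out.println("SELECT " + quotedColumnList(columns) + " FROM " + quote("src"));
    }
}
```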
[jira] Created: (HIVE-1781) outputs not populated for dynamic partitions at compile time
outputs not populated for dynamic partitions at compile time Key: HIVE-1781 URL: https://issues.apache.org/jira/browse/HIVE-1781 Project: Hive Issue Type: Bug Reporter: Namit Jain
POSTHOOK: query: create table tstsrcpart like srcpart
POSTHOOK: type: CREATETABLE
POSTHOOK: Output: defa...@tstsrcpart
PREHOOK: query: from srcpart insert overwrite table tstsrcpart partition (ds, hr) select key, value, ds, hr where ds = '2008-04-08'
PREHOOK: type: QUERY
PREHOOK: Input: defa...@srcpart@ds=2008-04-08/hr=11
PREHOOK: Input: defa...@srcpart@ds=2008-04-08/hr=12
POSTHOOK: query: from srcpart
As is evident from the above, the outputs are not populated at all at compile time. This may create a problem for the many components that depend on outputs: locking, authorization, etc. However, some other components may need the exact set of outputs (for example, the internal deployment at Facebook has a replication hook that needs the exact set of outputs). It may be a good idea to extend WriteEntity to include a flag that indicates whether the output is complete, and then the hook can look at that flag if needed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
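The proposed flag could be pictured roughly as follows. OutputEntity is a hypothetical stand-in, not Hive's WriteEntity, and the field and method names are assumptions: a table written through dynamic partitions is reported at compile time but marked incomplete, so a hook that needs exact outputs can tell the difference.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of an output entity carrying a "complete" flag.
final class OutputCompletenessSketch {
    /** Stand-in for an output entity; not Hive's WriteEntity. */
    static final class OutputEntity {
        final String name;
        final boolean complete;

        OutputEntity(String name, boolean complete) {
            this.name = name;
            this.complete = complete;
        }
    }

    /** Names of outputs that are fully known, as an exact-output hook would want. */
    static List<String> exactOutputs(List<OutputEntity> outputs) {
        List<String> names = new ArrayList<>();
        for (OutputEntity entity : outputs) {
            if (entity.complete) {
                names.add(entity.name);
            }
        }
        return names;
    }

    /** True when at least one output is only partially known at compile time. */
    static boolean hasIncompleteOutput(List<OutputEntity> outputs) {
        for (OutputEntity entity : outputs) {
            if (!entity.complete) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The table is known at compile time; its dynamic partitions are not.
        List<OutputEntity> outputs = Arrays.asList(new OutputEntity("tstsrcpart", false));
        System.out.println(hasIncompleteOutput(outputs));
    }
}
```

With such a flag, components like locking could still act on the incomplete entry, while a replication-style hook could wait until the exact outputs are available.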
Re: jmx metrics for metastore server
I'm not sure what % of people would be using a Metastore server as opposed to just using a local metastore - I know we have a usecase where we do want a centralized metastore server because it'd be the common metastore across multiple projects/processes and we'd like to add in monitoring instrumentation for it. That said, if we were to use a -D parameter to determine the port, and apply jmx instrumentation in the Metastore server, even local clients should be able to make use of that change. Also, once we have a way to add in instrumentation, I suspect we should be able to add in instrumentation from a query level as well if needed. -Sushanth
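A minimal sketch of the kind of instrumentation discussed here, using only the standard javax.management API. The RequestStats bean, its attribute, and the object names are hypothetical and are not part of the Metastore server. The thread does not name the -D parameter; one standard JVM option for exposing a JMX port is -Dcom.sun.management.jmxremote.port, mentioned here only as an assumption about what is meant.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical sketch: exposing a request counter as a standard MBean.
final class MetastoreJmxSketch {
    /** Management interface; the "MBean" suffix is the standard MBean naming rule. */
    public interface RequestStatsMBean {
        long getRequestCount();
    }

    public static final class RequestStats implements RequestStatsMBean {
        private final AtomicLong requests = new AtomicLong();

        public void recordRequest() {
            requests.incrementAndGet();
        }

        @Override
        public long getRequestCount() {
            return requests.get();
        }
    }

    /** Registers a new counter with the platform MBean server under the given name. */
    static RequestStats register(String objectName) {
        try {
            RequestStats stats = new RequestStats();
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            server.registerMBean(stats, new ObjectName(objectName));
            return stats;
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reads the counter back through JMX, as a monitoring client would. */
    static long readRequestCount(String objectName) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            return (Long) server.getAttribute(new ObjectName(objectName), "RequestCount");
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String name = "sketch.metastore:type=RequestStats";
        RequestStats stats = register(name);
        stats.recordRequest();
        System.out.println(readRequestCount(name));
    }
}
```

Because the bean is registered in-process, a local client embedding the same code would expose the same attribute.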
[jira] Commented: (HIVE-78) Authorization infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930820#action_12930820 ] Pradeep Kamath commented on HIVE-78: Will there be a way to turn off authorization (through some configuration property), or a way to allow all access, or is the authorization implementation going to be pluggable? Since Howl is looking at a different authorization model based on DFS permissions, one of these options would be needed for Howl. Authorization infrastructure for Hive - Key: HIVE-78 URL: https://issues.apache.org/jira/browse/HIVE-78 Project: Hive Issue Type: New Feature Components: Metastore, Query Processor, Server Infrastructure Reporter: Ashish Thusoo Assignee: He Yongqiang Attachments: createuser-v1.patch, hive-78-metadata-v1.patch, hive-78-syntax-v1.patch, HIVE-78.1.nothrift.patch, HIVE-78.1.thrift.patch, hive-78.diff Allow hive to integrate with existing user repositories for authentication and authorization information. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-78) Authorization infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930822#action_12930822 ] He Yongqiang commented on HIVE-78: -- "Will there be a way to turn off authorization (through some configuration property)?" Yes. "Is authorization implementation going to be pluggable?" Yes. This is exactly what we wanted. I think Howl can just plug in its own authorization implementation. Authorization infrastructure for Hive - Key: HIVE-78 URL: https://issues.apache.org/jira/browse/HIVE-78 Project: Hive Issue Type: New Feature Components: Metastore, Query Processor, Server Infrastructure Reporter: Ashish Thusoo Assignee: He Yongqiang Attachments: createuser-v1.patch, hive-78-metadata-v1.patch, hive-78-syntax-v1.patch, HIVE-78.1.nothrift.patch, HIVE-78.1.thrift.patch, hive-78.diff Allow hive to integrate with existing user repositories for authentication and authorization information. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
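The two answers above (authorization can be switched off, and the implementation is pluggable) can be sketched as a small interface plus a loader. All names here are hypothetical and do not come from the HIVE-78 patches; the "enabled" flag and the provider class name stand in for whatever configuration properties the final patch defines.

```java
// Hypothetical sketch of a switchable, pluggable authorization layer.
final class PluggableAuthSketch {
    interface AuthorizationProvider {
        boolean authorize(String user, String privilege, String object);
    }

    /** Used when authorization is turned off: every request is allowed. */
    static final class AllowAllProvider implements AuthorizationProvider {
        @Override
        public boolean authorize(String user, String privilege, String object) {
            return true;
        }
    }

    /** Example plug-in; a project such as Howl would supply its own class. */
    static final class DenyAllProvider implements AuthorizationProvider {
        @Override
        public boolean authorize(String user, String privilege, String object) {
            return false;
        }
    }

    /** Picks the provider from two (assumed) configuration values. */
    static AuthorizationProvider load(boolean enabled, String providerClassName) {
        if (!enabled) {
            return new AllowAllProvider();
        }
        try {
            Class<?> cls = Class.forName(providerClassName);
            return (AuthorizationProvider) cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot load provider " + providerClassName, e);
        }
    }

    public static void main(String[] args) {
        AuthorizationProvider provider = load(true, DenyAllProvider.class.getName());
        System.out.println(provider.authorize("alice", "SELECT", "srcpart"));
    }
}
```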
[jira] Updated: (HIVE-1501) when generating reentrant INSERT for index rebuild, quote identifiers using backticks
[ https://issues.apache.org/jira/browse/HIVE-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Sichi updated HIVE-1501: - Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) Committed. Thanks Skye! when generating reentrant INSERT for index rebuild, quote identifiers using backticks - Key: HIVE-1501 URL: https://issues.apache.org/jira/browse/HIVE-1501 Project: Hive Issue Type: Bug Components: Indexing Affects Versions: 0.7.0 Reporter: John Sichi Assignee: Skye Berghel Fix For: 0.7.0 Attachments: 1501.patch, 1501_new_tests.patch, 1501_with_tests.patch, HIVE-1501.4.patch, HIVE-1501.5.patch, HIVE-1501.6.patch Yongqiang, you mentioned that you weren't able to do this due to SORT BY not accepting them. The SORT BY is gone now as of HIVE-1494 (and SORT BY needs to be fixed anyway). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1754) Remove JDBM component from Map Join
[ https://issues.apache.org/jira/browse/HIVE-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liyin Tang updated HIVE-1754: - Attachment: hive-1754_7.patch Changed the code style according to the reviewer comments. Remove JDBM component from Map Join --- Key: HIVE-1754 URL: https://issues.apache.org/jira/browse/HIVE-1754 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.6.0, 0.7.0 Reporter: Liyin Tang Assignee: Liyin Tang Fix For: 0.7.0 Attachments: Hive-1754.patch, Hive-1754_2.patch, Hive-1754_3.patch, hive-1754_4.patch, hive-1754_5.patch, hive-1754_7.patch Right now, JDBM is the major performance bottleneck. As the small table grows, the PUT and GET operations take most of the execution time. Map Join is designed to load the data of the small table into memory. If the data is too large to hold in memory, then there is no need to use the map join strategy. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1780) Typo in hive-default.xml
[ https://issues.apache.org/jira/browse/HIVE-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] YoungWoo Kim updated HIVE-1780: --- Status: Patch Available (was: Open) patch for HIVE-1780 Typo in hive-default.xml Key: HIVE-1780 URL: https://issues.apache.org/jira/browse/HIVE-1780 Project: Hive Issue Type: Bug Components: Configuration Reporter: YoungWoo Kim Priority: Trivial Fix For: 0.7.0 Attachments: HIVE-1780.patch 'CombineHiveInputFormat' is misspelt in hive-default.xml as 'CombinedHiveInputFormat'; it should be 'CombineHiveInputFormat'. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1734) Implement map_keys() and map_values() UDFs
[ https://issues.apache.org/jira/browse/HIVE-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eldon Stegall updated HIVE-1734: Affects Version/s: 0.6.0 Status: Patch Available (was: Open) First pass at a patch. Should be easily massagable into your source tree. Implement map_keys() and map_values() UDFs -- Key: HIVE-1734 URL: https://issues.apache.org/jira/browse/HIVE-1734 Project: Hive Issue Type: New Feature Components: UDF Affects Versions: 0.6.0 Reporter: Carl Steinbach Assignee: Carl Steinbach Implement the following UDFs: array map_keys(map) and array map_values(map) map_keys() takes a map as input and returns an array consisting of the key values in the supplied map. Similarly, map_values() takes a map as input and returns an array containing the map value fields. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
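The semantics described for map_keys() and map_values() can be shown in plain Java. This sketch is not the attached MapKeys.java or MapValues.java and does not use Hive's UDF interfaces; the class and method names are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the map_keys()/map_values() semantics.
final class MapKeysValuesSketch {
    /** Returns the keys of the map as a list (the "array" in the description). */
    static <K, V> List<K> mapKeys(Map<K, V> map) {
        return new ArrayList<>(map.keySet());
    }

    /** Returns the values of the map as a list. */
    static <K, V> List<V> mapValues(Map<K, V> map) {
        return new ArrayList<>(map.values());
    }

    public static void main(String[] args) {
        Map<String, Integer> m = new LinkedHashMap<>();
        m.put("a", 1);
        m.put("b", 2);
        System.out.println(mapKeys(m) + " " + mapValues(m));
    }
}
```

The element order of the results follows the iteration order of the input map, which is why the usage example uses a LinkedHashMap.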
[jira] Updated: (HIVE-1734) Implement map_keys() and map_values() UDFs
[ https://issues.apache.org/jira/browse/HIVE-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eldon Stegall updated HIVE-1734: Attachment: MapKeys.java MapValues.java Hope this works for you. Implement map_keys() and map_values() UDFs -- Key: HIVE-1734 URL: https://issues.apache.org/jira/browse/HIVE-1734 Project: Hive Issue Type: New Feature Components: UDF Affects Versions: 0.6.0 Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: MapKeys.java, MapValues.java Implement the following UDFs: array map_keys(map) and array map_values(map) map_keys() takes a map as input and returns an array consisting of the key values in the supplied map. Similarly, map_values() takes a map as input and returns an array containing the map value fields. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1497) support COMMENT clause on CREATE INDEX, and add new command for SHOW INDEXES
[ https://issues.apache.org/jira/browse/HIVE-1497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Russell Melick updated HIVE-1497: - Attachment: HIVE-1497.6.patch * Removed commented out code * Added a test case with index comment * Added FORMATTED keyword to optionally show column headers * Added test case for compound index Yongqiang, I'm not sure why we want to take out the line: setFetchTask(createFetchTask(showIndexesDesc.getSchema())); The other functions in the SemanticAnalyzer have similar function calls. John, we get the column headers directly from the schema we set inside of ShowIndexesDesc, so it would be difficult to pluralize col_name. Do you know what other show commands do? support COMMENT clause on CREATE INDEX, and add new command for SHOW INDEXES Key: HIVE-1497 URL: https://issues.apache.org/jira/browse/HIVE-1497 Project: Hive Issue Type: Improvement Components: Indexing Affects Versions: 0.7.0 Reporter: John Sichi Assignee: Russell Melick Fix For: 0.7.0 Attachments: HIVE-1497.4.patch, HIVE-1497.5.patch, HIVE-1497.6.patch, hive-1497.p1.patch, hive-1497.p2.patch, hive-1497.p3.patch We need to work out the syntax for SHOW/DESCRIBE, taking partitioning into account. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1497) support COMMENT clause on CREATE INDEX, and add new command for SHOW INDEXES
[ https://issues.apache.org/jira/browse/HIVE-1497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Russell Melick updated HIVE-1497: - Status: Patch Available (was: Open) support COMMENT clause on CREATE INDEX, and add new command for SHOW INDEXES Key: HIVE-1497 URL: https://issues.apache.org/jira/browse/HIVE-1497 Project: Hive Issue Type: Improvement Components: Indexing Affects Versions: 0.7.0 Reporter: John Sichi Assignee: Russell Melick Fix For: 0.7.0 Attachments: HIVE-1497.4.patch, HIVE-1497.5.patch, HIVE-1497.6.patch, hive-1497.p1.patch, hive-1497.p2.patch, hive-1497.p3.patch We need to work out the syntax for SHOW/DESCRIBE, taking partitioning into account. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1743) Group-by to determine equals of Keys in reverse order
[ https://issues.apache.org/jira/browse/HIVE-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-1743: - Resolution: Fixed Fix Version/s: 0.7.0 Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) Committed. Thanks Siying Group-by to determine equals of Keys in reverse order - Key: HIVE-1743 URL: https://issues.apache.org/jira/browse/HIVE-1743 Project: Hive Issue Type: Improvement Reporter: Siying Dong Assignee: Siying Dong Priority: Minor Fix For: 0.7.0 Attachments: HIVE-1743.1.patch When processing group-by, in reduce side, keys are ordered. Comparing equality of two keys can be more efficient in reverse order. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1734) Implement map_keys() and map_values() UDFs
[ https://issues.apache.org/jira/browse/HIVE-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930921#action_12930921 ] Namit Jain commented on HIVE-1734: -- Eldon, you need to run all the tests and modify FunctionRegistry to add a corresponding UDF. You also need to add unit tests for these functions. Also, it would be simpler to attach a single patch containing all the changes instead of separate files. Implement map_keys() and map_values() UDFs -- Key: HIVE-1734 URL: https://issues.apache.org/jira/browse/HIVE-1734 Project: Hive Issue Type: New Feature Components: UDF Affects Versions: 0.6.0 Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: MapKeys.java, MapValues.java Implement the following UDFs: array map_keys(map) and array map_values(map) map_keys() takes a map as input and returns an array consisting of the key values in the supplied map. Similarly, map_values() takes a map as input and returns an array containing the map value fields. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HIVE-1734) Implement map_keys() and map_values() UDFs
[ https://issues.apache.org/jira/browse/HIVE-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain reassigned HIVE-1734: Assignee: Eldon Stegall (was: Carl Steinbach) Implement map_keys() and map_values() UDFs -- Key: HIVE-1734 URL: https://issues.apache.org/jira/browse/HIVE-1734 Project: Hive Issue Type: New Feature Components: UDF Affects Versions: 0.6.0 Reporter: Carl Steinbach Assignee: Eldon Stegall Attachments: MapKeys.java, MapValues.java Implement the following UDFs: array map_keys(map) and array map_values(map) map_keys() takes a map as input and returns an array consisting of the key values in the supplied map. Similarly, map_values() takes a map as input and returns an array containing the map value fields. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1734) Implement map_keys() and map_values() UDFs
[ https://issues.apache.org/jira/browse/HIVE-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-1734: - Status: Open (was: Patch Available) Implement map_keys() and map_values() UDFs -- Key: HIVE-1734 URL: https://issues.apache.org/jira/browse/HIVE-1734 Project: Hive Issue Type: New Feature Components: UDF Affects Versions: 0.6.0 Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: MapKeys.java, MapValues.java Implement the following UDFs: array map_keys(map) and array map_values(map) map_keys() takes a map as input and returns an array consisting of the key values in the supplied map. Similarly, map_values() takes a map as input and returns an array containing the map value fields. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1780) Typo in hive-default.xml
[ https://issues.apache.org/jira/browse/HIVE-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-1780: - Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) Committed. Thanks Youngwoo Typo in hive-default.xml Key: HIVE-1780 URL: https://issues.apache.org/jira/browse/HIVE-1780 Project: Hive Issue Type: Bug Components: Configuration Reporter: YoungWoo Kim Assignee: YoungWoo Kim Priority: Trivial Fix For: 0.7.0 Attachments: HIVE-1780.patch 'CombineHiveInputFormat' is misspelt in hive-default.xml as 'CombinedHiveInputFormat'; it should be 'CombineHiveInputFormat'. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1758) optimize group by hash map memory
[ https://issues.apache.org/jira/browse/HIVE-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-1758: - Resolution: Fixed Fix Version/s: 0.7.0 Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) Committed. Thanks Siying optimize group by hash map memory - Key: HIVE-1758 URL: https://issues.apache.org/jira/browse/HIVE-1758 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Namit Jain Assignee: Siying Dong Fix For: 0.7.0 Attachments: HIVE-1758.1.patch
Group By map side's hash map consumes a lot of memory, thereby decreasing its effectiveness. We can use some of the optimizations from map-join to reduce the memory footprint:
  class KeyWrapper {
    int hashcode;
    ArrayList<Object> keys;
    // decide whether this is already in hashmap (keys in hashmap are deepcopied
    // version, and we need to use 'currentKeyObjectInspector').
    boolean copy = false;
    // ...
  }
1. Change keys to an array.
2. Optimize the scenario when keys is of a small size (1, 2), etc.
Let us start profiling it and take it from there. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
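Point 1 of the description (holding the keys in an array rather than an ArrayList) might look like the sketch below. ArrayKeyWrapper is a hypothetical class, not the KeyWrapper from HIVE-1758.1.patch; caching the hash code in the constructor is an additional assumption of this example, and the special-casing of one- or two-field keys from point 2 is not shown.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a group-by key backed by a plain array.
final class ArrayKeyWrapper {
    private final Object[] keys;
    private final int hashcode;

    ArrayKeyWrapper(Object... keys) {
        // Copy the fields so later changes by the caller do not affect the key.
        this.keys = keys.clone();
        // Compute the hash once; hash map lookups reuse the cached value.
        this.hashcode = Arrays.hashCode(this.keys);
    }

    @Override
    public int hashCode() {
        return hashcode;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof ArrayKeyWrapper)) {
            return false;
        }
        ArrayKeyWrapper that = (ArrayKeyWrapper) other;
        // Cheap hash comparison first, then field-by-field equality.
        return hashcode == that.hashcode && Arrays.equals(keys, that.keys);
    }

    public static void main(String[] args) {
        Map<ArrayKeyWrapper, Long> counts = new HashMap<>();
        counts.merge(new ArrayKeyWrapper("2008-04-08", 11), 1L, Long::sum);
        counts.merge(new ArrayKeyWrapper("2008-04-08", 11), 1L, Long::sum);
        System.out.println(counts);
    }
}
```

An array avoids the extra list object and its spare capacity per key; whether that is the dominant saving is something the profiling mentioned above would have to confirm.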
[jira] Updated: (HIVE-78) Authorization infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Yongqiang updated HIVE-78: - Attachment: HIVE-78.2.thrift.patch HIVE-78.2.nothrift.patch Attached 2 new draft patches. There may be some bugs since I only did a few simple tests, but I think they are ready for early review. HIVE-78.2.nothrift.patch does not include the thrift changes. HIVE-78.2.thrift.patch is a complete patch. Authorization infrastructure for Hive - Key: HIVE-78 URL: https://issues.apache.org/jira/browse/HIVE-78 Project: Hive Issue Type: New Feature Components: Metastore, Query Processor, Server Infrastructure Reporter: Ashish Thusoo Assignee: He Yongqiang Attachments: createuser-v1.patch, hive-78-metadata-v1.patch, hive-78-syntax-v1.patch, HIVE-78.1.nothrift.patch, HIVE-78.1.thrift.patch, HIVE-78.2.nothrift.patch, HIVE-78.2.thrift.patch, hive-78.diff Allow hive to integrate with existing user repositories for authentication and authorization information. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.