[jira] [Created] (HIVE-3790) UDF to find DATE OFFSET for a given date or timestamp
Jithin John created HIVE-3790:

Summary: UDF to find DATE OFFSET for a given date or timestamp
Key: HIVE-3790
URL: https://issues.apache.org/jira/browse/HIVE-3790
Project: Hive
Issue Type: New Feature
Components: UDF
Affects Versions: 0.9.0
Reporter: Jithin John

Current releases of Hive lack a generic function that finds the date offset from a date / timestamp. Current releases have date_add(date) and date_sub(date), which allow the user to add or subtract days only; year or month cannot be used as the unit.

The function DATE_OFFSET(date,offset,unit) returns the date offset value from start_date according to the unit. Here the unit can be year, month or day. The function could be used for date range queries and is more flexible than the existing functions.

Functionality:

Function Name: DATE_OFFSET(date,offset,unit)
Adds an offset value to the unit part of the date. Returns the date in the format yyyy-MM-dd.

Example:
hive> select date_offset('2009-07-29', -1 ,'MONTH' ) FROM src LIMIT 1
-> 2009-06-29

Usage:

Case: To calculate the expiry date of an item from its manufacturing date

Table: ITEM_TAB

Manufacturing_date|item id|store id|value|unit|price
2012-12-01|110001|00003|0.99|1.00|0.99
2012-12-02|110001|00008|0.99|0.00|0.00
2012-12-03|110001|00009|0.99|0.00|0.00
2012-12-04|110001|001112002|0.99|0.00|0.00
2012-12-05|110001|001112003|0.99|0.00|0.00
2012-12-06|110001|001112006|0.99|1.00|0.99
2012-12-07|110001|001112007|0.99|0.00|0.00
2012-12-08|110001|001112008|0.99|0.00|0.00
2012-12-09|110001|001112009|0.99|0.00|0.00
2012-12-10|110001|001112010|0.99|0.00|0.00
2012-12-11|110001|001113003|0.99|0.00|0.00
2012-12-12|110001|001113006|0.99|0.00|0.00
2012-12-13|110001|001113008|0.99|0.00|0.00
2012-12-14|110001|001113010|0.99|0.00|0.00
2012-12-15|110001|001114002|0.99|0.00|0.00
2012-12-16|110001|001114004|0.99|1.00|0.99
2012-12-17|110001|001114005|0.99|0.00|0.00
2012-12-18|110001|001121004|0.99|0.00|0.00

QUERY:
select man_date , date_offset(man_date ,5 ,'year') as expiry_date from item_tab;

RESULT:
2012-12-01 2017-12-01
2012-12-02 2017-12-02
2012-12-03 2017-12-03
2012-12-04 2017-12-04
2012-12-05 2017-12-05
2012-12-06 2017-12-06
2012-12-07 2017-12-07
2012-12-08 2017-12-08
2012-12-09 2017-12-09
2012-12-10 2017-12-10
2012-12-11 2017-12-11
2012-12-12 2017-12-12
2012-12-13 2017-12-13
2012-12-14 2017-12-14
2012-12-15 2017-12-15
2012-12-16 2017-12-16
2012-12-17 2017-12-17
2012-12-18 2017-12-18

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
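The proposed semantics can be sketched outside Hive to make the expected results concrete. The following is a minimal Python illustration, not the UDF from any patch: the function body, the accepted input types, and in particular the rule of clamping to the last day of the target month (which covers the leap-year February case) are assumptions that the issue text does not specify.

```python
import calendar
from datetime import date, datetime, timedelta


def date_offset(start, offset, unit):
    """Return `start` shifted by `offset` units as a yyyy-MM-dd string.

    Illustrative sketch only; clamping to month end is an assumption.
    """
    if isinstance(start, str):
        # Accept 'yyyy-MM-dd' or 'yyyy-MM-dd HH:mm:ss'; only the date part is used.
        start = datetime.strptime(start[:10], "%Y-%m-%d").date()
    elif isinstance(start, datetime):
        start = start.date()

    unit = unit.upper()
    if unit == "DAY":
        return (start + timedelta(days=offset)).isoformat()
    if unit == "MONTH":
        months = offset
    elif unit == "YEAR":
        months = 12 * offset
    else:
        raise ValueError("unit must be YEAR, MONTH or DAY")

    # Shift by whole months, then clamp the day to the target month's length.
    total = start.year * 12 + (start.month - 1) + months
    year, month = divmod(total, 12)
    month += 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day).isoformat()


print(date_offset("2009-07-29", -1, "MONTH"))  # 2009-06-29
print(date_offset("2012-12-01", 5, "year"))    # 2017-12-01
print(date_offset("2012-02-29", 1, "YEAR"))    # 2013-02-28 (clamped)
```

The first two calls reproduce the example and the first result row from the issue; the third shows what the clamping assumption does with a leap day.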
[jira] [Updated] (HIVE-3790) UDF to find DATE OFFSET for a given date or timestamp
[ https://issues.apache.org/jira/browse/HIVE-3790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jithin John updated HIVE-3790:

Description: [...] Function Name: DATE_OFFSET(date,offset,unit) Add an offset value to the unit part of the date/timestamp. [...]

was: [...] Function Name: DATE_OFFSET(date,offset,unit) Add an offset value to the unit part of the date. [...]
[jira] [Commented] (HIVE-3790) UDF to find DATE OFFSET for a given date or timestamp
[ https://issues.apache.org/jira/browse/HIVE-3790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528836#comment-13528836 ]

Georgy B Abraham commented on HIVE-3790:

The UDF would be useful for manipulating the day, month or year of any given date by the offset passed, e.g. date_offset('2009-07-29', -1 ,'MONTH' ) would give us the date 2009-06-29. The leap-year February scenario should be considered too.
[jira] [Updated] (HIVE-3790) UDF to introduce an OFFSET(day,month or year) for a given date or timestamp
[ https://issues.apache.org/jira/browse/HIVE-3790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Georgy B Abraham updated HIVE-3790:

Summary: UDF to introduce an OFFSET(day,month or year) for a given date or timestamp (was: UDF to find DATE OFFSET for a given date or timestamp)
[jira] [Commented] (HIVE-2439) Upgrade antlr version to 3.4
[ https://issues.apache.org/jira/browse/HIVE-2439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528855#comment-13528855 ]

Thiruvel Thirumoolan commented on HIVE-2439:

Looks like Phabricator is on version 5 and arc is on version 6, so I am unable to post the patch on Phabricator. Phabricator doesn't seem to have any branches, so it is difficult to get it from one of their trees before the version bump. Will try posting in a couple of days.

Upgrade antlr version to 3.4
Key: HIVE-2439
URL: https://issues.apache.org/jira/browse/HIVE-2439
Project: Hive
Issue Type: Improvement
Affects Versions: 0.8.0
Reporter: Ashutosh Chauhan
Attachments: HIVE-2439_branch9_2.patch, HIVE-2439_branch9_3.patch, HIVE-2439_branch9.patch, hive-2439_incomplete.patch, HIVE-2439_trunk.patch

Upgrade antlr version to 3.4
[jira] [Commented] (HIVE-3768) Document JDBC client configuration for secure clusters
[ https://issues.apache.org/jira/browse/HIVE-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528861#comment-13528861 ]

Lefty Leverenz commented on HIVE-3768:

The patch converts the wiki page [Hive JDBC Interface|https://cwiki.apache.org/confluence/display/Hive/HiveJDBCInterface] to xml and adds a one-sentence section for secure cluster setup:

* Hive JDBC Driver
** Integration with Pentaho
** Integration with SQuirrel SQL Client
* Hive JDBC Client
** JDBC Client Setup for a Secure Cluster
\\ To configure Hive on a secure cluster, add the directory containing hive-site.xml to the CLASSPATH of the JDBC client.

Document JDBC client configuration for secure clusters
Key: HIVE-3768
URL: https://issues.apache.org/jira/browse/HIVE-3768
Project: Hive
Issue Type: Bug
Components: Documentation
Affects Versions: 0.9.0
Reporter: Lefty Leverenz
Assignee: Lefty Leverenz
Fix For: 0.10.0
Attachments: HIVE-3768.1.patch

Document the JDBC client configuration required for starting Hive on a secure cluster.
Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false #226
See https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/226/

--
[...truncated 9916 lines...]

compile-test:
[echo] Project: serde
[javac] Compiling 26 source files to https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/226/artifact/hive/build/serde/test/classes
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.

create-dirs:
[echo] Project: service
[copy] Warning: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/service/src/test/resources does not exist.

init:
[echo] Project: service

ivy-init-settings:
[echo] Project: service

ivy-resolve:
[echo] Project: service
[ivy:resolve] :: loading settings :: file = https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/ivy/ivysettings.xml
[ivy:report] Processing https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/226/artifact/hive/build/ivy/resolution-cache/org.apache.hive-hive-service-default.xml to https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/226/artifact/hive/build/ivy/report/org.apache.hive-hive-service-default.html

ivy-retrieve:
[echo] Project: service

compile:
[echo] Project: service

ivy-resolve-test:
[echo] Project: service

ivy-retrieve-test:
[echo] Project: service

compile-test:
[echo] Project: service
[javac] Compiling 2 source files to https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/226/artifact/hive/build/service/test/classes

test:
[echo] Project: hive

test-shims:
[echo] Project: hive

test-conditions:
[echo] Project: shims

gen-test:
[echo] Project: shims

create-dirs:
[echo] Project: shims
[copy] Warning: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/shims/src/test/resources does not exist.

init:
[echo] Project: shims

ivy-init-settings:
[echo] Project: shims

ivy-resolve:
[echo] Project: shims
[ivy:resolve] :: loading settings :: file = https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/ivy/ivysettings.xml
[ivy:report] Processing https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/226/artifact/hive/build/ivy/resolution-cache/org.apache.hive-hive-shims-default.xml to https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/226/artifact/hive/build/ivy/report/org.apache.hive-hive-shims-default.html

ivy-retrieve:
[echo] Project: shims

compile:
[echo] Project: shims
[echo] Building shims 0.20

build_shims:
[echo] Project: shims
[echo] Compiling https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/shims/src/common/java;/home/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/shims/src/0.20/java against hadoop 0.20.2 (https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/226/artifact/hive/build/hadoopcore/hadoop-0.20.2)

ivy-init-settings:
[echo] Project: shims

ivy-resolve-hadoop-shim:
[echo] Project: shims
[ivy:resolve] :: loading settings :: file = https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/ivy/ivysettings.xml

ivy-retrieve-hadoop-shim:
[echo] Project: shims
[echo] Building shims 0.20S

build_shims:
[echo] Project: shims
[echo] Compiling https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/shims/src/common/java;/home/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/shims/src/common-secure/java;/home/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/shims/src/0.20S/java against hadoop 1.0.0 (https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/226/artifact/hive/build/hadoopcore/hadoop-1.0.0)

ivy-init-settings:
[echo] Project: shims

ivy-resolve-hadoop-shim:
[echo] Project: shims
[ivy:resolve] :: loading settings :: file = https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/ivy/ivysettings.xml

ivy-retrieve-hadoop-shim:
[echo] Project: shims
[echo] Building shims 0.23

build_shims:
[echo] Project: shims
[echo] Compiling https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/shims/src/common/java;/home/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/shims/src/common-secure/java;/home/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/shims/src/0.23/java against hadoop 0.23.3 (https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/226/artifact/hive/build/hadoopcore/hadoop-0.23.3)
[jira] [Resolved] (HIVE-887) Allow SELECT col without a mapreduce job
[ https://issues.apache.org/jira/browse/HIVE-887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan resolved HIVE-887.

Resolution: Fixed
Fix Version/s: 0.10.0
Release Note: Set hive-conf hive.fetch.task.conversion to more to make use of this feature. Turned-off by default.

Most of this got implemented in HIVE-2925

Allow SELECT col without a mapreduce job
Key: HIVE-887
URL: https://issues.apache.org/jira/browse/HIVE-887
Project: Hive
Issue Type: New Feature
Environment: All
Reporter: Eric Sun
Assignee: Ning Zhang
Fix For: 0.10.0

I often find myself needing to take a quick look at a particular column of a Hive table. I usually do this by doing a
SELECT * from table LIMIT 20;
from the CLI. Doing this is pretty fast since it doesn't require a mapreduce job. However, it's tough to examine just 1 or 2 columns when the table is very wide. So, I might do
SELECT col from table LIMIT 20;
but it's much slower since it requires a map-reduce. It'd be really convenient if a map-reduce wasn't necessary.

Currently a good workaround is to do
hive -e select * from table | cut --key=n
but it'd be more convenient if it were built in since it alleviates the need for column counting.
[jira] [Commented] (HIVE-2935) Implement HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-2935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529113#comment-13529113 ]

Rob Weltman commented on HIVE-2935:

[~namit] - The JIRA for the core patch part is https://issues.apache.org/jira/browse/HIVE-3785

Implement HiveServer2
Key: HIVE-2935
URL: https://issues.apache.org/jira/browse/HIVE-2935
Project: Hive
Issue Type: New Feature
Components: Server Infrastructure
Reporter: Carl Steinbach
Assignee: Carl Steinbach
Labels: HiveServer2
Attachments: beelinepositive.tar.gz, HIVE-2935.1.notest.patch.txt, HIVE-2935.2.notest.patch.txt, HIVE-2935.2.nothrift.patch.txt, HS2-changed-files-only.patch, HS2-with-thrift-patch-rebased.patch
[jira] [Created] (HIVE-3791) starting hive with --auxpath is not same as creating auxlib
Sudhanshu Arora created HIVE-3791:

Summary: starting hive with --auxpath is not same as creating auxlib
Key: HIVE-3791
URL: https://issues.apache.org/jira/browse/HIVE-3791
Project: Hive
Issue Type: Bug
Components: CLI
Affects Versions: 0.9.0
Reporter: Sudhanshu Arora

If I give two jars in the hive class path, say (jar x, jar y), as
hive --auxpath jarx:jary
and issue the query "select * from test", it works fine. However, if I run "select x from test", which launches a map reduce job, I get an error saying
java.io.FileNotFoundException: File file:x.jar:y.jar does not exist.
If I create an auxlib directory under hive home and put the jars in that, then again everything works fine.

--auxpath and auxlib should work in exactly the same way.
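The error text "File file:x.jar:y.jar does not exist" suggests that the whole colon-separated value was treated as one file name once a job was launched. The sketch below only illustrates the alternative of splitting the value into one URI per jar; it is not Hive's launcher code, and the function name `auxpath_to_uris` and its `sep` parameter are hypothetical.

```python
import os


def auxpath_to_uris(auxpath, sep=":"):
    """Split a separator-delimited jar list into one file:// URI per jar.

    Hypothetical helper for illustration; not part of Hive.
    """
    uris = []
    for entry in auxpath.split(sep):
        entry = entry.strip()
        if not entry:
            continue  # ignore empty entries such as a trailing separator
        uris.append("file://" + os.path.abspath(entry))
    return uris


# Each jar becomes its own URI instead of one combined "x.jar:y.jar" name.
for uri in auxpath_to_uris("/opt/aux/x.jar:/opt/aux/y.jar"):
    print(uri)
```

Whatever the actual fix looks like, the point is that every consumer of the value needs to see the jars as separate entries, which is what the auxlib directory already gives it.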
RE: stats19.q is failing on the current trunk?
I'm looking into it

From: Navis류승우 [navis@nexr.com]
Sent: Monday, December 10, 2012 9:48 PM
To: dev@hive.apache.org
Subject: Re: stats19.q is failing on the current trunk?

It's booked on https://issues.apache.org/jira/browse/HIVE-3783 and I have also seen this.

2012/12/11 Zhenxiao Luo zhenx...@fb.com:

Always get this diff:

[junit] diff -a /home/zhenxiao/Code/hive/build/ql/test/logs/clientpositive/stats19.q.out /home/zhenxiao/Code/hive/ql/src/test/results/clientpositive/stats19.q.out
[junit] 21,22c21,22
[junit] < Stats prefix is hashed: false
[junit] < Stats prefix is hashed: false
[junit] ---
[junit] > Stats prefix is hashed: true
[junit] > Stats prefix is hashed: true
[junit] 284,285c284,285
[junit] < Stats prefix is hashed: false
[junit] < Stats prefix is hashed: false
[junit] ---
[junit] > Stats prefix is hashed: true
[junit] > Stats prefix is hashed: true

Will file a Jira if other people found it, too.

Thanks,
Zhenxiao
[jira] [Assigned] (HIVE-3783) stats19.q is failing on trunk
[ https://issues.apache.org/jira/browse/HIVE-3783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin Wilfong reassigned HIVE-3783:

Assignee: Kevin Wilfong

stats19.q is failing on trunk
Key: HIVE-3783
URL: https://issues.apache.org/jira/browse/HIVE-3783
Project: Hive
Issue Type: Bug
Affects Versions: 0.11
Reporter: Ashutosh Chauhan
Assignee: Kevin Wilfong

This test case was introduced in HIVE-3750 and has been failing ever since it was introduced.
[jira] [Commented] (HIVE-3785) Core hive changes for HiveServer2 implementation
[ https://issues.apache.org/jira/browse/HIVE-3785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529264#comment-13529264 ]

Prasad Mujumdar commented on HIVE-3785:

Code review request on https://reviews.facebook.net/D7281

Core hive changes for HiveServer2 implementation
Key: HIVE-3785
URL: https://issues.apache.org/jira/browse/HIVE-3785
Project: Hive
Issue Type: Sub-task
Components: Authentication, Build Infrastructure, Configuration, Thrift API
Affects Versions: 0.10.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
Attachments: HS2-changed-files-only.patch

The subtask to track changes in the core hive components for HiveServer2 implementation
[jira] [Commented] (HIVE-3783) stats19.q is failing on trunk
[ https://issues.apache.org/jira/browse/HIVE-3783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529310#comment-13529310 ]

Kevin Wilfong commented on HIVE-3783:

https://reviews.facebook.net/D7287
[jira] [Updated] (HIVE-3783) stats19.q is failing on trunk
[ https://issues.apache.org/jira/browse/HIVE-3783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin Wilfong updated HIVE-3783:

Attachment: HIVE-3783.1.patch.txt
[jira] [Updated] (HIVE-3783) stats19.q is failing on trunk
[ https://issues.apache.org/jira/browse/HIVE-3783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin Wilfong updated HIVE-3783:

Status: Patch Available (was: Open)
[jira] [Commented] (HIVE-3783) stats19.q is failing on trunk
[ https://issues.apache.org/jira/browse/HIVE-3783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529316#comment-13529316 ]

Kevin Wilfong commented on HIVE-3783:

Found one reason for the test failing in some checkouts and not others. I've run the test 5 times in a directory where it was previously failing, and so far it's been passing consistently.
Hive-trunk-h0.21 - Build # 1850 - Still Failing
Changes for Build #1844
[hashutosh] HIVE-3705 : Adding authorization capability to the metastore (Sushanth Sowmyan via Ashutosh Chauhan)

Changes for Build #1845
[hashutosh] HIVE-3231 : msck repair should find partitions already containing data files (Keegan Mosley via Ashutosh Chauhan)
[hashutosh] HIVE-2691 : Specify location of log4j configuration files via configuration properties (Zhenxiao Luo via Ashutosh Chauhan)
[hashutosh] HIVE-2794 : Aggregations without grouping should return NULL when applied to partitioning column of a partitionless table (Zhenxiao Luo via Ashutosh Chauhan)
[hashutosh] HIVE-3780 : RetryingMetaStoreClient Should Log the Caught Exception (Bhushan Mandhani via Ashutosh Chauhan)
[hashutosh] HIVE-3084 : Hive CI failing due to script_broken_pipe1.q (Gunther Hagleitner via Ashutosh Chauhan)
[hashutosh] HIVE-3760 : TestNegativeMinimrCliDriver_mapreduce_stack_trace.q fails on hadoop-1 (Gunther Hagleitner via Ashutosh Chauhan)

Changes for Build #1846

Changes for Build #1847
[hashutosh] HIVE-3714 : Patch: Hive's ivy internal resolvers need to use sourceforge for sqlline (Gopal V via Ashutosh Chauhan)

Changes for Build #1848
[hashutosh] HIVE-3782 : testCliDriver_sample_islocalmode_hook fails on hadoop-1 (Gunther Hagleitner via Ashutosh Chauhan)
[hashutosh] HIVE-2288 : Adding the oracle nvl function to the UDF (Ed Capriolo, Guy Doulberg via Ashutosh Chauhan)
[hashutosh] HIVE-2689 : ObjectInspectorConverters cannot convert Void types to Array/Map/Struct types. (Jonathan Chang via Ashutosh Chauhan)

Changes for Build #1849
[hashutosh] HIVE-3622 : reflect udf cannot find method which has arguments of primitive types and String, Binary, Timestamp types mixed (Navis via Ashutosh Chauhan)
[namit] HIVE-3401 Diversify grammar for split sampling (Navis via namit)

Changes for Build #1850

1 tests failed.
FAILED: org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats19

Error Message:
Unexpected exception See build/ql/tmp/hive.log, or try ant test ... -Dtest.silent=false to get more logs.

Stack Trace:
junit.framework.AssertionFailedError: Unexpected exception See build/ql/tmp/hive.log, or try ant test ... -Dtest.silent=false to get more logs.
at junit.framework.Assert.fail(Assert.java:47)
at org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats19(TestCliDriver.java:41417)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at junit.framework.TestCase.runTest(TestCase.java:168)
at junit.framework.TestCase.runBare(TestCase.java:134)
at junit.framework.TestResult$1.protect(TestResult.java:110)
at junit.framework.TestResult.runProtected(TestResult.java:128)
at junit.framework.TestResult.run(TestResult.java:113)
at junit.framework.TestCase.run(TestCase.java:124)
at junit.framework.TestSuite.runTest(TestSuite.java:232)
at junit.framework.TestSuite.run(TestSuite.java:227)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:518)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1052)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:906)

The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1850)

Status: Still Failing

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1850/ to view the results.
Jenkins build is back to normal : Hive-0.9.1-SNAPSHOT-h0.21 #226
See https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/226/
[jira] [Updated] (HIVE-3724) Metastore tests use hardcoded ports
[ https://issues.apache.org/jira/browse/HIVE-3724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-3724: --- Fix Version/s: (was: 0.11) 0.10.0 Metastore tests use hardcoded ports --- Key: HIVE-3724 URL: https://issues.apache.org/jira/browse/HIVE-3724 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.10.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Priority: Minor Fix For: 0.10.0 Attachments: HIVE-3724.1.patch.txt, HIVE-3724.2.patch.txt, hive-3724.svn-0.10.patch Several of the metastore tests use hardcoded ports for remote metastore Thrift servers. This is causing transient failures in Jenkins, e.g. https://builds.apache.org/job/Hive-trunk-h0.21/1804/ A few tests already dynamically determine free ports, and this logic can be shared. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
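The issue above mentions that a few tests already determine free ports dynamically. A minimal sketch of that general idea follows, assuming plain `java.net.ServerSocket`; the class and method names are hypothetical and are not taken from the HIVE-3724 patch. Binding to port 0 asks the operating system for any unused port.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical helper for illustration only; not the code from the HIVE-3724 patch.
final class FreePortFinder {
    private FreePortFinder() {}

    /** Binds to port 0 so the OS picks an unused TCP port, then releases it. */
    static int findFreePort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new IllegalStateException("Could not find a free port", e);
        }
    }

    public static void main(String[] args) {
        // A test would pass this port to the server it starts instead of a fixed number.
        System.out.println("Free port: " + findFreePort());
    }
}
```

Note that the port is released before the caller uses it, so another process could in principle claim it in between; this narrows the window for collisions rather than eliminating it.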
[jira] [Commented] (HIVE-3724) Metastore tests use hardcoded ports
[ https://issues.apache.org/jira/browse/HIVE-3724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529406#comment-13529406 ] Ashutosh Chauhan commented on HIVE-3724: Committed to 0.10 branch. Thanks, Sushanth, for the backport! Metastore tests use hardcoded ports --- Key: HIVE-3724 URL: https://issues.apache.org/jira/browse/HIVE-3724 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.10.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Priority: Minor Fix For: 0.10.0 Attachments: HIVE-3724.1.patch.txt, HIVE-3724.2.patch.txt, hive-3724.svn-0.10.patch Several of the metastore tests use hardcoded ports for remote metastore Thrift servers. This is causing transient failures in Jenkins, e.g. https://builds.apache.org/job/Hive-trunk-h0.21/1804/ A few tests already dynamically determine free ports, and this logic can be shared.
[jira] [Updated] (HIVE-3705) Adding authorization capability to the metastore
[ https://issues.apache.org/jira/browse/HIVE-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-3705: --- Fix Version/s: (was: 0.11) 0.10.0 Adding authorization capability to the metastore Key: HIVE-3705 URL: https://issues.apache.org/jira/browse/HIVE-3705 Project: Hive Issue Type: New Feature Components: Authorization, Metastore Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Fix For: 0.10.0 Attachments: HIVE-3705.D6681.1.patch, HIVE-3705.D6681.2.patch, HIVE-3705.D6681.3.patch, HIVE-3705.D6681.4.patch, HIVE-3705.D6681.5.patch, HIVE-3705.giant.svn-0.10.patch, HIVE-3705.giant.svn.patch, hive-backend-auth.2.git.patch, hive-backend-auth.git.patch, hive-backend-auth.post-review.git.patch, hive-backend-auth.post-review-part2.git.patch, hive-backend-auth.post-review-part3.git.patch, hivesec_investigation.pdf In an environment where multiple clients access a single metastore, and we want to evolve hive security to a point where it's no longer simply preventing users from shooting their own foot, we need to be able to authorize metastore calls as well, instead of simply performing every metastore api call that's made. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3705) Adding authorization capability to the metastore
[ https://issues.apache.org/jira/browse/HIVE-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529478#comment-13529478 ] Ashutosh Chauhan commented on HIVE-3705: Committed to 0.10 branch. Thanks, Sushanth! Adding authorization capability to the metastore Key: HIVE-3705 URL: https://issues.apache.org/jira/browse/HIVE-3705 Project: Hive Issue Type: New Feature Components: Authorization, Metastore Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Fix For: 0.10.0 Attachments: HIVE-3705.D6681.1.patch, HIVE-3705.D6681.2.patch, HIVE-3705.D6681.3.patch, HIVE-3705.D6681.4.patch, HIVE-3705.D6681.5.patch, HIVE-3705.giant.svn-0.10.patch, HIVE-3705.giant.svn.patch, hive-backend-auth.2.git.patch, hive-backend-auth.git.patch, hive-backend-auth.post-review.git.patch, hive-backend-auth.post-review-part2.git.patch, hive-backend-auth.post-review-part3.git.patch, hivesec_investigation.pdf In an environment where multiple clients access a single metastore, and we want to evolve hive security to a point where it's no longer simply preventing users from shooting their own foot, we need to be able to authorize metastore calls as well, instead of simply performing every metastore api call that's made.
[jira] [Commented] (HIVE-3789) Patch HIVE-3648 causing the majority of unit tests to fail on branch 0.9
[ https://issues.apache.org/jira/browse/HIVE-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529499#comment-13529499 ] Arup Malakar commented on HIVE-3789: Chris, I am seeing the errors too and am investigating. Meanwhile, does anyone have a clue about the following exception, which I am seeing in the test failures?
{code}
[junit] Running org.apache.hadoop.hive.ql.parse.TestContribParse
[junit] Cleaning up TestContribParse
[junit] Exception: MetaException(message:Got exception: java.lang.IllegalArgumentException Wrong FS: pfile:/Users/malakar/code/oss/hive_09/hive/build/contrib/test/data/warehouse/src_json, expected: file:///)
[junit] Tests run: 1, Failures: 1, Errors: 0, Time elapsed: 4.433 sec
[junit] org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Got exception: java.lang.IllegalArgumentException Wrong FS: pfile:/Users/malakar/code/oss/hive_09/hive/build/contrib/test/data/warehouse/src_json, expected: file:///)
[junit] at org.apache.hadoop.hive.ql.metadata.Hive.dropTable(Hive.java:813)
[junit] at org.apache.hadoop.hive.ql.metadata.Hive.dropTable(Hive.java:789)
[junit] at org.apache.hadoop.hive.ql.QTestUtil.cleanUp(QTestUtil.java:421)
[junit] at org.apache.hadoop.hive.ql.QTestUtil.shutdown(QTestUtil.java:278)
[junit] at org.apache.hadoop.hive.ql.parse.TestContribParse.tearDown(TestContribParse.java:59)
[junit] at junit.framework.TestCase.runBare(TestCase.java:140)
[junit] at junit.framework.TestResult$1.protect(TestResult.java:110)
[junit] at junit.framework.TestResult.runProtected(TestResult.java:128)
[junit] at junit.framework.TestResult.run(TestResult.java:113)
[junit] at junit.framework.TestCase.run(TestCase.java:124)
[junit] at junit.framework.TestSuite.runTest(TestSuite.java:243)
[junit] at junit.framework.TestSuite.run(TestSuite.java:238)
[junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:520)
[junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1060)
[junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:911)
[junit] Caused by: MetaException(message:Got exception: java.lang.IllegalArgumentException Wrong FS: pfile:/Users/malakar/code/oss/hive_09/hive/build/contrib/test/data/warehouse/src_json, expected: file:///)
[junit] at org.apache.hadoop.hive.metastore.MetaStoreUtils.logAndThrowMetaException(MetaStoreUtils.java:785)
[junit] at org.apache.hadoop.hive.metastore.HiveMetaStoreFsImpl.deleteDir(HiveMetaStoreFsImpl.java:61)
[junit] at org.apache.hadoop.hive.metastore.Warehouse.deleteDir(Warehouse.java:200)
[junit] at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_core(HiveMetaStore.java:929)
[junit] at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table(HiveMetaStore.java:944)
[junit] at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:553)
[junit] at org.apache.hadoop.hive.ql.metadata.Hive.dropTable(Hive.java:807)
[junit] ... 14 more
[junit] Test org.apache.hadoop.hive.ql.parse.TestContribParse FAILED
[for] /Users/malakar/code/oss/hive_09/hive/contrib/build.xml: The following error occurred while executing this line:
[for] /Users/malakar/code/oss/hive_09/hive/build.xml:321: The following error occurred while executing this line:
[for] /Users/malakar/code/oss/hive_09/hive/build-common.xml:448: Tests failed!
{code}
Patch HIVE-3648 causing the majority of unit tests to fail on branch 0.9 Key: HIVE-3789 URL: https://issues.apache.org/jira/browse/HIVE-3789 Project: Hive Issue Type: Bug Components: Metastore, Tests Affects Versions: 0.9.0 Environment: Hadoop 0.23.5, JDK 1.6.0_31 Reporter: Chris Drome Rolling back to before this patch shows that the unit tests pass; after the patch, the majority of the unit tests fail.
[jira] [Created] (HIVE-3792) hive jars are not part of the share lib tar ball in oozie
Ashish Singh created HIVE-3792: -- Summary: hive jars are not part of the share lib tar ball in oozie Key: HIVE-3792 URL: https://issues.apache.org/jira/browse/HIVE-3792 Project: Hive Issue Type: Bug Components: Build Infrastructure Affects Versions: 0.10.0 Reporter: Ashish Singh Assignee: Ashish Singh Fix For: 0.10.0 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3792) hive jars are not part of the share lib tar ball in oozie
[ https://issues.apache.org/jira/browse/HIVE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish Singh updated HIVE-3792: --- Description: hive-0.10.0 pom file has missing conf and scope mapping for compile configuration. hive jars are not part of the share lib tar ball in oozie - Key: HIVE-3792 URL: https://issues.apache.org/jira/browse/HIVE-3792 Project: Hive Issue Type: Bug Components: Build Infrastructure Affects Versions: 0.10.0 Reporter: Ashish Singh Assignee: Ashish Singh Fix For: 0.10.0 hive-0.10.0 pom file has missing conf and scope mapping for compile configuration. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3780) RetryingMetaStoreClient Should Log the Caught Exception
[ https://issues.apache.org/jira/browse/HIVE-3780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-3780: --- Fix Version/s: (was: 0.11) 0.10.0 RetryingMetaStoreClient Should Log the Caught Exception --- Key: HIVE-3780 URL: https://issues.apache.org/jira/browse/HIVE-3780 Project: Hive Issue Type: Bug Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Trivial Fix For: 0.10.0 Attachments: HIVE-3780.1.patch.txt Currently it logs the cause of the caught exception. It should log the caught exception itself. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
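To illustrate why logging only the cause can lose information, here is a small sketch using plain `Throwable` methods. The class and method names are hypothetical and this is not the RetryingMetaStoreClient code; it only shows that `getCause()` may be null, and that the wrapper's own message is dropped when the cause alone is reported.

```java
// Hypothetical illustration; not the actual RetryingMetaStoreClient code.
final class ExceptionLoggingSketch {
    private ExceptionLoggingSketch() {}

    /** What a log line contains when only the cause is reported. */
    static String describeCauseOnly(Throwable caught) {
        return String.valueOf(caught.getCause()); // "null" when there is no cause
    }

    /** What a log line contains when the caught exception and its chain are reported. */
    static String describeCaught(Throwable caught) {
        StringBuilder sb = new StringBuilder();
        for (Throwable t = caught; t != null; t = t.getCause()) {
            if (sb.length() > 0) {
                sb.append("; caused by: ");
            }
            sb.append(t);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Exception noCause = new IllegalStateException("connect failed");
        System.out.println("cause only: " + describeCauseOnly(noCause));
        System.out.println("caught:     " + describeCaught(noCause));
    }
}
```

In real code the caught exception would normally be handed to the logger as the throwable argument, so that the full stack trace is recorded as well.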
[jira] [Commented] (HIVE-3780) RetryingMetaStoreClient Should Log the Caught Exception
[ https://issues.apache.org/jira/browse/HIVE-3780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529530#comment-13529530 ] Ashutosh Chauhan commented on HIVE-3780: Applied to 0.10 branch as well. RetryingMetaStoreClient Should Log the Caught Exception --- Key: HIVE-3780 URL: https://issues.apache.org/jira/browse/HIVE-3780 Project: Hive Issue Type: Bug Reporter: Bhushan Mandhani Assignee: Bhushan Mandhani Priority: Trivial Fix For: 0.10.0 Attachments: HIVE-3780.1.patch.txt Currently it logs the cause of the caught exception. It should log the caught exception itself.
[jira] [Updated] (HIVE-3792) hive jars are not part of the share lib tar ball in oozie
[ https://issues.apache.org/jira/browse/HIVE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish Singh updated HIVE-3792: --- Attachment: HIVE-3792.patch hive jars are not part of the share lib tar ball in oozie - Key: HIVE-3792 URL: https://issues.apache.org/jira/browse/HIVE-3792 Project: Hive Issue Type: Bug Components: Build Infrastructure Affects Versions: 0.10.0 Reporter: Ashish Singh Assignee: Ashish Singh Fix For: 0.10.0 Attachments: HIVE-3792.patch hive-0.10.0 pom file has missing conf and scope mapping for compile configuration. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3792) hive jars are not part of the share lib tar ball in oozie
[ https://issues.apache.org/jira/browse/HIVE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish Singh updated HIVE-3792: --- Status: Patch Available (was: Open) Attaching the patch, that has the mapping of conf to scope for compile configuration. hive jars are not part of the share lib tar ball in oozie - Key: HIVE-3792 URL: https://issues.apache.org/jira/browse/HIVE-3792 Project: Hive Issue Type: Bug Components: Build Infrastructure Affects Versions: 0.10.0 Reporter: Ashish Singh Assignee: Ashish Singh Fix For: 0.10.0 Attachments: HIVE-3792.patch hive-0.10.0 pom file has missing conf and scope mapping for compile configuration. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3787) Regression introduced from HIVE-3401
[ https://issues.apache.org/jira/browse/HIVE-3787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-3787: Status: Patch Available (was: Open) Regression introduced from HIVE-3401 Key: HIVE-3787 URL: https://issues.apache.org/jira/browse/HIVE-3787 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-3787.D7275.1.patch By HIVE-3562, split_sample_out_of_range.q and split_sample_wrong_format.q are not showing valid 'line:loc' information for error messages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3045) Partition column values are not valid if any of virtual columns is selected
[ https://issues.apache.org/jira/browse/HIVE-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-3045: Resolution: Duplicate Status: Resolved (was: Patch Available) Yes, it's fixed by HIVE-2925. Partition column values are not valid if any of virtual columns is selected --- Key: HIVE-3045 URL: https://issues.apache.org/jira/browse/HIVE-3045 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Attachments: HIVE-3045.D3351.2.patch For example,
{code}
hive> select * from srcpart where key < 5;
0 val_0 2008-04-08 11
4 val_4 2008-04-08 11
0 val_0 2008-04-08 11
0 val_0 2008-04-08 11
2 val_2 2008-04-08 11
0 val_1 2008-04-09 12
4 val_5 2008-04-09 12
3 val_4 2008-04-09 12
2 val_3 2008-04-09 12
0 val_1 2008-04-09 12
1 val_2 2008-04-09 12
hive> select *, BLOCK__OFFSET__INSIDE__FILE from srcpart where key < 5;
0 val_0 2008-04-09 11 968
4 val_4 2008-04-09 11 1218
0 val_0 2008-04-09 11 2088
0 val_0 2008-04-09 11 2632
2 val_2 2008-04-09 11 4004
0 val_1 2008-04-09 11 682
4 val_5 2008-04-09 11 1131
3 val_4 2008-04-09 11 1163
2 val_3 2008-04-09 11 2629
0 val_1 2008-04-09 11 4367
1 val_2 2008-04-09 11 5669
{code}
[jira] [Commented] (HIVE-3045) Partition column values are not valid if any of virtual columns is selected
[ https://issues.apache.org/jira/browse/HIVE-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529559#comment-13529559 ] Phabricator commented on HIVE-3045: --- navis has abandoned the revision HIVE-3045 [jira] Partition column values are not valid if any of virtual columns is selected. Fixed by HIVE-2925 REVISION DETAIL https://reviews.facebook.net/D3351 To: JIRA, navis Partition column values are not valid if any of virtual columns is selected --- Key: HIVE-3045 URL: https://issues.apache.org/jira/browse/HIVE-3045 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Attachments: HIVE-3045.D3351.2.patch For example,
{code}
hive> select * from srcpart where key < 5;
0 val_0 2008-04-08 11
4 val_4 2008-04-08 11
0 val_0 2008-04-08 11
0 val_0 2008-04-08 11
2 val_2 2008-04-08 11
0 val_1 2008-04-09 12
4 val_5 2008-04-09 12
3 val_4 2008-04-09 12
2 val_3 2008-04-09 12
0 val_1 2008-04-09 12
1 val_2 2008-04-09 12
hive> select *, BLOCK__OFFSET__INSIDE__FILE from srcpart where key < 5;
0 val_0 2008-04-09 11 968
4 val_4 2008-04-09 11 1218
0 val_0 2008-04-09 11 2088
0 val_0 2008-04-09 11 2632
2 val_2 2008-04-09 11 4004
0 val_1 2008-04-09 11 682
4 val_5 2008-04-09 11 1131
3 val_4 2008-04-09 11 1163
2 val_3 2008-04-09 11 2629
0 val_1 2008-04-09 11 4367
1 val_2 2008-04-09 11 5669
{code}
[jira] [Updated] (HIVE-3045) Partition column values are not valid if any of virtual columns is selected
[ https://issues.apache.org/jira/browse/HIVE-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-3045: --- Fix Version/s: 0.10.0 Partition column values are not valid if any of virtual columns is selected --- Key: HIVE-3045 URL: https://issues.apache.org/jira/browse/HIVE-3045 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Fix For: 0.10.0 Attachments: HIVE-3045.D3351.2.patch For example,
{code}
hive> select * from srcpart where key < 5;
0 val_0 2008-04-08 11
4 val_4 2008-04-08 11
0 val_0 2008-04-08 11
0 val_0 2008-04-08 11
2 val_2 2008-04-08 11
0 val_1 2008-04-09 12
4 val_5 2008-04-09 12
3 val_4 2008-04-09 12
2 val_3 2008-04-09 12
0 val_1 2008-04-09 12
1 val_2 2008-04-09 12
hive> select *, BLOCK__OFFSET__INSIDE__FILE from srcpart where key < 5;
0 val_0 2008-04-09 11 968
4 val_4 2008-04-09 11 1218
0 val_0 2008-04-09 11 2088
0 val_0 2008-04-09 11 2632
2 val_2 2008-04-09 11 4004
0 val_1 2008-04-09 11 682
4 val_5 2008-04-09 11 1131
3 val_4 2008-04-09 11 1163
2 val_3 2008-04-09 11 2629
0 val_1 2008-04-09 11 4367
1 val_2 2008-04-09 11 5669
{code}
[jira] [Commented] (HIVE-3766) Enable adding hooks to hive meta store init
[ https://issues.apache.org/jira/browse/HIVE-3766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529581#comment-13529581 ] Kevin Wilfong commented on HIVE-3766: METASTORETHRIFTRETRIES has been changed to METASTORETHRIFTCONNECTIONRETRIES. Can you update the test? Enable adding hooks to hive meta store init --- Key: HIVE-3766 URL: https://issues.apache.org/jira/browse/HIVE-3766 Project: Hive Issue Type: Bug Components: Metastore Reporter: Jean Xu Assignee: Jean Xu Attachments: jira3766.txt We will enable hooks to be added to init HMSHandler
[jira] [Updated] (HIVE-3300) LOAD DATA INPATH fails if a hdfs file with same name is added to table
[ https://issues.apache.org/jira/browse/HIVE-3300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-3300: -- Attachment: HIVE-3300.D4383.3.patch navis updated the revision HIVE-3300 [jira] LOAD DATA INPATH fails if a hdfs file with same name is added to table. Reviewers: JIRA 1. Removed staging stage for source files before copy/move 2. Rebase on trunk REVISION DETAIL https://reviews.facebook.net/D4383 AFFECTED FILES build-common.xml ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java ql/src/test/queries/clientpositive/load_fs2.q ql/src/test/results/clientpositive/load_fs2.q.out To: JIRA, navis LOAD DATA INPATH fails if a hdfs file with same name is added to table -- Key: HIVE-3300 URL: https://issues.apache.org/jira/browse/HIVE-3300 Project: Hive Issue Type: Bug Components: Import/Export Affects Versions: 0.10.0 Environment: ubuntu linux, hadoop 1.0.3, hive 0.9 Reporter: Bejoy KS Assignee: Navis Attachments: HIVE-3300.1.patch.txt, HIVE-3300.D4383.3.patch
If we load data from the local fs into Hive tables using 'LOAD DATA LOCAL INPATH' and a file with the same name already exists in the table's location, the new file is suffixed with _copy_1. But if we run 'LOAD DATA INPATH' for a file in hdfs, no rename happens; only a move task is triggered. Since a file with the same name exists in the same hdfs location, the hadoop fs move operation throws an error.
hive> LOAD DATA INPATH '/userdata/bejoy/site.txt' INTO TABLE test.site;
Loading data to table test.site
Failed with exception null
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
hive>
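The rename-on-collision behaviour described for 'LOAD DATA LOCAL INPATH' can be sketched as below. This is only an illustration: the class and method names are hypothetical, the existing files are modelled as a plain set of names rather than a Hadoop FileSystem, and the exact placement of the `_copy_N` suffix (here, before the file extension) is an assumption rather than something stated in the issue.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; the real fix lives in Hive.java and works on HDFS paths.
final class CopyNameSketch {
    private CopyNameSketch() {}

    /** Returns fileName if unused, otherwise the first free base_copy_N.ext variant. */
    static String nonCollidingName(Set<String> existing, String fileName) {
        if (!existing.contains(fileName)) {
            return fileName;
        }
        int dot = fileName.lastIndexOf('.');
        String base = dot > 0 ? fileName.substring(0, dot) : fileName;
        String ext = dot > 0 ? fileName.substring(dot) : "";
        for (int i = 1; ; i++) {
            String candidate = base + "_copy_" + i + ext;
            if (!existing.contains(candidate)) {
                return candidate;
            }
        }
    }

    public static void main(String[] args) {
        Set<String> tableDir = new HashSet<>(Arrays.asList("site.txt"));
        System.out.println(nonCollidingName(tableDir, "site.txt"));
    }
}
```

Applying the same name selection before the move in the non-local case would avoid the collision that makes the move task fail.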
[jira] [Commented] (HIVE-3790) UDF to introduce an OFFSET(day,month or year) for a given date or timestamp
[ https://issues.apache.org/jira/browse/HIVE-3790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529609#comment-13529609 ] Sun Rui commented on HIVE-3790: Have you considered adding support for the SQL interval type? Then, for example, to add '1 year 3 month' to a date you would not need two calls: first date1 = date_offset(date, 1, 'YEAR'), then date_offset(date1, 3, 'MONTH'). UDF to introduce an OFFSET(day,month or year) for a given date or timestamp Key: HIVE-3790 URL: https://issues.apache.org/jira/browse/HIVE-3790 Project: Hive Issue Type: New Feature Components: UDF Affects Versions: 0.9.0 Reporter: Jithin John
Current releases of Hive lack a generic function that finds the date offset from a date / timestamp. Current releases have date_add (date) and date_sub(date), which allow the user to add or subtract days only; year or month cannot be used as the unit. The function DATE_OFFSET(date,offset,unit) returns the date offset value from start_date according to the unit. The unit can be year, month, or day. The function could be used for date range queries and is more flexible than the existing functions.
Functionality :- Function Name: DATE_OFFSET(date,offset,unit) Adds an offset value to the unit part of the date/timestamp. Returns the date in the format yyyy-MM-dd.
Example: hive> select date_offset('2009-07-29', -1 ,'MONTH' ) FROM src LIMIT 1 -> 2009-06-29
Usage :- Case : To calculate the expiry date of an item from its manufacturing date
Table :- ITEM_TAB
Manufacturing_date|item id|store id|value|unit|price
2012-12-01|110001|00003|0.99|1.00|0.99
2012-12-02|110001|00008|0.99|0.00|0.00
2012-12-03|110001|00009|0.99|0.00|0.00
2012-12-04|110001|001112002|0.99|0.00|0.00
2012-12-05|110001|001112003|0.99|0.00|0.00
2012-12-06|110001|001112006|0.99|1.00|0.99
2012-12-07|110001|001112007|0.99|0.00|0.00
2012-12-08|110001|001112008|0.99|0.00|0.00
2012-12-09|110001|001112009|0.99|0.00|0.00
2012-12-10|110001|001112010|0.99|0.00|0.00
2012-12-11|110001|001113003|0.99|0.00|0.00
2012-12-12|110001|001113006|0.99|0.00|0.00
2012-12-13|110001|001113008|0.99|0.00|0.00
2012-12-14|110001|001113010|0.99|0.00|0.00
2012-12-15|110001|001114002|0.99|0.00|0.00
2012-12-16|110001|001114004|0.99|1.00|0.99
2012-12-17|110001|001114005|0.99|0.00|0.00
2012-12-18|110001|001121004|0.99|0.00|0.00
QUERY: select man_date , date_offset(man_date ,5 ,'year') as expiry_date from item_tab;
RESULT:
2012-12-01 2017-12-01
2012-12-02 2017-12-02
2012-12-03 2017-12-03
2012-12-04 2017-12-04
2012-12-05 2017-12-05
2012-12-06 2017-12-06
2012-12-07 2017-12-07
2012-12-08 2017-12-08
2012-12-09 2017-12-09
2012-12-10 2017-12-10
2012-12-11 2017-12-11
2012-12-12 2017-12-12
2012-12-13 2017-12-13
2012-12-14 2017-12-14
2012-12-15 2017-12-15
2012-12-16 2017-12-16
2012-12-17 2017-12-17
2012-12-18 2017-12-18
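The semantics proposed for DATE_OFFSET can be sketched in plain Java as follows. This is not the proposed UDF: the class name is hypothetical, it uses `java.time` (Java 8 or later, an assumption, since Hive of this era targeted older JDKs), and it assumes the input starts with a date in year-month-day form and simply ignores any time part.

```java
import java.time.LocalDate;
import java.util.Locale;

// Hypothetical sketch of the proposed semantics; not the actual Hive UDF.
final class DateOffsetSketch {
    private DateOffsetSketch() {}

    /** Adds offset units (YEAR, MONTH or DAY) to the date part of the input string. */
    static String dateOffset(String dateOrTimestamp, int offset, String unit) {
        String trimmed = dateOrTimestamp.trim();
        if (trimmed.length() < 10) {
            throw new IllegalArgumentException("Expected a date with a four-digit year: " + dateOrTimestamp);
        }
        // Keep only the leading date part, so a timestamp string is accepted too.
        LocalDate start = LocalDate.parse(trimmed.substring(0, 10));
        switch (unit.trim().toUpperCase(Locale.ROOT)) {
            case "YEAR":
                return start.plusYears(offset).toString();
            case "MONTH":
                return start.plusMonths(offset).toString();
            case "DAY":
                return start.plusDays(offset).toString();
            default:
                throw new IllegalArgumentException("Unsupported unit: " + unit);
        }
    }

    public static void main(String[] args) {
        System.out.println(dateOffset("2009-07-29", -1, "MONTH")); // 2009-06-29
        // Two chained calls, as in the '1 year 3 month' comment above.
        System.out.println(dateOffset(dateOffset("2012-12-01", 1, "YEAR"), 3, "MONTH")); // 2014-03-01
    }
}
```

One design point a real UDF would need to settle is the end-of-month case: `java.time` clamps to the last valid day, so 2009-03-31 minus one month gives 2009-02-28.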
[jira] [Created] (HIVE-3793) Print number of fetched rows after query in CliDriver
Navis created HIVE-3793: --- Summary: Print number of fetched rows after query in CliDriver Key: HIVE-3793 URL: https://issues.apache.org/jira/browse/HIVE-3793 Project: Hive Issue Type: Improvement Components: CLI Reporter: Navis Assignee: Navis Priority: Trivial Currently the CLI shows only the time taken. It would be useful to show the number of rows fetched as well.
[jira] [Updated] (HIVE-3793) Print number of fetched rows after query in CliDriver
[ https://issues.apache.org/jira/browse/HIVE-3793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-3793: -- Attachment: HIVE-3793.D7305.1.patch navis requested code review of HIVE-3793 [jira] Print number of fetched rows after query in CliDriver. Reviewers: JIRA DPAL-1942 Print number of fetched rows after query in CliDriver Currently the CLI shows only the time taken. It would be useful to show the number of rows fetched as well. TEST PLAN EMPTY REVISION DETAIL https://reviews.facebook.net/D7305 AFFECTED FILES cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java To: JIRA, navis Print number of fetched rows after query in CliDriver - Key: HIVE-3793 URL: https://issues.apache.org/jira/browse/HIVE-3793 Project: Hive Issue Type: Improvement Components: CLI Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-3793.D7305.1.patch Currently the CLI shows only the time taken. It would be useful to show the number of rows fetched as well.
[jira] [Updated] (HIVE-3793) Print number of fetched rows after query in CliDriver
[ https://issues.apache.org/jira/browse/HIVE-3793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-3793: Status: Patch Available (was: Open) Print number of fetched rows after query in CliDriver - Key: HIVE-3793 URL: https://issues.apache.org/jira/browse/HIVE-3793 Project: Hive Issue Type: Improvement Components: CLI Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-3793.D7305.1.patch Currently the CLI shows only the time taken. It would be useful to show the number of rows fetched as well.
[jira] [Commented] (HIVE-3420) Inefficiency in hbase handler when process query including rowkey range scan
[ https://issues.apache.org/jira/browse/HIVE-3420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529641#comment-13529641 ]

Navis commented on HIVE-3420:

@Gang Deng This is a pretty important issue. I'll make a patch for review.

Inefficiency in hbase handler when process query including rowkey range scan
    Key: HIVE-3420
    URL: https://issues.apache.org/jira/browse/HIVE-3420
    Project: Hive
    Issue Type: Improvement
    Components: HBase Handler
    Affects Versions: 0.9.0
    Environment: Hive-0.9.0 + HBase-0.94.1
    Reporter: Gang Deng
    Priority: Critical
    Original Estimate: 2h
    Remaining Estimate: 2h

When Hive is queried with an HBase rowkey range, the Hive map tasks do not use the startrow and endrow information in the tablesplit. For example, if the rowkeys fit into 5 HBase files, there will be 5 map tasks. Ideally, each task would process 1 file, but in the current implementation each task processes all 5 files. This behavior not only wastes network bandwidth but also worsens lock contention in the HBase block cache, because every task has to access the same blocks.

The problem code is in HiveHBaseTableInputFormat.convertFilter, as below:

    ……
    if (tableSplit != null) {
      tableSplit = new TableSplit(
        tableSplit.getTableName(),
        startRow,
        stopRow,
        tableSplit.getRegionLocation());
    }
    scan.setStartRow(startRow);
    scan.setStopRow(stopRow);
    ……

Because tableSplit already includes the startRow and endRow of its file, a better implementation would be:

    ……
    byte[] splitStart = startRow;
    byte[] splitStop = stopRow;
    if (tableSplit != null) {
      if (tableSplit.getStartRow() != null) {
        splitStart = startRow.length == 0 ||
          Bytes.compareTo(tableSplit.getStartRow(), startRow) >= 0 ?
            tableSplit.getStartRow() : startRow;
      }
      if (tableSplit.getEndRow() != null) {
        splitStop = (stopRow.length == 0 ||
          Bytes.compareTo(tableSplit.getEndRow(), stopRow) <= 0) &&
          tableSplit.getEndRow().length > 0 ?
            tableSplit.getEndRow() : stopRow;
      }
      tableSplit = new TableSplit(
        tableSplit.getTableName(),
        splitStart,
        splitStop,
        tableSplit.getRegionLocation());
    }
    scan.setStartRow(splitStart);
    scan.setStopRow(splitStop);
    ……

In my test, the changed code improved performance by more than 30%.
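The range logic proposed in the description can be tried in isolation. The following is a minimal sketch, not the actual patch: the class `SplitRangeClamp` and its methods are hypothetical names, and a local `compare` helper stands in for HBase's `Bytes.compareTo` so that the example runs without HBase on the classpath. It assumes, as the proposed code does, that an empty key means "unbounded".

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: hypothetical names, not part of Hive or HBase.
final class SplitRangeClamp {

    // Lexicographic comparison of row keys, treating each byte as unsigned.
    // This is a local stand-in for HBase's Bytes.compareTo.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // Picks the later of the two start keys.
    // An empty scan start means "from the first row", so the split start wins.
    static byte[] clampStart(byte[] splitStart, byte[] scanStart) {
        return (scanStart.length == 0 || compare(splitStart, scanStart) >= 0)
                ? splitStart : scanStart;
    }

    // Picks the earlier of the two stop keys.
    // An empty split end means "to the last row", so the scan stop wins.
    static byte[] clampStop(byte[] splitEnd, byte[] scanStop) {
        return ((scanStop.length == 0 || compare(splitEnd, scanStop) <= 0)
                && splitEnd.length > 0)
                ? splitEnd : scanStop;
    }

    static byte[] key(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    static String str(byte[] k) {
        return new String(k, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical query range [b, y) over three hypothetical regions.
        byte[] scanStart = key("b");
        byte[] scanStop = key("y");
        String[][] regions = {{"", "f"}, {"f", "m"}, {"m", ""}};
        for (String[] r : regions) {
            byte[] start = clampStart(key(r[0]), scanStart);
            byte[] stop = clampStop(key(r[1]), scanStop);
            System.out.println("[" + str(start) + ", " + str(stop) + ")");
        }
    }
}
```

With these inputs each region gets its own slice of the query range, [b, f), [f, m) and [m, y), instead of every task scanning the full [b, y) range.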
[jira] [Updated] (HIVE-3672) Support altering partition column type in Hive
[ https://issues.apache.org/jira/browse/HIVE-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jingwei Lu updated HIVE-3672:
    Attachment: HIVE-3672.2.patch.txt

New patch fixed according to comments.

Support altering partition column type in Hive
    Key: HIVE-3672
    URL: https://issues.apache.org/jira/browse/HIVE-3672
    Project: Hive
    Issue Type: Improvement
    Components: CLI, SQL
    Affects Versions: 0.10.0
    Reporter: Jingwei Lu
    Assignee: Jingwei Lu
    Fix For: 0.10.0
    Attachments: HIVE-3672.1.patch.txt, HIVE-3672.2.patch.txt
    Original Estimate: 72h
    Remaining Estimate: 72h

Currently, Hive does not allow altering partition column types. Because we have discouraged users from using non-string partition column types, this presents a problem for users who want to change their partition columns to strings: they have to rename their table, create a new table, and copy all the data over. To support this via the CLI, add a command like:

    ALTER TABLE table_name PARTITION COLUMN (column_name new type);
[jira] [Updated] (HIVE-3420) Inefficiency in hbase handler when process query including rowkey range scan
[ https://issues.apache.org/jira/browse/HIVE-3420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-3420: Affects Version/s: (was: 0.9.0) Status: Patch Available (was: Open) Inefficiency in hbase handler when process query including rowkey range scan Key: HIVE-3420 URL: https://issues.apache.org/jira/browse/HIVE-3420 Project: Hive Issue Type: Improvement Components: HBase Handler Environment: Hive-0.9.0 + HBase-0.94.1 Reporter: Gang Deng Priority: Critical Attachments: HIVE-3420.D7311.1.patch Original Estimate: 2h Remaining Estimate: 2h
[jira] [Updated] (HIVE-3420) Inefficiency in hbase handler when process query including rowkey range scan
[ https://issues.apache.org/jira/browse/HIVE-3420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-3420:
    Attachment: HIVE-3420.D7311.1.patch

navis requested code review of HIVE-3420 [jira] Inefficiency in hbase handler when process query including rowkey range scan.

Reviewers: JIRA

DPAL-1943 Inefficiency in hbase handler when process query including rowkey range scan

When Hive is queried with an HBase rowkey range, the Hive map tasks do not use the startrow and endrow information in the tablesplit. For example, if the rowkeys fit into 5 HBase files, there will be 5 map tasks. Ideally, each task would process 1 file, but in the current implementation each task processes all 5 files. This behavior not only wastes network bandwidth but also worsens lock contention in the HBase block cache, because every task has to access the same blocks.

The problem code is in HiveHBaseTableInputFormat.convertFilter, as below:

    ……
    if (tableSplit != null) {
      tableSplit = new TableSplit(
        tableSplit.getTableName(),
        startRow,
        stopRow,
        tableSplit.getRegionLocation());
    }
    scan.setStartRow(startRow);
    scan.setStopRow(stopRow);
    ……

Because tableSplit already includes the startRow and endRow of its file, a better implementation would be:

    ……
    byte[] splitStart = startRow;
    byte[] splitStop = stopRow;
    if (tableSplit != null) {
      if (tableSplit.getStartRow() != null) {
        splitStart = startRow.length == 0 ||
          Bytes.compareTo(tableSplit.getStartRow(), startRow) >= 0 ?
            tableSplit.getStartRow() : startRow;
      }
      if (tableSplit.getEndRow() != null) {
        splitStop = (stopRow.length == 0 ||
          Bytes.compareTo(tableSplit.getEndRow(), stopRow) <= 0) &&
          tableSplit.getEndRow().length > 0 ?
            tableSplit.getEndRow() : stopRow;
      }
      tableSplit = new TableSplit(
        tableSplit.getTableName(),
        splitStart,
        splitStop,
        tableSplit.getRegionLocation());
    }
    scan.setStartRow(splitStart);
    scan.setStopRow(splitStop);
    ……

In my test, the changed code improved performance by more than 30%.

TEST PLAN
    EMPTY

REVISION DETAIL
    https://reviews.facebook.net/D7311

AFFECTED FILES
    hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java

MANAGE HERALD DIFFERENTIAL RULES
    https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
    https://reviews.facebook.net/herald/transcript/17415/

To: JIRA, navis

Inefficiency in hbase handler when process query including rowkey range scan
    Key: HIVE-3420
    URL: https://issues.apache.org/jira/browse/HIVE-3420
    Project: Hive
    Issue Type: Improvement
    Components: HBase Handler
    Environment: Hive-0.9.0 + HBase-0.94.1
    Reporter: Gang Deng
    Priority: Critical
    Attachments: HIVE-3420.D7311.1.patch
    Original Estimate: 2h
    Remaining Estimate: 2h

When Hive is queried with an HBase rowkey range, the Hive map tasks do not use the startrow and endrow information in the tablesplit. For example, if the rowkeys fit into 5 HBase files, there will be 5 map tasks. Ideally, each task would process 1 file, but in the current implementation each task processes all 5 files. This behavior not only wastes network bandwidth but also worsens lock contention in the HBase block cache, because every task has to access the same blocks.

The problem code is in HiveHBaseTableInputFormat.convertFilter, as below:

    ……
    if (tableSplit != null) {
      tableSplit = new TableSplit(
        tableSplit.getTableName(),
        startRow,
        stopRow,
        tableSplit.getRegionLocation());
    }
    scan.setStartRow(startRow);
    scan.setStopRow(stopRow);
    ……

Because tableSplit already includes the startRow and endRow of its file, a better implementation would be:

    ……
    byte[] splitStart = startRow;
    byte[] splitStop = stopRow;
    if (tableSplit != null) {
      if (tableSplit.getStartRow() != null) {
        splitStart = startRow.length == 0 ||
          Bytes.compareTo(tableSplit.getStartRow(), startRow) >= 0 ?
            tableSplit.getStartRow() : startRow;
      }
      if (tableSplit.getEndRow() != null) {
        splitStop = (stopRow.length == 0 ||
          Bytes.compareTo(tableSplit.getEndRow(), stopRow) <= 0) &&
          tableSplit.getEndRow().length > 0 ?
tableSplit.getEndRow() : stopRow; } tableSplit = new TableSplit(