Re: --hiveconf vs -hiveconf
All occurrences of -hiveconf in the wiki have been changed to --hiveconf, except for one new sentence in the CLI command line options section (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli#LanguageManualCli-HiveCommandLineOptions), which says -hiveconf is also supported. The list of docs changed is in the first March 8th message in this thread.

-- Lefty

On Sat, Mar 8, 2014 at 11:55 PM, Lefty Leverenz leftylever...@gmail.com wrote:

What's the difference between double-dash options and single-dash options?

-- Lefty

On Sat, Mar 8, 2014 at 9:40 AM, Edward Capriolo edlinuxg...@gmail.com wrote:

Great, thanks for following up. There might be a number of ETL processes in the wild saying -hiveconf, which is why it is important to keep it around for the CLI at least.

On Sat, Mar 8, 2014 at 1:56 AM, Xuefu Zhang xzh...@cloudera.com wrote:

This is just getting more and more interesting. I never thought of the -hiveconf option, and always assumed it was a typo of --hiveconf. (That's why I edited the one, which triggered the discovery.) I just checked and found that both work, much to my surprise. With this assumption, Beeline has implemented only --hiveconf to mimic the CLI. As to the documentation, I think we can stick to --hiveconf from now on, since it is supported by both the CLI and Beeline. However, -hiveconf will continue to work for the CLI until its death.

Thanks, Xuefu

On Fri, Mar 7, 2014 at 10:36 PM, Lefty Leverenz leftylever...@gmail.com wrote:

> OK, so just one of the pages in wiki has changed, and hive behavior has not changed

That's right, and a closer look at the wiki shows that all the examples are -hiveconf except the new change. The only place --hiveconf appears is in duplications of help messages for the hive command, the old Hive server, or Beeline.

In a fresh export of the wiki, --hiveconf occurs in these docs:
- CLI https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli#LanguageManualCli-HiveCommandLineOptions repeats what hive -H says (--hiveconf) but gives 3 examples of -hiveconf.
- Admin Config https://cwiki.apache.org/confluence/display/Hive/AdminManual+Configuration#AdminManualConfiguration-ConfiguringHive says --hiveconf twice, in text and an example (both changed this week).
- Hive Server https://cwiki.apache.org/confluence/display/Hive/HiveServer says --hiveconf once, but that's the Thrift server help message.
- HiveServer2 Clients https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-BeelineCommandOptions says --hiveconf twice, but that's the Beeline option.

These wikidocs say -hiveconf:
- Getting Started (4 in config overview https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-ConfigurationManagementOverview and 2 in error logs https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-ErrorLogs )
- Avro SerDe https://cwiki.apache.org/confluence/display/Hive/AvroSerDe#AvroSerDe-SpecifyingtheAvroschemaforatable (2 in example and text)
- Developer Guide https://cwiki.apache.org/confluence/display/Hive/DeveloperGuide#DeveloperGuide-RunningHiveWithoutaHadoopCluster (4 in export HIVE_OPTS)
- HBase Integration https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration#HBaseIntegration-Usage (2 in examples)
- Variable Substitution https://cwiki.apache.org/confluence/display/Hive/LanguageManual+VariableSubstitution (1 in the evil laugh example)
- CLI (2 in one example https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli#LanguageManualCli-Examples , 1 in logging https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli#LanguageManualCli-Logging )

(My grep hits were inflated because -i caught HiveConf.)

So what's it supposed to be?

-- Lefty

On Fri, Mar 7, 2014 at 11:06 PM, Thejas Nair the...@hortonworks.com wrote:

OK, so just one of the pages in wiki has changed, and hive behavior has not changed? (I have been using -hiveconf, but I haven't verified that with the tip of the trunk as of now.)

On Fri, Mar 7, 2014 at 6:19 PM, Xuefu Zhang xzh...@cloudera.com wrote:

I didn't know that -hiveconf is supported. However, from hive -H, double dashes are seen.

 -h hostname                   connecting to Hive Server on remote host
    --hiveconf property=value  Use value for given property
    --hivevar key=value        Variable subsitution to apply to hive

Thanks, Xuefu

On Fri, Mar 7, 2014 at 6:00 PM, Edward Capriolo edlinuxg...@gmail.com wrote:

I was not around when this change was made
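The behavior discussed in this thread, where the CLI accepts both -hiveconf and --hiveconf, can be illustrated with a small sketch. This is not Hive's actual option parser; the class HiveConfArgs and its parse method are hypothetical names used only to show how a front end might treat the two spellings as equivalent.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch, not Hive's real parser: treats -hiveconf and --hiveconf alike. */
public class HiveConfArgs {

    /** Collects the property=value pairs that follow either spelling of the option. */
    public static Map<String, String> parse(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        for (int i = 0; i < args.length; i++) {
            boolean isHiveConf = args[i].equals("-hiveconf") || args[i].equals("--hiveconf");
            if (!isHiveConf) {
                continue; // every other option is ignored in this sketch
            }
            if (i + 1 >= args.length) {
                throw new IllegalArgumentException(args[i] + " requires property=value");
            }
            String pair = args[++i];
            int eq = pair.indexOf('='); // split on the first '=' only
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected property=value but got: " + pair);
            }
            props.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return props;
    }

    public static void main(String[] args) {
        String[] sample = {
            "-hiveconf", "hive.root.logger=INFO,console",
            "--hiveconf", "mapred.reduce.tasks=32"
        };
        System.out.println(parse(sample));
    }
}
```

A front end written this way would keep old ETL scripts that say -hiveconf working while the documentation standardizes on --hiveconf.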
[jira] [Assigned] (HIVE-7606) Design SparkSession, SparkSessionManager
[ https://issues.apache.org/jira/browse/HIVE-7606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Venki Korukanti reassigned HIVE-7606:

Assignee: Venki Korukanti

Design SparkSession, SparkSessionManager

Key: HIVE-7606
URL: https://issues.apache.org/jira/browse/HIVE-7606
Project: Hive
Issue Type: Task
Reporter: Brock Noland
Assignee: Venki Korukanti

In this JIRA we'll design two interfaces:
* SparkSessionState
* SparkSessionPoolManager

and then, once that is agreed upon, we'll design two implementations:
* SparkSessionStateImpl
* SparkSessionPoolManagerImpl

The form and function of these will be similar to the Tez equivalents. However, TezSessionState provides some implementation which SparkClient already provides (refreshLocalResources*). Let's keep SparkSessionState lightweight and not remove functionality from SparkClient.

The scope of this jira is just to create the shells and basic functionality. The implementations in this jira should be able to:
* Share a SparkSessionImpl across queries
* Define when a session can be re-used
* Take ownership of SparkContext objects (note that we can only have a single SC until SPARK-2243 is resolved)

-- This message was sent by Atlassian JIRA (v6.2#6252)
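As a rough illustration of the "shells" described above, the following sketch shows one way the two interfaces and their implementations could fit together. Only the four type names come from the JIRA; every method (open, close, isOpen, getSession, returnSession) and the reuse rule (a returned session is reused only while it is still open) are assumptions made for this example, and no SparkContext is involved.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch of the shells named in HIVE-7606; all method names are assumed. */
public class SparkSessionSketch {

    /** Lightweight session handle; the JIRA names the type but not its methods. */
    public interface SparkSessionState {
        void open();
        void close();
        boolean isOpen();
    }

    /** Hands out sessions so that one session can serve several queries. */
    public interface SparkSessionPoolManager {
        SparkSessionState getSession();
        void returnSession(SparkSessionState session);
    }

    public static class SparkSessionStateImpl implements SparkSessionState {
        private boolean open;
        @Override public void open() { open = true; }
        @Override public void close() { open = false; }
        @Override public boolean isOpen() { return open; }
    }

    public static class SparkSessionPoolManagerImpl implements SparkSessionPoolManager {
        private final Deque<SparkSessionState> idle = new ArrayDeque<>();

        @Override
        public synchronized SparkSessionState getSession() {
            SparkSessionState session = idle.poll();
            if (session == null || !session.isOpen()) {
                // Assumed reuse rule: only sessions that are still open are handed out again.
                session = new SparkSessionStateImpl();
                session.open();
            }
            return session;
        }

        @Override
        public synchronized void returnSession(SparkSessionState session) {
            if (session.isOpen()) {
                idle.push(session); // closed sessions are simply dropped
            }
        }
    }
}
```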
[jira] [Updated] (HIVE-7593) Instantiate SparkClient per user session
[ https://issues.apache.org/jira/browse/HIVE-7593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chinna Rao Lalam updated HIVE-7593:

Attachment: HIVE-7593-spark.patch

Instantiate SparkClient per user session

Key: HIVE-7593
URL: https://issues.apache.org/jira/browse/HIVE-7593
Project: Hive
Issue Type: Sub-task
Components: Spark
Reporter: Xuefu Zhang
Assignee: Chinna Rao Lalam
Attachments: HIVE-7593-spark.patch

SparkContext is the main class via which Hive talks to the Spark cluster. SparkClient encapsulates a SparkContext instance. Currently all user sessions share a single SparkClient instance in HiveServer2. While this is good enough for a POC, and even for our first two milestones, it is not desirable in a multi-tenant environment and gives Hive users the least flexibility. Here is what we propose:

1. Have a SparkClient instance per user session. The SparkClient instance is created when the user executes the first query in the session. It is destroyed when the user session ends.
2. The SparkClient is instantiated based on the Spark configurations that are available to the user, including those defined at the global level and those overwritten by the user (through the set command, for instance).
3. Ideally, when the user changes any Spark configuration during the session, the old SparkClient instance should be destroyed and a new one created based on the new configurations. This may turn out to be a little hard, and thus it's a nice-to-have. If not implemented, we need to document that subsequent configuration changes will not take effect in the current session.

Please note that there is a thread-safety issue on the Spark side where multiple SparkContext instances cannot coexist in the same JVM (SPARK-2243). We need to work with the Spark community to get this addressed.

Besides the above functional requirements, avoiding potential issues is also a consideration. For instance, sharing an SC among users is bad, as resources (such as jars for UDFs) will also be shared, which is problematic. On the other hand, one SC per job seems too expensive, as the resources need to be re-rendered even when there isn't any change.
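The lifecycle in items 1 to 3 of the proposal can be sketched as follows. This is only an illustration under stated assumptions: FakeSparkClient is a stand-in that holds a copy of the configuration instead of a real SparkContext, UserSession and its methods are hypothetical names, and treating any key that starts with "spark." as a Spark configuration is an assumption of the sketch.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a per-session client lifecycle; not the HIVE-7593 patch. */
public class PerSessionClientSketch {

    /** Stand-in for SparkClient; it records its configuration instead of wrapping a SparkContext. */
    public static class FakeSparkClient {
        public final Map<String, String> conf;
        public boolean closed;

        FakeSparkClient(Map<String, String> conf) {
            this.conf = new HashMap<>(conf); // snapshot of the configuration at creation time
        }

        void close() {
            closed = true;
        }
    }

    public static class UserSession {
        private final Map<String, String> conf = new HashMap<>();
        private FakeSparkClient client;

        /** Item 2: start from the global configuration; later set calls overwrite it. */
        public UserSession(Map<String, String> globalConf) {
            conf.putAll(globalConf);
        }

        /** Item 3: a changed Spark setting discards the old client so the next query rebuilds it. */
        public void set(String key, String value) {
            conf.put(key, value);
            if (key.startsWith("spark.") && client != null) { // assumed naming convention
                client.close();
                client = null;
            }
        }

        /** Item 1: the client is created lazily when the first query runs. */
        public FakeSparkClient clientForQuery() {
            if (client == null) {
                client = new FakeSparkClient(conf);
            }
            return client;
        }

        /** Item 1: the client is destroyed when the session ends. */
        public void end() {
            if (client != null) {
                client.close();
                client = null;
            }
        }
    }
}
```

Because of the single-SparkContext limitation mentioned above (SPARK-2243), a real implementation could not simply create one context per session in the same JVM; the sketch sidesteps that by not creating any.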
[jira] [Commented] (HIVE-6959) Enable Constant propagation optimizer for Hive Vectorization
[ https://issues.apache.org/jira/browse/HIVE-6959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092194#comment-14092194 ]

Ashutosh Chauhan commented on HIVE-6959:

+1

Enable Constant propagation optimizer for Hive Vectorization

Key: HIVE-6959
URL: https://issues.apache.org/jira/browse/HIVE-6959
Project: Hive
Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
Attachments: HIVE-6959.1.patch, HIVE-6959.2.patch, HIVE-6959.4.patch, HIVE-6959.5.patch, HIVE-6959.6.patch

HIVE-5771 covers the constant propagation optimizer for Hive. Now that HIVE-5771 is committed, we should remove any vectorization-related code that duplicates this feature. For example, a function to be cleaned up is VectorizationContext::foldConstantsForUnaryExprs(). In addition to this change, constant propagation should kick in when vectorization is enabled, i.e. we need to lift the HIVE_VECTORIZATION_ENABLED restriction inside ConstantPropagate::transform().
[jira] [Updated] (HIVE-7506) MetadataUpdater: provide a mechanism to edit the statistics of a column in a table (or a partition of a table)
[ https://issues.apache.org/jira/browse/HIVE-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

pengcheng xiong updated HIVE-7506:

Attachment: HIVE-7506.5.patch

New patch after addressing all of Lars's and Ashutosh's comments.

MetadataUpdater: provide a mechanism to edit the statistics of a column in a table (or a partition of a table)

Key: HIVE-7506
URL: https://issues.apache.org/jira/browse/HIVE-7506
Project: Hive
Issue Type: New Feature
Components: Statistics
Reporter: pengcheng xiong
Assignee: pengcheng xiong
Priority: Minor
Attachments: HIVE-7506.1.patch, HIVE-7506.3.patch, HIVE-7506.4.patch, HIVE-7506.5.patch, HIVE-7506.patch
Original Estimate: 252h
Remaining Estimate: 252h

Two motivations:
(1) The Cost-based Optimizer (CBO) depends heavily on the statistics of a column in a table (or a partition of a table). If we would like to test whether CBO chooses the best plan under different statistics, it would be time-consuming to load the whole table and create the statistics from the ground up.
(2) As the database runs, the statistics of a column in a table (or a partition of a table) may change. We need a mechanism to keep them synchronized.

We propose the following command to achieve that:

ALTER TABLE table_name PARTITION partition_spec [COLUMN col_name] UPDATE STATISTICS col_statistics [COMMENT col_comment]
[jira] [Commented] (HIVE-7647) Beeline does not honor --headerInterval and --color when executing with -e
[ https://issues.apache.org/jira/browse/HIVE-7647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092252#comment-14092252 ]

Xuefu Zhang commented on HIVE-7647:

[~ngangam] The change seems simple and reasonable, but could you please check the code history to find out how those few lines got in in the first place? I just want to make sure that we don't introduce regressions with this.

Beeline does not honor --headerInterval and --color when executing with -e

Key: HIVE-7647
URL: https://issues.apache.org/jira/browse/HIVE-7647
Project: Hive
Issue Type: Bug
Components: CLI
Affects Versions: 0.14.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam
Priority: Minor
Fix For: 0.14.0
Attachments: HIVE-7647.1.patch

--showHeader is being honored:

[root@localhost ~]# beeline --showHeader=false -u 'jdbc:hive2://localhost:1/default' -n hive -d org.apache.hive.jdbc.HiveDriver -e select * from sample_07 limit 10;
Connecting to jdbc:hive2://localhost:1/default
Connected to: Apache Hive (version 0.12.0-cdh5.0.1)
Driver: Hive JDBC (version 0.12.0-cdh5.0.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
-hiveconf (No such file or directory)
+---------+-------------------------------------+-----------+--------+
| 00-     | All Occupations                     | 135185230 | 42270  |
| 11-     | Management occupations              | 6152650   | 100310 |
| 11-1011 | Chief executives                    | 301930    | 160440 |
| 11-1021 | General and operations managers     | 1697690   | 107970 |
| 11-1031 | Legislators                         | 64650     | 37980  |
| 11-2011 | Advertising and promotions managers | 36100     | 94720  |
| 11-2021 | Marketing managers                  | 166790    | 118160 |
| 11-2022 | Sales managers                      | 333910    | 110390 |
| 11-2031 | Public relations managers           | 51730     | 101220 |
| 11-3011 | Administrative services managers    | 246930    | 79500  |
+---------+-------------------------------------+-----------+--------+
10 rows selected (0.838 seconds)
Beeline version 0.12.0-cdh5.1.0 by Apache Hive
Closing: org.apache.hive.jdbc.HiveConnection

--outputFormat is being honored:

[root@localhost ~]# beeline --outputFormat=csv -u 'jdbc:hive2://localhost:1/default' -n hive -d org.apache.hive.jdbc.HiveDriver -e select * from sample_07 limit 10;
Connecting to jdbc:hive2://localhost:1/default
Connected to: Apache Hive (version 0.12.0-cdh5.0.1)
Driver: Hive JDBC (version 0.12.0-cdh5.0.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
'code','description','total_emp','salary'
'00-','All Occupations','135185230','42270'
'11-','Management occupations','6152650','100310'
'11-1011','Chief executives','301930','160440'
'11-1021','General and operations managers','1697690','107970'
'11-1031','Legislators','64650','37980'
'11-2011','Advertising and promotions managers','36100','94720'
'11-2021','Marketing managers','166790','118160'
'11-2022','Sales managers','333910','110390'
'11-2031','Public relations managers','51730','101220'
'11-3011','Administrative services managers','246930','79500'
10 rows selected (0.664 seconds)
Beeline version 0.12.0-cdh5.1.0 by Apache Hive
Closing: org.apache.hive.jdbc.HiveConnection

Both --color and --headerInterval are honored when executing with the -f option, which reads the query from a file rather than the command line. (The color cannot be seen here, but it uses the terminal colors.)

[root@localhost ~]# beeline --showheader=true --color=true --headerInterval=5 -u 'jdbc:hive2://localhost:1/default' -n hive -d org.apache.hive.jdbc.HiveDriver -f /tmp/tmp.sql
Connecting to jdbc:hive2://localhost:1/default
Connected to: Apache Hive (version 0.12.0-cdh5.0.1)
Driver: Hive JDBC (version 0.12.0-cdh5.0.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 0.12.0-cdh5.1.0 by Apache Hive
0: jdbc:hive2://localhost select * from sample_07 limit 8;
+---------+-------------------------------------+-----------+--------+
| code    | description                         | total_emp | salary |
+---------+-------------------------------------+-----------+--------+
| 00-     | All Occupations                     | 135185230 | 42270  |
| 11-     | Management occupations              | 6152650   | 100310 |
| 11-1011 | Chief executives                    | 301930    | 160440 |
| 11-1021 | General and operations managers     | 1697690   | 107970 |
| 11-1031 | Legislators                         | 64650     | 37980  |
+---------+-------------------------------------+-----------+--------+
|
[jira] [Commented] (HIVE-7452) Boolean comparison is done through reference equality rather than using equals
[ https://issues.apache.org/jira/browse/HIVE-7452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092283#comment-14092283 ]

KangHS commented on HIVE-7452:

Thanks, Ashutosh Chauhan :)

Boolean comparison is done through reference equality rather than using equals

Key: HIVE-7452
URL: https://issues.apache.org/jira/browse/HIVE-7452
Project: Hive
Issue Type: Bug
Reporter: Ted Yu
Assignee: KangHS
Priority: Minor
Fix For: 0.14.0
Attachments: HIVE-7452.patch

In Driver#doAuthorization():
{code}
if (tbl != null && !tableAuthChecked.contains(tbl.getTableName()) &&
    !(tableUsePartLevelAuth.get(tbl.getTableName()) == Boolean.TRUE)) {
{code}
The above comparison should be done using the .equals() method. The comparison below doesn't evaluate to true:
{code}
Boolean b = new Boolean(true);
if (b == Boolean.TRUE) {
{code}
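A minimal, self-contained demonstration of the problem described above. The class and method names are made up for this example and are not from the Hive patch; the point is only that == on Boolean compares object identity, while Boolean.TRUE.equals(...) compares values and also tolerates null (for instance, a missing map entry).

```java
/** Illustrative only: contrasts reference equality with equals() on Boolean. */
public class BooleanEqualityDemo {

    /** Reference comparison: true only when both operands are the same object. */
    public static boolean sameReference(Boolean a, Boolean b) {
        return a == b;
    }

    /** Value comparison; returns false instead of throwing when value is null. */
    public static boolean isTrue(Boolean value) {
        return Boolean.TRUE.equals(value);
    }

    public static void main(String[] args) {
        // The constructor is deprecated but creates a distinct object, which is what exposes the bug.
        @SuppressWarnings({"deprecation", "removal"})
        Boolean b = new Boolean(true);

        System.out.println(sameReference(b, Boolean.TRUE)); // prints false
        System.out.println(isTrue(b));                      // prints true
        System.out.println(isTrue(null));                   // prints false
    }
}
```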
[jira] [Commented] (HIVE-7600) ConstantPropagateProcFactory uses reference equality on Boolean
[ https://issues.apache.org/jira/browse/HIVE-7600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092284#comment-14092284 ]

KangHS commented on HIVE-7600:

Thanks, Ashutosh Chauhan :)

ConstantPropagateProcFactory uses reference equality on Boolean

Key: HIVE-7600
URL: https://issues.apache.org/jira/browse/HIVE-7600
Project: Hive
Issue Type: Bug
Reporter: Ted Yu
Assignee: KangHS
Fix For: 0.14.0
Attachments: HIVE-7600.patch

shortcutFunction() has the following code:
{code}
if (c.getValue() == Boolean.FALSE) {
{code}
Boolean.FALSE.equals() should be used. There are a few other occurrences of using reference equality on Boolean in this class.
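The suggested form has a further benefit when the value is held as an Object, as the getValue() call in the snippet suggests: Boolean.FALSE.equals(x) is false for null and for non-Boolean values without any cast. The sketch below assumes a stand-in Constant class with an Object-typed getValue(); it is not Hive's constant descriptor class, and the helper name is hypothetical.

```java
/** Hypothetical sketch: null-safe, type-safe check for a constant FALSE value. */
public class ConstantShortcutSketch {

    /** Stand-in for a constant expression node whose value is stored as an Object. */
    public static class Constant {
        private final Object value;

        public Constant(Object value) {
            this.value = value;
        }

        public Object getValue() {
            return value;
        }
    }

    /** True only when the constant holds a Boolean with the value false. */
    public static boolean isConstantFalse(Constant c) {
        return Boolean.FALSE.equals(c.getValue());
    }
}
```

For example, isConstantFalse(new Constant(null)) and isConstantFalse(new Constant("false")) both return false, because neither value is a Boolean equal to FALSE.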
Re: Review Request 23674: Handle db qualified names consistently across all HiveQL statements
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23674/
-----------------------------------------------------------

(Updated Aug. 11, 2014, 12:53 a.m.)

Review request for hive and Thejas Nair.

Changes
-------

Supports renaming a table across databases.

Bugs: HIVE-4064
https://issues.apache.org/jira/browse/HIVE-4064

Repository: hive-git

Description
-------

Hive doesn't consistently handle db qualified names across all HiveQL statements. While some HiveQL statements such as SELECT support DB qualified names, others such as CREATE INDEX don't.

Diffs (updated)
-----

itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/authorization/plugin/TestHiveAuthorizerCheckInvocation.java c91b15c
itests/util/src/main/java/org/apache/hadoop/hive/ql/hooks/CheckColumnAccessHook.java 14fc430
metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java ea866c5
metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 6e689d0
metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 5a56ced
metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 760777a
metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java 74b1432
ql/src/java/org/apache/hadoop/hive/ql/Driver.java ea6ddbf
ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 376e040
ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java d22b1f6
ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 39b032e
ql/src/java/org/apache/hadoop/hive/ql/optimizer/IndexUtils.java 2e32fee
ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteGBUsingIndex.java 989d0b5
ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 22945e3
ql/src/java/org/apache/hadoop/hive/ql/parse/ColumnAccessInfo.java 939dc65
ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 67a3aa7
ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g ab1188a
ql/src/java/org/apache/hadoop/hive/ql/parse/IndexUpdater.java 856ec2f
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 7b86414
ql/src/java/org/apache/hadoop/hive/ql/parse/authorization/HiveAuthorizationTaskFactoryImpl.java 826bdf3
ql/src/java/org/apache/hadoop/hive/ql/plan/AlterIndexDesc.java 0318e4b
ql/src/java/org/apache/hadoop/hive/ql/plan/AlterTableAlterPartDesc.java cf67e16
ql/src/java/org/apache/hadoop/hive/ql/plan/AlterTableSimpleDesc.java 541675c
ql/src/java/org/apache/hadoop/hive/ql/plan/PrivilegeObjectDesc.java 9417220
ql/src/java/org/apache/hadoop/hive/ql/plan/RenamePartitionDesc.java 1b5fb9e
ql/src/java/org/apache/hadoop/hive/ql/plan/ShowColumnsDesc.java fe6a91e
ql/src/java/org/apache/hadoop/hive/ql/plan/ShowGrantDesc.java aa88153
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/AuthorizationUtils.java 5c94217
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HivePrivilegeObject.java 9e9ef71
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveV1Authorizer.java fbc0090
ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java 98c2924
ql/src/test/org/apache/hadoop/hive/ql/parse/TestQBCompact.java 5f32d5f
ql/src/test/org/apache/hadoop/hive/ql/parse/authorization/PrivilegesTestBase.java 93901ec
ql/src/test/org/apache/hadoop/hive/ql/parse/authorization/TestHiveAuthorizationTaskFactory.java ab0d80e
ql/src/test/org/apache/hadoop/hive/ql/parse/authorization/TestPrivilegesV1.java fd827ad
ql/src/test/org/apache/hadoop/hive/ql/parse/authorization/TestPrivilegesV2.java 9499986
ql/src/test/queries/clientpositive/alter_rename_table.q PRE-CREATION
ql/src/test/results/clientnegative/alter_concatenate_indexed_table.q.out 500d45d
ql/src/test/results/clientnegative/alter_view_failure6.q.out cfbaca8
ql/src/test/results/clientnegative/merge_negative_1.q.out 95f6678
ql/src/test/results/clientnegative/merge_negative_2.q.out b3422e1
ql/src/test/results/clientnegative/show_columns3.q.out 09068b7
ql/src/test/results/clientnegative/show_tableproperties1.q.out ca54088
ql/src/test/results/clientnegative/temp_table_index.q.out 8ec5c0a
ql/src/test/results/clientpositive/alter_rename_table.q.out PRE-CREATION
ql/src/test/results/clientpositive/drop_multi_partitions.q.out eae57f3
ql/src/test/results/clientpositive/input3.q.out 547449c
ql/src/test/results/clientpositive/insert2_overwrite_partitions.q.out 21bd257
ql/src/test/results/clientpositive/show_create_table_db_table.q.out 0119471
ql/src/test/results/clientpositive/show_tblproperties.q.out 80db5e4
ql/src/test/results/clientpositive/temp_table_names.q.out 940684c
ql/src/test/results/clientpositive/temp_table_precedence.q.out 1075b2c

Diff: https://reviews.apache.org/r/23674/diff/

Testing
-------

Thanks,
Navis Ryu
[jira] [Updated] (HIVE-4064) Handle db qualified names consistently across all HiveQL statements
[ https://issues.apache.org/jira/browse/HIVE-4064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-4064:

Attachment: HIVE-4064.7.patch.txt

Handle db qualified names consistently across all HiveQL statements

Key: HIVE-4064
URL: https://issues.apache.org/jira/browse/HIVE-4064
Project: Hive
Issue Type: Bug
Components: SQL
Affects Versions: 0.10.0
Reporter: Shreepadma Venugopalan
Assignee: Navis
Attachments: HIVE-4064-1.patch, HIVE-4064.1.patch.txt, HIVE-4064.2.patch.txt, HIVE-4064.3.patch.txt, HIVE-4064.4.patch.txt, HIVE-4064.5.patch.txt, HIVE-4064.6.patch.txt, HIVE-4064.7.patch.txt

Hive doesn't consistently handle db qualified names across all HiveQL statements. While some HiveQL statements such as SELECT support DB qualified names, others such as CREATE INDEX don't.
[jira] [Commented] (HIVE-7205) Wrong results when union all of grouping followed by group by with correlation optimization
[ https://issues.apache.org/jira/browse/HIVE-7205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092308#comment-14092308 ]

Navis commented on HIVE-7205:

[~ashutoshc] IMO, [~yhuai] is the original author of this optimizer and should get a chance to look into this. This is a correctness issue, but it can be worked around simply by disabling the optimizer (which is disabled by default).

Wrong results when union all of grouping followed by group by with correlation optimization

Key: HIVE-7205
URL: https://issues.apache.org/jira/browse/HIVE-7205
Project: Hive
Issue Type: Bug
Affects Versions: 0.12.0, 0.13.0, 0.13.1
Reporter: dima machlin
Assignee: Navis
Priority: Critical
Attachments: HIVE-7205.1.patch.txt, HIVE-7205.2.patch.txt, HIVE-7205.3.patch.txt

Use case: table TBL (a string, b string) contains a single row: 'a','a'

The following query:
{code:sql}
select b, sum(cc) from (
  select b, count(1) as cc from TBL group by b
  union all
  select a as b, count(1) as cc from TBL group by a
) z
group by b
{code}
returns
a 1
a 1
while set hive.optimize.correlation=true;
If we change to set hive.optimize.correlation=false; it returns the correct results:
a 2

The plan with correlation optimization:
{code:sql}
ABSTRACT SYNTAX TREE:
(TOK_QUERY (TOK_FROM (TOK_SUBQUERY (TOK_UNION (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME DB TBL))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL b)) (TOK_SELEXPR (TOK_FUNCTION count 1) cc)) (TOK_GROUPBY (TOK_TABLE_OR_COL b (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME DB TBL))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL a) b) (TOK_SELEXPR (TOK_FUNCTION count 1) cc)) (TOK_GROUPBY (TOK_TABLE_OR_COL a) z)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL b)) (TOK_SELEXPR (TOK_FUNCTION sum (TOK_TABLE_OR_COL cc (TOK_GROUPBY (TOK_TABLE_OR_COL b

STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Alias -> Map Operator Tree:
        null-subquery1:z-subquery1:TBL
          TableScan
            alias: TBL
            Select Operator
              expressions:
                    expr: b
                    type: string
              outputColumnNames: b
              Group By Operator
                aggregations:
                      expr: count(1)
                bucketGroup: false
                keys:
                      expr: b
                      type: string
                mode: hash
                outputColumnNames: _col0, _col1
                Reduce Output Operator
                  key expressions:
                        expr: _col0
                        type: string
                  sort order: +
                  Map-reduce partition columns:
                        expr: _col0
                        type: string
                  tag: 0
                  value expressions:
                        expr: _col1
                        type: bigint
        null-subquery2:z-subquery2:TBL
          TableScan
            alias: TBL
            Select Operator
              expressions:
                    expr: a
                    type: string
              outputColumnNames: a
              Group By Operator
                aggregations:
                      expr: count(1)
                bucketGroup: false
                keys:
                      expr: a
                      type: string
                mode: hash
                outputColumnNames: _col0, _col1
                Reduce Output Operator
                  key expressions:
                        expr: _col0
                        type: string
                  sort order: +
                  Map-reduce partition columns:
                        expr: _col0
                        type: string
                  tag: 1
                  value expressions:
                        expr: _col1
                        type: bigint
      Reduce Operator Tree:
        Demux Operator
          Group By Operator
            aggregations:
                  expr: count(VALUE._col0)
            bucketGroup: false
            keys:
                  expr: KEY._col0
                  type: string
            mode: mergepartial
            outputColumnNames: _col0, _col1
            Select Operator
              expressions:
                    expr: _col0
                    type: string
                    expr: _col1
                    type: bigint
[jira] [Updated] (HIVE-6847) Improve / fix bugs in Hive scratch dir setup
[ https://issues.apache.org/jira/browse/HIVE-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vaibhav Gumashta updated HIVE-6847:

Attachment: HIVE-6847.2.patch

Updated patch for review. cc [~szehon] [~jdere]

Improve / fix bugs in Hive scratch dir setup

Key: HIVE-6847
URL: https://issues.apache.org/jira/browse/HIVE-6847
Project: Hive
Issue Type: Bug
Components: CLI, HiveServer2
Affects Versions: 0.14.0
Reporter: Vikram Dixit K
Assignee: Vaibhav Gumashta
Fix For: 0.14.0
Attachments: HIVE-6847.1.patch, HIVE-6847.2.patch

Currently, the Hive server creates the scratch directory and changes its permissions to 777; however, this is not great with respect to security. We need to create user-specific scratch directories instead. Also refer to the first iteration of the HIVE-6782 patch for the approach.
[jira] [Commented] (HIVE-7638) Disallow CREATE VIEW when created with a temporary table
[ https://issues.apache.org/jira/browse/HIVE-7638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092312#comment-14092312 ]

Navis commented on HIVE-7638:

Couldn't we introduce a temporary view?

Disallow CREATE VIEW when created with a temporary table

Key: HIVE-7638
URL: https://issues.apache.org/jira/browse/HIVE-7638
Project: Hive
Issue Type: Bug
Reporter: Jason Dere
Assignee: Jason Dere
Attachments: HIVE-7638.1.patch

Follow-up item from HIVE-7090: don't allow a view to be created if the view definition references a temp table.
[jira] [Updated] (HIVE-7592) List Jars or Files are not supported by Beeline
[ https://issues.apache.org/jira/browse/HIVE-7592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-7592:

Resolution: Fixed
Fix Version/s: 0.14.0
Status: Resolved (was: Patch Available)

Committed to trunk. Thanks, Szehon, for the review.

List Jars or Files are not supported by Beeline

Key: HIVE-7592
URL: https://issues.apache.org/jira/browse/HIVE-7592
Project: Hive
Issue Type: Bug
Components: CLI
Reporter: Ferdinand Xu
Assignee: Navis
Fix For: 0.14.0
Attachments: HIVE-7592.1.patch.txt

Though adding jars or files is supported by Beeline, listing jars or files is still not supported.
[jira] [Commented] (HIVE-5718) Support direct fetch for lateral views, sub queries, etc.
[ https://issues.apache.org/jira/browse/HIVE-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092317#comment-14092317 ]

Navis commented on HIVE-5718:

tez_join_hash.q looks suspicious, but I cannot reproduce the failure. The others do not seem related.

Support direct fetch for lateral views, sub queries, etc.

Key: HIVE-5718
URL: https://issues.apache.org/jira/browse/HIVE-5718
Project: Hive
Issue Type: Improvement
Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial
Attachments: D13857.1.patch, D13857.2.patch, D13857.3.patch, HIVE-5718.4.patch.txt, HIVE-5718.5.patch.txt, HIVE-5718.6.patch.txt, HIVE-5718.7.patch.txt, HIVE-5718.8.patch.txt

Extend HIVE-2925 with LV and SubQ.
[jira] [Updated] (HIVE-7390) Make quote character optional and configurable in BeeLine CSV/TSV output
[ https://issues.apache.org/jira/browse/HIVE-7390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ferdinand Xu updated HIVE-7390:

Attachment: HIVE-7390.9.patch

Latest version, addressing Lars's comments.

Make quote character optional and configurable in BeeLine CSV/TSV output

Key: HIVE-7390
URL: https://issues.apache.org/jira/browse/HIVE-7390
Project: Hive
Issue Type: New Feature
Components: Clients
Affects Versions: 0.13.1
Reporter: Jim Halfpenny
Assignee: Ferdinand Xu
Attachments: HIVE-7390.1.patch, HIVE-7390.2.patch, HIVE-7390.3.patch, HIVE-7390.4.patch, HIVE-7390.5.patch, HIVE-7390.6.patch, HIVE-7390.7.patch, HIVE-7390.8.patch, HIVE-7390.9.patch, HIVE-7390.patch

Currently, when either the CSV or TSV output format is used in Beeline, each column is wrapped in single quotes. Quote wrapping of columns should be optional, and the user should be able to choose the character used to wrap the columns.
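The requested behavior can be sketched with a small formatter in which the quote character is a parameter and may be omitted. This is an illustration only and not the code in the attached patches: the class QuotedSeparatedValues, the method formatRow, and the convention that a null quote means "no wrapping" are all assumptions, and the sketch does not escape quote or separator characters that occur inside a column.

```java
/** Hypothetical sketch of CSV/TSV row output with an optional, configurable quote character. */
public class QuotedSeparatedValues {

    /**
     * Joins the columns with the separator. Each column is wrapped in the quote
     * character, or left bare when quote is null. No escaping is performed.
     */
    public static String formatRow(String[] columns, char separator, Character quote) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < columns.length; i++) {
            if (i > 0) {
                sb.append(separator);
            }
            if (quote != null) {
                sb.append(quote.charValue());
            }
            sb.append(columns[i]);
            if (quote != null) {
                sb.append(quote.charValue());
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] row = {"11-1011", "Chief executives", "301930", "160440"};
        System.out.println(formatRow(row, ',', '\''));  // current behavior: single quotes
        System.out.println(formatRow(row, ',', '"'));   // user-chosen quote character
        System.out.println(formatRow(row, '\t', null)); // quoting disabled, TSV
    }
}
```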
[jira] [Commented] (HIVE-7647) Beeline does not honor --headerInterval and --color when executing with -e
[ https://issues.apache.org/jira/browse/HIVE-7647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092359#comment-14092359 ]

Naveen Gangam commented on HIVE-7647:

[~xuefuz] I checked the history; it was part of the check-in for HS2 functionality. Based on the note above from [~ashutoshc], this code may have been copied from SQLLine, so it may have been unintentional. Thanks

Beeline does not honor --headerInterval and --color when executing with -e

Key: HIVE-7647
URL: https://issues.apache.org/jira/browse/HIVE-7647
Project: Hive
Issue Type: Bug
Components: CLI
Affects Versions: 0.14.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam
Priority: Minor
Fix For: 0.14.0
Attachments: HIVE-7647.1.patch
[jira] [Updated] (HIVE-6959) Enable Constant propagation optimizer for Hive Vectorization
[ https://issues.apache.org/jira/browse/HIVE-6959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6959: --- Component/s: Vectorization Enable Constant propagation optimizer for Hive Vectorization Key: HIVE-6959 URL: https://issues.apache.org/jira/browse/HIVE-6959 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 0.14.0 Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Fix For: 0.14.0 Attachments: HIVE-6959.1.patch, HIVE-6959.2.patch, HIVE-6959.4.patch, HIVE-6959.5.patch, HIVE-6959.6.patch HIVE-5771 covers Constant propagation optimizer for Hive. Now that HIVE-5771 is committed, we should remove any vectorization related code which duplicates this feature. For example, a fn to be cleaned is VectorizarionContext::foldConstantsForUnaryExprs(). In addition to this change, constant propagation should kick in when vectorization is enabled. i.e. we need to lift the HIVE_VECTORIZATION_ENABLED restriction inside ConstantPropagate::transform(). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6959) Enable Constant propagation optimizer for Hive Vectorization
[ https://issues.apache.org/jira/browse/HIVE-6959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6959: --- Affects Version/s: 0.14.0 Enable Constant propagation optimizer for Hive Vectorization Key: HIVE-6959 URL: https://issues.apache.org/jira/browse/HIVE-6959 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 0.14.0 Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Fix For: 0.14.0 Attachments: HIVE-6959.1.patch, HIVE-6959.2.patch, HIVE-6959.4.patch, HIVE-6959.5.patch, HIVE-6959.6.patch HIVE-5771 covers Constant propagation optimizer for Hive. Now that HIVE-5771 is committed, we should remove any vectorization related code which duplicates this feature. For example, a fn to be cleaned is VectorizarionContext::foldConstantsForUnaryExprs(). In addition to this change, constant propagation should kick in when vectorization is enabled. i.e. we need to lift the HIVE_VECTORIZATION_ENABLED restriction inside ConstantPropagate::transform(). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6959) Enable Constant propagation optimizer for Hive Vectorization
[ https://issues.apache.org/jira/browse/HIVE-6959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6959: --- Resolution: Fixed Fix Version/s: 0.14.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Hari! Enable Constant propagation optimizer for Hive Vectorization Key: HIVE-6959 URL: https://issues.apache.org/jira/browse/HIVE-6959 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 0.14.0 Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Fix For: 0.14.0 Attachments: HIVE-6959.1.patch, HIVE-6959.2.patch, HIVE-6959.4.patch, HIVE-6959.5.patch, HIVE-6959.6.patch HIVE-5771 covers Constant propagation optimizer for Hive. Now that HIVE-5771 is committed, we should remove any vectorization related code which duplicates this feature. For example, a fn to be cleaned is VectorizarionContext::foldConstantsForUnaryExprs(). In addition to this change, constant propagation should kick in when vectorization is enabled. i.e. we need to lift the HIVE_VECTORIZATION_ENABLED restriction inside ConstantPropagate::transform(). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7521) Reference equality is used on Boolean in NullScanOptimizer#WhereFalseProcessor#process()
[ https://issues.apache.org/jira/browse/HIVE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-7521: --- Resolution: Fixed Fix Version/s: 0.14.0 Status: Resolved (was: Patch Available) Reference equality is used on Boolean in NullScanOptimizer#WhereFalseProcessor#process() Key: HIVE-7521 URL: https://issues.apache.org/jira/browse/HIVE-7521 Project: Hive Issue Type: Bug Components: Physical Optimizer Reporter: Ted Yu Assignee: KangHS Priority: Minor Fix For: 0.14.0 Attachments: HIVE-7521.patch {code} ExprNodeConstantDesc c = (ExprNodeConstantDesc) condition; if (c.getValue() != Boolean.FALSE) { return null; } {code} equals() should be called instead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7521) Reference equality is used on Boolean in NullScanOptimizer#WhereFalseProcessor#process()
[ https://issues.apache.org/jira/browse/HIVE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-7521: --- Component/s: Physical Optimizer Reference equality is used on Boolean in NullScanOptimizer#WhereFalseProcessor#process() Key: HIVE-7521 URL: https://issues.apache.org/jira/browse/HIVE-7521 Project: Hive Issue Type: Bug Components: Physical Optimizer Reporter: Ted Yu Assignee: KangHS Priority: Minor Fix For: 0.14.0 Attachments: HIVE-7521.patch {code} ExprNodeConstantDesc c = (ExprNodeConstantDesc) condition; if (c.getValue() != Boolean.FALSE) { return null; } {code} equals() should be called instead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7521) Reference equality is used on Boolean in NullScanOptimizer#WhereFalseProcessor#process()
[ https://issues.apache.org/jira/browse/HIVE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-7521: --- Affects Version/s: 0.14.0 Reference equality is used on Boolean in NullScanOptimizer#WhereFalseProcessor#process() Key: HIVE-7521 URL: https://issues.apache.org/jira/browse/HIVE-7521 Project: Hive Issue Type: Bug Components: Physical Optimizer Affects Versions: 0.14.0 Reporter: Ted Yu Assignee: KangHS Priority: Minor Fix For: 0.14.0 Attachments: HIVE-7521.patch {code} ExprNodeConstantDesc c = (ExprNodeConstantDesc) condition; if (c.getValue() != Boolean.FALSE) { return null; } {code} equals() should be called instead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7521) Reference equality is used on Boolean in NullScanOptimizer#WhereFalseProcessor#process()
[ https://issues.apache.org/jira/browse/HIVE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14092382#comment-14092382 ] Ashutosh Chauhan commented on HIVE-7521: Committed to trunk. Thanks, Kang HS! Reference equality is used on Boolean in NullScanOptimizer#WhereFalseProcessor#process() Key: HIVE-7521 URL: https://issues.apache.org/jira/browse/HIVE-7521 Project: Hive Issue Type: Bug Components: Physical Optimizer Affects Versions: 0.14.0 Reporter: Ted Yu Assignee: KangHS Priority: Minor Fix For: 0.14.0 Attachments: HIVE-7521.patch {code} ExprNodeConstantDesc c = (ExprNodeConstantDesc) condition; if (c.getValue() != Boolean.FALSE) { return null; } {code} equals() should be called instead. -- This message was sent by Atlassian JIRA (v6.2#6252)
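The point of HIVE-7521 can be shown outside Hive with a small sketch. The class and method names below are hypothetical; only `Boolean.FALSE`, `==` and `equals()` come from the ticket. The deprecated `Boolean` constructor is used on purpose, because it is a simple way to obtain a `false` value that is not the cached `Boolean.FALSE` instance.

```java
// Hypothetical demo class; not part of Hive.
class BooleanEqualityDemo {

    // Mirrors the flagged pattern: identity comparison against Boolean.FALSE.
    static boolean isFalseByReference(Object value) {
        return value == Boolean.FALSE;
    }

    // The suggested alternative: value comparison. Calling equals() on the
    // constant also keeps the check safe for null and non-Boolean values.
    static boolean isFalseByEquals(Object value) {
        return Boolean.FALSE.equals(value);
    }

    // Returns a Boolean that holds false but is a different object from
    // Boolean.FALSE. The constructor is deprecated, hence the suppression.
    @SuppressWarnings({"deprecation", "removal"})
    static Object distinctFalse() {
        return new Boolean(false);
    }

    public static void main(String[] args) {
        Object value = distinctFalse();
        System.out.println("by reference: " + isFalseByReference(value));
        System.out.println("by equals:    " + isFalseByEquals(value));
    }
}
```

With a distinct instance the reference check does not recognize the value as false while the `equals()` check does, which is why a constant that was not produced by autoboxing or `Boolean.valueOf` could slip past the original condition.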
[jira] [Updated] (HIVE-7419) Missing break in SemanticAnalyzer#getTableDescFromSerDe()
[ https://issues.apache.org/jira/browse/HIVE-7419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-7419: --- Resolution: Fixed Fix Version/s: 0.14.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Navis! Missing break in SemanticAnalyzer#getTableDescFromSerDe() - Key: HIVE-7419 URL: https://issues.apache.org/jira/browse/HIVE-7419 Project: Hive Issue Type: Bug Reporter: Ted Yu Assignee: Navis Priority: Minor Fix For: 0.14.0 Attachments: HIVE-7419.1.patch.txt {code} case HiveParser.TOK_TABLEROWFORMATLINES: String lineDelim = unescapeSQLString(rowChild.getChild(0).getText()); tblDesc.getProperties().setProperty(serdeConstants.LINE_DELIM, lineDelim); if (!lineDelim.equals("\n") && !lineDelim.equals("10")) { throw new SemanticException(generateErrorMessage(rowChild, ErrorMsg.LINES_TERMINATED_BY_NON_NEWLINE.getMsg())); } case HiveParser.TOK_TABLEROWFORMATNULL: String nullFormat = unescapeSQLString(rowChild.getChild(0).getText()); {code} A break seems to be missing for the TOK_TABLEROWFORMATLINES case. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7419) Missing break in SemanticAnalyzer#getTableDescFromSerDe()
[ https://issues.apache.org/jira/browse/HIVE-7419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-7419: --- Assignee: Navis Missing break in SemanticAnalyzer#getTableDescFromSerDe() - Key: HIVE-7419 URL: https://issues.apache.org/jira/browse/HIVE-7419 Project: Hive Issue Type: Bug Reporter: Ted Yu Assignee: Navis Priority: Minor Fix For: 0.14.0 Attachments: HIVE-7419.1.patch.txt {code} case HiveParser.TOK_TABLEROWFORMATLINES: String lineDelim = unescapeSQLString(rowChild.getChild(0).getText()); tblDesc.getProperties().setProperty(serdeConstants.LINE_DELIM, lineDelim); if (!lineDelim.equals("\n") && !lineDelim.equals("10")) { throw new SemanticException(generateErrorMessage(rowChild, ErrorMsg.LINES_TERMINATED_BY_NON_NEWLINE.getMsg())); } case HiveParser.TOK_TABLEROWFORMATNULL: String nullFormat = unescapeSQLString(rowChild.getChild(0).getText()); {code} A break seems to be missing for the TOK_TABLEROWFORMATLINES case. -- This message was sent by Atlassian JIRA (v6.2#6252)
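The effect of the missing break in HIVE-7419 is ordinary Java switch fall-through. The sketch below reproduces the shape of the problem with two hypothetical token constants standing in for the parser tokens; none of the names are Hive's, and the handlers only record which branches ran.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of Hive.
class SwitchFallThroughDemo {

    // Stand-ins for the two parser token types named in the ticket.
    static final int ROW_FORMAT_LINES = 1;
    static final int ROW_FORMAT_NULL = 2;

    // Same shape as the reported code: the first case has no break,
    // so execution continues into the second case.
    static List<String> withoutBreak(int token) {
        List<String> handled = new ArrayList<>();
        switch (token) {
            case ROW_FORMAT_LINES:
                handled.add("lines");
                // no break here: falls through into the next case
            case ROW_FORMAT_NULL:
                handled.add("null");
                break;
            default:
                break;
        }
        return handled;
    }

    // Same switch with the break added after the first case.
    static List<String> withBreak(int token) {
        List<String> handled = new ArrayList<>();
        switch (token) {
            case ROW_FORMAT_LINES:
                handled.add("lines");
                break;
            case ROW_FORMAT_NULL:
                handled.add("null");
                break;
            default:
                break;
        }
        return handled;
    }

    public static void main(String[] args) {
        System.out.println(withoutBreak(ROW_FORMAT_LINES));
        System.out.println(withBreak(ROW_FORMAT_LINES));
    }
}
```

In the reported code the fall-through would mean that the LINES TERMINATED BY text is also processed by the null-format branch, which is presumably not intended.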
Re: Review Request 24467: HIVE-7373: Hive should not remove trailing zeros for decimal numbers
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24467/ --- (Updated Aug. 11, 2014, 3:23 a.m.) Review request for hive. Changes --- Fix additional queries tests. Bugs: HIVE-7373 https://issues.apache.org/jira/browse/HIVE-7373 Repository: hive-git Description --- Removes trim() call from HiveDecimal normalize/enforcePrecisionScale methods. This change affects the Decimal128 getHiveDecimalString() method; so a new 'actualScale' variable is used that stores the actual scale of a value passed to Decimal128. The rest of the changes are added to fix decimal query tests to match the new HiveDecimal value. Diffs (updated) - common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java d4cc32d common/src/java/org/apache/hadoop/hive/common/type/HiveDecimal.java ad09015 common/src/test/org/apache/hadoop/hive/common/type/TestDecimal128.java 46236a5 common/src/test/org/apache/hadoop/hive/common/type/TestHiveDecimal.java 1384a45 data/files/kv10.txt PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java f5023bb ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorTypeCasts.java 2a871c5 ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestSearchArgumentImpl.java b1524f7 ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFOPDivide.java 4c5b3a5 ql/src/test/queries/clientpositive/decimal_trailing.q PRE-CREATION ql/src/test/queries/clientpositive/literal_decimal.q 08b21dc ql/src/test/results/clientpositive/avro_decimal.q.out 1868de3 ql/src/test/results/clientpositive/avro_decimal_native.q.out bc87a7d ql/src/test/results/clientpositive/char_pad_convert.q.out 1f81426 ql/src/test/results/clientpositive/compute_stats_decimal.q.out 2a65efe ql/src/test/results/clientpositive/decimal_2.q.out 794bad0 ql/src/test/results/clientpositive/decimal_3.q.out 524fa62 ql/src/test/results/clientpositive/decimal_4.q.out 7444e83 ql/src/test/results/clientpositive/decimal_5.q.out 52dae22 
ql/src/test/results/clientpositive/decimal_6.q.out 4338b52 ql/src/test/results/clientpositive/decimal_precision.q.out ea08b73 ql/src/test/results/clientpositive/decimal_trailing.q.out PRE-CREATION ql/src/test/results/clientpositive/decimal_udf.q.out 02a0caa ql/src/test/results/clientpositive/literal_decimal.q.out 2f2df6a ql/src/test/results/clientpositive/orc_predicate_pushdown.q.out 890cb2c ql/src/test/results/clientpositive/parquet_decimal.q.out b2d542f ql/src/test/results/clientpositive/parquet_decimal1.q.out 9ff0950 ql/src/test/results/clientpositive/serde_regex.q.out e231a09 ql/src/test/results/clientpositive/tez/vector_data_types.q.out 4954825 ql/src/test/results/clientpositive/tez/vector_decimal_aggregate.q.out 437e830 ql/src/test/results/clientpositive/udf_case.q.out 6c186bd ql/src/test/results/clientpositive/udf_when.q.out cbb1210 ql/src/test/results/clientpositive/vector_between_in.q.out 78e340b ql/src/test/results/clientpositive/vector_data_types.q.out 007f4e8 ql/src/test/results/clientpositive/vector_decimal_aggregate.q.out 2c4d552 ql/src/test/results/clientpositive/vector_decimal_cast.q.out a508732 ql/src/test/results/clientpositive/vector_decimal_expressions.q.out 094eb8e ql/src/test/results/clientpositive/vector_decimal_mapjoin.q.out 71a3def ql/src/test/results/clientpositive/vector_decimal_math_funcs.q.out 717e81a ql/src/test/results/clientpositive/windowing_decimal.q.out 88d11af ql/src/test/results/clientpositive/windowing_navfn.q.out 95d7942 ql/src/test/results/clientpositive/windowing_rank.q.out 9976fdb serde/src/java/org/apache/hadoop/hive/serde2/binarysortable/BinarySortableSerDe.java 523ad7d Diff: https://reviews.apache.org/r/24467/diff/ Testing --- Thanks, Sergio Pena
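To make the behavior change in this review request concrete, the sketch below uses plain `java.math.BigDecimal` rather than `HiveDecimal`. It assumes that the removed trim() step behaves like `BigDecimal.stripTrailingZeros()`, which this message does not confirm; the class and method names are hypothetical.

```java
import java.math.BigDecimal;

// Hypothetical demo class; uses java.math.BigDecimal, not HiveDecimal.
class TrailingZerosDemo {

    // Assumed stand-in for the old behavior: trailing zeros are stripped.
    static String trimmed(String literal) {
        return new BigDecimal(literal).stripTrailingZeros().toPlainString();
    }

    // Stand-in for the new behavior: the scale of the input is kept.
    static String preserved(String literal) {
        return new BigDecimal(literal).toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(trimmed("3.140"));
        System.out.println(preserved("3.140"));
        System.out.println(new BigDecimal("3.140").scale());
    }
}
```

Keeping the original scale is also why the description mentions an 'actualScale' variable for Decimal128: once zeros are no longer stripped, the scale the value arrived with has to be remembered somewhere.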
[jira] [Created] (HIVE-7672) Potential resource leak in EximUtil#createExportDump()
Ted Yu created HIVE-7672: Summary: Potential resource leak in EximUtil#createExportDump() Key: HIVE-7672 URL: https://issues.apache.org/jira/browse/HIVE-7672 Project: Hive Issue Type: Bug Reporter: Ted Yu Priority: Minor Here is the related code: {code} OutputStream out = fs.create(metadataPath); out.write(jsonContainer.toString().getBytes("UTF-8")); out.close(); {code} If out.write() throws an exception, out would be left unclosed. out.close() should be enclosed in a finally block. -- This message was sent by Atlassian JIRA (v6.2#6252)
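The leak described in HIVE-7672 and the proposed finally block can be sketched without Hadoop. Everything below is hypothetical scaffolding: `FailingStream` simulates a stream whose write fails and records whether close() was reached, and the two write methods contrast the reported pattern with the suggested one. It is not the EximUtil code and not a committed fix.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Hive.
class CloseInFinallyDemo {

    // Simulated stream: every write fails, and close() is recorded.
    static class FailingStream extends OutputStream {
        boolean closed = false;

        @Override
        public void write(int b) throws IOException {
            throw new IOException("simulated write failure");
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    // Shape of the reported code: close() is skipped when write() throws.
    static void writeUnsafely(OutputStream out, String payload) throws IOException {
        out.write(payload.getBytes(StandardCharsets.UTF_8));
        out.close();
    }

    // Shape of the suggested fix: close() runs whether or not write() throws.
    static void writeSafely(OutputStream out, String payload) throws IOException {
        try {
            out.write(payload.getBytes(StandardCharsets.UTF_8));
        } finally {
            out.close();
        }
    }

    // Runs one of the two variants against a failing stream and reports
    // whether the stream ended up closed.
    static boolean closedAfter(boolean safe) {
        FailingStream stream = new FailingStream();
        try {
            if (safe) {
                writeSafely(stream, "{}");
            } else {
                writeUnsafely(stream, "{}");
            }
        } catch (IOException expected) {
            // the simulated failure is expected in both variants
        }
        return stream.closed;
    }

    public static void main(String[] args) {
        System.out.println("closed without finally: " + closedAfter(false));
        System.out.println("closed with finally:    " + closedAfter(true));
    }
}
```

On Java 7 and later a try-with-resources statement would give the same guarantee more compactly; the explicit finally is shown because that is what the ticket proposes.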
Re: Review Request 23674: Handle db qualified names consistently across all HiveQL statements
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23674/#review50132 --- ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java https://reviews.apache.org/r/23674/#comment87707 "which in the form" should be "which is in the form" ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java https://reviews.apache.org/r/23674/#comment87708 "which in the form" should be "which is in the form" ql/src/java/org/apache/hadoop/hive/ql/optimizer/IndexUtils.java https://reviews.apache.org/r/23674/#comment87715 should "does contain database name" be "does not contain database name"? - Lefty Leverenz On Aug. 11, 2014, 12:53 a.m., Navis Ryu wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23674/ --- (Updated Aug. 11, 2014, 12:53 a.m.) Review request for hive and Thejas Nair. Bugs: HIVE-4064 https://issues.apache.org/jira/browse/HIVE-4064 Repository: hive-git Description --- Hive doesn't consistently handle db qualified names across all HiveQL statements. While some HiveQL statements such as SELECT support DB qualified names, other such as CREATE INDEX doesn't. 
Diffs - itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/authorization/plugin/TestHiveAuthorizerCheckInvocation.java c91b15c itests/util/src/main/java/org/apache/hadoop/hive/ql/hooks/CheckColumnAccessHook.java 14fc430 metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java ea866c5 metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 6e689d0 metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 5a56ced metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 760777a metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java 74b1432 ql/src/java/org/apache/hadoop/hive/ql/Driver.java ea6ddbf ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 376e040 ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java d22b1f6 ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 39b032e ql/src/java/org/apache/hadoop/hive/ql/optimizer/IndexUtils.java 2e32fee ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteGBUsingIndex.java 989d0b5 ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 22945e3 ql/src/java/org/apache/hadoop/hive/ql/parse/ColumnAccessInfo.java 939dc65 ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 67a3aa7 ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g ab1188a ql/src/java/org/apache/hadoop/hive/ql/parse/IndexUpdater.java 856ec2f ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 7b86414 ql/src/java/org/apache/hadoop/hive/ql/parse/authorization/HiveAuthorizationTaskFactoryImpl.java 826bdf3 ql/src/java/org/apache/hadoop/hive/ql/plan/AlterIndexDesc.java 0318e4b ql/src/java/org/apache/hadoop/hive/ql/plan/AlterTableAlterPartDesc.java cf67e16 ql/src/java/org/apache/hadoop/hive/ql/plan/AlterTableSimpleDesc.java 541675c ql/src/java/org/apache/hadoop/hive/ql/plan/PrivilegeObjectDesc.java 9417220 ql/src/java/org/apache/hadoop/hive/ql/plan/RenamePartitionDesc.java 1b5fb9e 
ql/src/java/org/apache/hadoop/hive/ql/plan/ShowColumnsDesc.java fe6a91e ql/src/java/org/apache/hadoop/hive/ql/plan/ShowGrantDesc.java aa88153 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/AuthorizationUtils.java 5c94217 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HivePrivilegeObject.java 9e9ef71 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveV1Authorizer.java fbc0090 ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java 98c2924 ql/src/test/org/apache/hadoop/hive/ql/parse/TestQBCompact.java 5f32d5f ql/src/test/org/apache/hadoop/hive/ql/parse/authorization/PrivilegesTestBase.java 93901ec ql/src/test/org/apache/hadoop/hive/ql/parse/authorization/TestHiveAuthorizationTaskFactory.java ab0d80e ql/src/test/org/apache/hadoop/hive/ql/parse/authorization/TestPrivilegesV1.java fd827ad ql/src/test/org/apache/hadoop/hive/ql/parse/authorization/TestPrivilegesV2.java 9499986 ql/src/test/queries/clientpositive/alter_rename_table.q PRE-CREATION ql/src/test/results/clientnegative/alter_concatenate_indexed_table.q.out 500d45d ql/src/test/results/clientnegative/alter_view_failure6.q.out cfbaca8 ql/src/test/results/clientnegative/merge_negative_1.q.out 95f6678 ql/src/test/results/clientnegative/merge_negative_2.q.out b3422e1 ql/src/test/results/clientnegative/show_columns3.q.out
[jira] [Commented] (HIVE-4064) Handle db qualified names consistently across all HiveQL statements
[ https://issues.apache.org/jira/browse/HIVE-4064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14092412#comment-14092412 ] Hive QA commented on HIVE-4064: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12660896/HIVE-4064.7.patch.txt {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 5875 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input3 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/246/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/246/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-246/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12660896 Handle db qualified names consistently across all HiveQL statements --- Key: HIVE-4064 URL: https://issues.apache.org/jira/browse/HIVE-4064 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 0.10.0 Reporter: Shreepadma Venugopalan Assignee: Navis Attachments: HIVE-4064-1.patch, HIVE-4064.1.patch.txt, HIVE-4064.2.patch.txt, HIVE-4064.3.patch.txt, HIVE-4064.4.patch.txt, HIVE-4064.5.patch.txt, HIVE-4064.6.patch.txt, HIVE-4064.7.patch.txt Hive doesn't consistently handle db qualified names across all HiveQL statements. While some HiveQL statements such as SELECT support DB qualified names, other such as CREATE INDEX doesn't. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7592) List Jars or Files are not supported by Beeline
[ https://issues.apache.org/jira/browse/HIVE-7592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-7592: - Labels: TODOC14 (was: ) List Jars or Files are not supported by Beeline --- Key: HIVE-7592 URL: https://issues.apache.org/jira/browse/HIVE-7592 Project: Hive Issue Type: Bug Components: CLI Reporter: Ferdinand Xu Assignee: Navis Labels: TODOC14 Fix For: 0.14.0 Attachments: HIVE-7592.1.patch.txt Though adding jars or files is supported by Beeline, listing jars or files is still not supported. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7673) database missing from output WriteEntity for create-table-as-select
Thejas M Nair created HIVE-7673: --- Summary: database missing from output WriteEntity for create-table-as-select Key: HIVE-7673 URL: https://issues.apache.org/jira/browse/HIVE-7673 Project: Hive Issue Type: Bug Components: Authorization, SQLStandardAuthorization Reporter: Thejas M Nair Assignee: Thejas M Nair In case of create-table-as-select query, the database the table belongs to is not among the objects to be authorized. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 23674: Handle db qualified names consistently across all HiveQL statements
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23674/#review50141 --- Ship it! Ship It! - Thejas Nair On Aug. 11, 2014, 12:53 a.m., Navis Ryu wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23674/ --- (Updated Aug. 11, 2014, 12:53 a.m.) Review request for hive and Thejas Nair. Bugs: HIVE-4064 https://issues.apache.org/jira/browse/HIVE-4064 Repository: hive-git Description --- Hive doesn't consistently handle db qualified names across all HiveQL statements. While some HiveQL statements such as SELECT support DB qualified names, other such as CREATE INDEX doesn't. Diffs - itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/authorization/plugin/TestHiveAuthorizerCheckInvocation.java c91b15c itests/util/src/main/java/org/apache/hadoop/hive/ql/hooks/CheckColumnAccessHook.java 14fc430 metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java ea866c5 metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 6e689d0 metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 5a56ced metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 760777a metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java 74b1432 ql/src/java/org/apache/hadoop/hive/ql/Driver.java ea6ddbf ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 376e040 ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java d22b1f6 ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 39b032e ql/src/java/org/apache/hadoop/hive/ql/optimizer/IndexUtils.java 2e32fee ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteGBUsingIndex.java 989d0b5 ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 22945e3 ql/src/java/org/apache/hadoop/hive/ql/parse/ColumnAccessInfo.java 939dc65 ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 67a3aa7 ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g 
ab1188a ql/src/java/org/apache/hadoop/hive/ql/parse/IndexUpdater.java 856ec2f ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 7b86414 ql/src/java/org/apache/hadoop/hive/ql/parse/authorization/HiveAuthorizationTaskFactoryImpl.java 826bdf3 ql/src/java/org/apache/hadoop/hive/ql/plan/AlterIndexDesc.java 0318e4b ql/src/java/org/apache/hadoop/hive/ql/plan/AlterTableAlterPartDesc.java cf67e16 ql/src/java/org/apache/hadoop/hive/ql/plan/AlterTableSimpleDesc.java 541675c ql/src/java/org/apache/hadoop/hive/ql/plan/PrivilegeObjectDesc.java 9417220 ql/src/java/org/apache/hadoop/hive/ql/plan/RenamePartitionDesc.java 1b5fb9e ql/src/java/org/apache/hadoop/hive/ql/plan/ShowColumnsDesc.java fe6a91e ql/src/java/org/apache/hadoop/hive/ql/plan/ShowGrantDesc.java aa88153 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/AuthorizationUtils.java 5c94217 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HivePrivilegeObject.java 9e9ef71 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveV1Authorizer.java fbc0090 ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java 98c2924 ql/src/test/org/apache/hadoop/hive/ql/parse/TestQBCompact.java 5f32d5f ql/src/test/org/apache/hadoop/hive/ql/parse/authorization/PrivilegesTestBase.java 93901ec ql/src/test/org/apache/hadoop/hive/ql/parse/authorization/TestHiveAuthorizationTaskFactory.java ab0d80e ql/src/test/org/apache/hadoop/hive/ql/parse/authorization/TestPrivilegesV1.java fd827ad ql/src/test/org/apache/hadoop/hive/ql/parse/authorization/TestPrivilegesV2.java 9499986 ql/src/test/queries/clientpositive/alter_rename_table.q PRE-CREATION ql/src/test/results/clientnegative/alter_concatenate_indexed_table.q.out 500d45d ql/src/test/results/clientnegative/alter_view_failure6.q.out cfbaca8 ql/src/test/results/clientnegative/merge_negative_1.q.out 95f6678 ql/src/test/results/clientnegative/merge_negative_2.q.out b3422e1 ql/src/test/results/clientnegative/show_columns3.q.out 09068b7 
ql/src/test/results/clientnegative/show_tableproperties1.q.out ca54088 ql/src/test/results/clientnegative/temp_table_index.q.out 8ec5c0a ql/src/test/results/clientpositive/alter_rename_table.q.out PRE-CREATION ql/src/test/results/clientpositive/drop_multi_partitions.q.out eae57f3 ql/src/test/results/clientpositive/input3.q.out 547449c ql/src/test/results/clientpositive/insert2_overwrite_partitions.q.out 21bd257
[jira] [Commented] (HIVE-4064) Handle db qualified names consistently across all HiveQL statements
[ https://issues.apache.org/jira/browse/HIVE-4064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14092429#comment-14092429 ] Thejas M Nair commented on HIVE-4064: - +1 Handle db qualified names consistently across all HiveQL statements --- Key: HIVE-4064 URL: https://issues.apache.org/jira/browse/HIVE-4064 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 0.10.0 Reporter: Shreepadma Venugopalan Assignee: Navis Attachments: HIVE-4064-1.patch, HIVE-4064.1.patch.txt, HIVE-4064.2.patch.txt, HIVE-4064.3.patch.txt, HIVE-4064.4.patch.txt, HIVE-4064.5.patch.txt, HIVE-4064.6.patch.txt, HIVE-4064.7.patch.txt Hive doesn't consistently handle db qualified names across all HiveQL statements. While some HiveQL statements such as SELECT support DB qualified names, other such as CREATE INDEX doesn't. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7673) database missing from output WriteEntity for create-table-as-select
[ https://issues.apache.org/jira/browse/HIVE-7673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-7673: Attachment: HIVE-7673.1.patch database missing from output WriteEntity for create-table-as-select --- Key: HIVE-7673 URL: https://issues.apache.org/jira/browse/HIVE-7673 Project: Hive Issue Type: Bug Components: Authorization, SQLStandardAuthorization Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-7673.1.patch In case of create-table-as-select query, the database the table belongs to is not among the objects to be authorized. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7592) List Jars or Files are not supported by Beeline
[ https://issues.apache.org/jira/browse/HIVE-7592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14092432#comment-14092432 ] Lefty Leverenz commented on HIVE-7592: -- Doc note: This should be described in the Beeline section of the HiveServer2 Clients wikidoc before 0.14.0 is released, with release information and a link back to this JIRA ticket. But I don't see any existing documentation for adding jars or files with Beeline, so that should be documented too. Examples would help. Also, the new list value for configuration parameter *hive.security.command.whitelist* should be added to the parameter description in Configuration Properties (again, with release information). * [HiveServer2 Clients -- Beeline | https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-Beeline–NewCommandLineShell] * [Configuration Properties -- hive.security.command.whitelist | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.security.command.whitelist] List Jars or Files are not supported by Beeline --- Key: HIVE-7592 URL: https://issues.apache.org/jira/browse/HIVE-7592 Project: Hive Issue Type: Bug Components: CLI Reporter: Ferdinand Xu Assignee: Navis Labels: TODOC14 Fix For: 0.14.0 Attachments: HIVE-7592.1.patch.txt Though adding jars or files is supported by Beeline, listing jars or files is still not supported. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 24445: HIVE-7642, Set hive input format by configuration.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24445/#review50144 --- Hi, Thank you very much for taking this on! I have one question below. ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java https://reviews.apache.org/r/24445/#comment87742 If the configuration is incorrect, perhaps we should throw an error? Meaning re-throw the exception? - Brock Noland On Aug. 7, 2014, 7:30 a.m., chengxiang li wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24445/ --- (Updated Aug. 7, 2014, 7:30 a.m.) Review request for hive, Brock Noland and Szehon Ho. Bugs: HIVE-7642 https://issues.apache.org/jira/browse/HIVE-7642 Repository: hive-git Description --- Currently the Hive input format is hard-coded as HiveInputFormat; we should set this parameter from the configuration. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 45eff67 Diff: https://reviews.apache.org/r/24445/diff/ Testing --- Thanks, chengxiang li
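One possible reading of the review question above, as a minimal sketch and not the patch's actual code: look up the input format class name in the configuration, fall back to a default when nothing is set, and re-throw with context when the configured class cannot be loaded. The key name `example.input.format`, the `Map`-based configuration, and the class `InputFormatResolver` are hypothetical stand-ins; the real change lives in SparkPlanGenerator and works with Hive's own configuration objects.

```java
import java.util.HashMap;
import java.util.Map;

public class InputFormatResolver {
    // Hypothetical key for illustration only; not a real Hive property name.
    static final String KEY = "example.input.format";

    // Returns the configured class, or defaultFormat when the key is absent.
    // A misconfigured class name is re-thrown instead of silently ignored.
    static Class<?> resolve(Map<String, String> conf, Class<?> defaultFormat) {
        String name = conf.get(KEY);
        if (name == null) {
            return defaultFormat;
        }
        try {
            return Class.forName(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException(
                "Cannot load input format class '" + name + "' configured under " + KEY, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(resolve(conf, Object.class).getName());

        conf.put(KEY, "java.util.ArrayList");
        System.out.println(resolve(conf, Object.class).getName());

        conf.put(KEY, "no.such.InputFormat");
        try {
            resolve(conf, Object.class);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing fast here means a typo in the configuration surfaces at plan generation time rather than as a confusing fallback to the default format.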
Re: Review Request 24377: HIVE-7142 Hive multi serialization encoding support
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24377/#review50145 --- Ultimately it'd be nice to make Hive's internal codec configurable. That's an enormous project so I think a solution like this one is a useful bridge. serde/src/java/org/apache/hadoop/hive/serde2/AbstractEncodingAwareSerDe.java https://reviews.apache.org/r/24377/#comment87743 Can we make these constants? serialization.encoding is probably already available somewhere. - Brock Noland On Aug. 6, 2014, 9:11 a.m., chengxiang li wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24377/ --- (Updated Aug. 6, 2014, 9:11 a.m.) Review request for hive. Bugs: HIVE-7142 https://issues.apache.org/jira/browse/HIVE-7142 Repository: hive-git Description --- Currently Hive can only serialize data into UTF-8 bytes and deserialize from UTF-8 bytes, but real-world users may want to load data in other encodings into Hive directly. This JIRA adds support for serializing and deserializing data in other encodings at the SerDe layer. Users only need to configure the serialization encoding at the table level through a SerDe parameter, for example: CREATE TABLE person(id INT, name STRING, desc STRING) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' WITH SERDEPROPERTIES('serialization.encoding'='GBK'); or ALTER TABLE person SET SERDEPROPERTIES ('serialization.encoding'='GBK'); LIMITATIONS: Only LazySimpleSerDe supports the serialization.encoding property in this patch. Diffs - serde/src/java/org/apache/hadoop/hive/serde2/AbstractEncodingAwareSerDe.java PRE-CREATION serde/src/java/org/apache/hadoop/hive/serde2/DelimitedJSONSerDe.java 179f9b5 serde/src/java/org/apache/hadoop/hive/serde2/SerDeUtils.java b7fb048 serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazySimpleSerDe.java fb55c70 Diff: https://reviews.apache.org/r/24377/diff/ Testing --- Thanks, chengxiang li
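To illustrate the idea in the description above, here is a rough sketch of encoding-aware conversion driven by a table property, with the property name held in a constant as the reviewer suggests. It is not the code in AbstractEncodingAwareSerDe; the class `EncodingAwareDecoder` and the use of `java.util.Properties` for table properties are assumptions made for the example. Only the property name `serialization.encoding` comes from the review request.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class EncodingAwareDecoder {
    // Property name as given in the review description, kept in one constant.
    static final String SERIALIZATION_ENCODING = "serialization.encoding";

    // Falls back to UTF-8 when the table does not set an encoding.
    // Charset.forName throws an unchecked exception for unknown names.
    static Charset charsetFor(Properties tableProps) {
        String name = tableProps.getProperty(SERIALIZATION_ENCODING, StandardCharsets.UTF_8.name());
        return Charset.forName(name);
    }

    // Deserialization direction: stored bytes to an in-memory string.
    static String decode(byte[] raw, Properties tableProps) {
        return new String(raw, charsetFor(tableProps));
    }

    // Serialization direction: in-memory string to stored bytes.
    static byte[] encode(String value, Properties tableProps) {
        return value.getBytes(charsetFor(tableProps));
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(SERIALIZATION_ENCODING, "ISO-8859-1");
        byte[] stored = encode("caf\u00e9", props);
        System.out.println(stored.length + " bytes, decoded: " + decode(stored, props));
    }
}
```

The example uses ISO-8859-1 because every Java runtime is required to support it; whether GBK is available depends on the charsets shipped with the runtime.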
[jira] [Updated] (HIVE-7540) NotSerializableException encountered when using sortByKey transformation
[ https://issues.apache.org/jira/browse/HIVE-7540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-7540: --- Attachment: HIVE-7540.3-spark.patch The following patch adds a repository which I control. I can push spark 1.1-SNAPSHOT whenever we require. NotSerializableException encountered when using sortByKey transformation Key: HIVE-7540 URL: https://issues.apache.org/jira/browse/HIVE-7540 Project: Hive Issue Type: Bug Components: Spark Environment: Spark-1.0.1 Reporter: Rui Li Assignee: Rui Li Attachments: HIVE-7540-spark.patch, HIVE-7540.2-spark.patch, HIVE-7540.3-spark.patch This exception is thrown when sortByKey is used as the shuffle transformation between MapWork and ReduceWork: {quote} org.apache.spark.SparkException: Job aborted due to stage failure: Task not serializable: java.io.NotSerializableException: org.apache.hadoop.io.BytesWritable at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1049) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1033) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1031) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1031) at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitMissingTasks(DAGScheduler.scala:772) at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:715) at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$submitStage$4.apply(DAGScheduler.scala:719) at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$submitStage$4.apply(DAGScheduler.scala:718) at 
scala.collection.immutable.List.foreach(List.scala:318) at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:718) at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:699) … {quote} The root cause is that the RangePartitioner used by sortByKey contains rangeBounds: Array[BytesWritable], which is considered not serializable in spark. A workaround to this issue is to set the number of partitions to 1 when calling sortByKey, in which case the rangeBounds will be just an empty array. NO PRECOMMIT TESTS. This is for spark branch only. -- This message was sent by Atlassian JIRA (v6.2#6252)
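The root cause and workaround described above can be reproduced in miniature with plain Java serialization. This is only an analogy, assuming Java serialization semantics: `Key` stands in for a non-serializable type such as BytesWritable, and `Partitioner` for an object that carries a `rangeBounds` array. Neither is Spark's RangePartitioner, and all names here are made up for the sketch.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class RangeBoundsDemo {
    // Stand-in for a key type that does not implement Serializable.
    static class Key {
        final byte[] bytes;
        Key(byte[] bytes) { this.bytes = bytes; }
    }

    // Stand-in for a serializable partitioner that holds its bounds in a field.
    static class Partitioner implements Serializable {
        private static final long serialVersionUID = 1L;
        final Key[] rangeBounds;
        Partitioner(Key[] rangeBounds) { this.rangeBounds = rangeBounds; }
    }

    // Returns false when Java serialization rejects the object graph.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Partitioner withBounds = new Partitioner(new Key[] { new Key(new byte[] {1}) });
        Partitioner emptyBounds = new Partitioner(new Key[0]);
        System.out.println("with bounds:  " + canSerialize(withBounds));
        System.out.println("empty bounds: " + canSerialize(emptyBounds));
    }
}
```

Array types are themselves serializable, so the failure only appears once the array holds an element of the non-serializable type. That is why an empty bounds array sidesteps the exception.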
[jira] [Resolved] (HIVE-7540) NotSerializableException encountered when using sortByKey transformation
[ https://issues.apache.org/jira/browse/HIVE-7540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland resolved HIVE-7540. Resolution: Fixed Fix Version/s: spark-branch Thank you very much for your contribution! I have committed the patch to trunk. NotSerializableException encountered when using sortByKey transformation Key: HIVE-7540 URL: https://issues.apache.org/jira/browse/HIVE-7540 Project: Hive Issue Type: Bug Components: Spark Environment: Spark-1.0.1 Reporter: Rui Li Assignee: Rui Li Fix For: spark-branch Attachments: HIVE-7540-spark.patch, HIVE-7540.2-spark.patch, HIVE-7540.3-spark.patch This exception is thrown when sortByKey is used as the shuffle transformation between MapWork and ReduceWork: {quote} org.apache.spark.SparkException: Job aborted due to stage failure: Task not serializable: java.io.NotSerializableException: org.apache.hadoop.io.BytesWritable at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1049) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1033) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1031) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1031) at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitMissingTasks(DAGScheduler.scala:772) at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:715) at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$submitStage$4.apply(DAGScheduler.scala:719) at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$submitStage$4.apply(DAGScheduler.scala:718) at 
scala.collection.immutable.List.foreach(List.scala:318) at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:718) at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:699) … {quote} The root cause is that the RangePartitioner used by sortByKey contains rangeBounds: Array[BytesWritable], which is considered not serializable in spark. A workaround to this issue is to set the number of partitions to 1 when calling sortByKey, in which case the rangeBounds will be just an empty array. NO PRECOMMIT TESTS. This is for spark branch only. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7674) Update to Spark 1.1
[ https://issues.apache.org/jira/browse/HIVE-7674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-7674: --- Issue Type: Sub-task (was: Task) Parent: HIVE-7292 Update to Spark 1.1 --- Key: HIVE-7674 URL: https://issues.apache.org/jira/browse/HIVE-7674 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Brock Noland In HIVE-7540 we added a custom repo to use Spark 1.1. Once 1.1 is released we need to remove this repo. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7674) Update to Spark 1.1
Brock Noland created HIVE-7674: -- Summary: Update to Spark 1.1 Key: HIVE-7674 URL: https://issues.apache.org/jira/browse/HIVE-7674 Project: Hive Issue Type: Task Components: Spark Reporter: Brock Noland In HIVE-7540 we added a custom repo to use Spark 1.1. Once 1.1 is released we need to remove this repo. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7674) Update to Spark 1.1
[ https://issues.apache.org/jira/browse/HIVE-7674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-7674: --- Priority: Blocker (was: Major) Update to Spark 1.1 --- Key: HIVE-7674 URL: https://issues.apache.org/jira/browse/HIVE-7674 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Brock Noland Priority: Blocker In HIVE-7540 we added a custom repo to use Spark 1.1. Once 1.1 is released we need to remove this repo. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7653) Hive AvroSerDe does not support circular references in Schema
[ https://issues.apache.org/jira/browse/HIVE-7653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sachin Goyal updated HIVE-7653: --- Attachment: HIVE-7653.2.patch Attaching patch file created using {code} git diff > HIVE-7653.2.patch {code} Hive AvroSerDe does not support circular references in Schema - Key: HIVE-7653 URL: https://issues.apache.org/jira/browse/HIVE-7653 Project: Hive Issue Type: Bug Affects Versions: 0.13.1 Reporter: Sachin Goyal Attachments: HIVE-7653.1.patch, HIVE-7653.2.patch Avro allows nullable circular references but Hive AvroSerDe does not. Example of circular references (passing in Avro but failing in AvroSerDe): {code} class AvroCycleParent { AvroCycleChild child; public AvroCycleChild getChild () {return child;} public void setChild (AvroCycleChild child) {this.child = child;} } class AvroCycleChild { AvroCycleParent parent; public AvroCycleParent getParent () {return parent;} public void setParent (AvroCycleParent parent) {this.parent = parent;} } {code} Due to this discrepancy, Hive is unable to read Avro records that have circular references. For some third-party code with such references, it becomes very hard to serialize it directly with Avro and use it in Hive. I have a patch for this with a unit test and I will submit it shortly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6847) Improve / fix bugs in Hive scratch dir setup
[ https://issues.apache.org/jira/browse/HIVE-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092455#comment-14092455 ] Hive QA commented on HIVE-6847: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12660898/HIVE-6847.2.patch {color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 5883 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization org.apache.hadoop.hive.ql.parse.authorization.TestSessionUserName.testSessionConstructorUser org.apache.hive.jdbc.TestJdbcWithMiniMr.org.apache.hive.jdbc.TestJdbcWithMiniMr org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testAllowedCommands org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testAuthorization1 org.apache.hive.service.cli.TestScratchDir.testLocalScratchDirs org.apache.hive.service.cli.TestScratchDir.testResourceDirs org.apache.hive.service.cli.TestScratchDir.testScratchDirs {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/247/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/247/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-247/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 10 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12660898 Improve / fix bugs in Hive scratch dir setup Key: HIVE-6847 URL: https://issues.apache.org/jira/browse/HIVE-6847 Project: Hive Issue Type: Bug Components: CLI, HiveServer2 Affects Versions: 0.14.0 Reporter: Vikram Dixit K Assignee: Vaibhav Gumashta Fix For: 0.14.0 Attachments: HIVE-6847.1.patch, HIVE-6847.2.patch Currently, the Hive server creates the scratch directory and changes its permissions to 777; however, this is a poor choice with respect to security. We need to create user-specific scratch directories instead. Also refer to the first iteration of the HIVE-6782 patch for the approach. -- This message was sent by Atlassian JIRA (v6.2#6252)
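As a rough local-filesystem analogy for the user-specific scratch directories proposed above: each user gets a subdirectory under a shared root, readable and writable only by its owner, instead of one world-writable (777) directory. This sketch uses java.nio on a POSIX filesystem and is not how the HIVE-6847 patch is implemented; the class `UserScratchDir`, its method names, and the `rwx------` mode are assumptions chosen for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.PosixFilePermissions;

public class UserScratchDir {
    // Creates <root>/<user> and restricts it to its owner.
    // Requires a POSIX filesystem; elsewhere setPosixFilePermissions
    // throws UnsupportedOperationException.
    static Path create(Path root, String user) {
        Path dir = root.resolve(user);
        try {
            Files.createDirectories(dir);
            // Set the mode explicitly so an already existing directory is tightened too.
            Files.setPosixFilePermissions(dir, PosixFilePermissions.fromString("rwx------"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return dir;
    }

    // Returns the directory mode in "rwxr-x---" notation.
    static String permissionsOf(Path dir) {
        try {
            return PosixFilePermissions.toString(Files.getPosixFilePermissions(dir));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path root = Paths.get(System.getProperty("java.io.tmpdir"), "scratch-demo");
        Path dir = create(root, System.getProperty("user.name"));
        System.out.println(dir + " " + permissionsOf(dir));
    }
}
```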
[jira] [Commented] (HIVE-7521) Reference equality is used on Boolean in NullScanOptimizer#WhereFalseProcessor#process()
[ https://issues.apache.org/jira/browse/HIVE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092468#comment-14092468 ] KangHS commented on HIVE-7521: -- Thanks, Ashutosh Chauhan :) Reference equality is used on Boolean in NullScanOptimizer#WhereFalseProcessor#process() Key: HIVE-7521 URL: https://issues.apache.org/jira/browse/HIVE-7521 Project: Hive Issue Type: Bug Components: Physical Optimizer Affects Versions: 0.14.0 Reporter: Ted Yu Assignee: KangHS Priority: Minor Fix For: 0.14.0 Attachments: HIVE-7521.patch {code} ExprNodeConstantDesc c = (ExprNodeConstantDesc) condition; if (c.getValue() != Boolean.FALSE) { return null; } {code} equals() should be called instead. -- This message was sent by Atlassian JIRA (v6.2#6252)
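The difference between the two comparisons in the issue above can be shown in isolation. This is a standalone sketch, not the NullScanOptimizer code: `BooleanEqualityDemo` and its helper methods are made up for the example, and the deprecated `new Boolean(false)` constructor is used only to obtain a Boolean instance that is distinct from `Boolean.FALSE`.

```java
public class BooleanEqualityDemo {
    // Mirrors the style of check quoted in the issue: compares object identity.
    static boolean isFalseByReference(Object value) {
        return value == Boolean.FALSE;
    }

    // Value comparison; also safe when value is null.
    static boolean isFalseByEquals(Object value) {
        return Boolean.FALSE.equals(value);
    }

    @SuppressWarnings({"deprecation", "removal"})
    public static void main(String[] args) {
        Object canonical = Boolean.valueOf(false); // the shared Boolean.FALSE instance
        Object fresh = new Boolean(false);         // a separate object with the same value

        System.out.println("canonical by reference: " + isFalseByReference(canonical));
        System.out.println("fresh by reference:     " + isFalseByReference(fresh));
        System.out.println("fresh by equals:        " + isFalseByEquals(fresh));
        System.out.println("null by equals:         " + isFalseByEquals(null));
    }
}
```

With the reference check, a constant whose value is false but which is not the shared `Boolean.FALSE` instance would be treated as "not false", so the quoted code would return null and skip the case it was written to handle.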
[jira] [Resolved] (HIVE-7332) Create SparkClient, interface to Spark cluster
[ https://issues.apache.org/jira/browse/HIVE-7332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengxiang Li resolved HIVE-7332. - Resolution: Fixed Create SparkClient, interface to Spark cluster -- Key: HIVE-7332 URL: https://issues.apache.org/jira/browse/HIVE-7332 Project: Hive Issue Type: Sub-task Reporter: Xuefu Zhang Assignee: Chengxiang Li SparkClient is responsible for Spark job submission, monitoring, progress and error reporting, etc. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7332) Create SparkClient, interface to Spark cluster
[ https://issues.apache.org/jira/browse/HIVE-7332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092479#comment-14092479 ] Chengxiang Li commented on HIVE-7332: - Added HIVE-7438 and HIVE-7439 to track job status monitoring and statistics. Closing this issue. Create SparkClient, interface to Spark cluster -- Key: HIVE-7332 URL: https://issues.apache.org/jira/browse/HIVE-7332 Project: Hive Issue Type: Sub-task Reporter: Xuefu Zhang Assignee: Chengxiang Li SparkClient is responsible for Spark job submission, monitoring, progress and error reporting, etc. -- This message was sent by Atlassian JIRA (v6.2#6252)