Hive-trunk-hadoop2 - Build # 563 - Still Failing
Changes for Build #531 [thejas] HIVE-5483 : use metastore statistics to optimize max/min/etc. queries (Ashutosh Chauhan via Thejas Nair) [daijy] HIVE-5510: [WebHCat] GET job/queue return wrong job information [brock] HIVE-5610 - Merge maven branch into trunk (delete ant) [brock] HIVE-5610 - Merge maven branch into trunk (maven rollforward) [brock] HIVE-5610 - Merge maven branch into trunk (patch) [hashutosh] HIVE-5693 : Rewrite some tests to reduce test time (Navis via Ashutosh Chauhan) [hashutosh] HIVE-5582 : Implement BETWEEN filter in vectorized mode (Eric Hanson via Ashutosh Chauhan) [hashutosh] HIVE-5556 : Pushdown join conditions (Harish Butani via Ashutosh Chauhan) Changes for Build #532 [brock] HIVE-5716 - Fix broken tests after maven merge (1) (Brock Noland reviewed by Thejas M Nair and Ashutosh Chauhan) Changes for Build #533 [hashutosh] HIVE-3959 : Update Partition Statistics in Metastore Layer (Ashutosh Chauhan, Bhushan Mandhani, Gang Tim Liu via Thejas Nair) Changes for Build #534 [hashutosh] HIVE-5503 : TopN optimization in VectorReduceSink (Sergey Shelukhin via Ashutosh Chauhan) [brock] HIVE-5695 - PTest2 fix shutdown, duplicate runs, and add client retry [brock] HIVE-5708 - PTest2 should trim long logs when posting to jira Changes for Build #535 [thejas] HIVE-5547 : webhcat pig job submission should ship hive tar if -usehcatalog is specified (Eugene Koifman via Thejas Nair) [thejas] HIVE-5715 : HS2 should not start a session for every command (Gunther Hagleitner via Thejas Nair) Changes for Build #536 Changes for Build #537 [brock] HIVE-5740: Tar files should extract to the directory of the same name minus tar.gz (Brock Noland reviewed by Xuefu Zhang) [brock] HIVE-5611: Add assembly (i.e.) 
tar creation to pom (Szehon Ho via Brock Noland) [brock] HIVE-5707: Validate values for ConfVar (Navis via Brock Noland) [brock] HIVE-5721: Incremental build is disabled by MCOMPILER-209 (Navis via Brock Noland) Changes for Build #538 [brock] HIVE-5354 - Decimal precision/scale support in ORC file (Xuefu Zhang via Brock Noland) [brock] HIVE-4523 - round() function with specified decimal places not consistent with mysql (Xuefu Zhang via Brock Noland) [thejas] HIVE-5542 : Webhcat is failing to run ddl command on a secure cluster (Sushanth Sowmyan via Thejas Nair) Changes for Build #539 [brock] HIVE-5730: Beeline throws non-terminal NPE upon starting, after mavenization (Szehon Ho reviewed by Navis) Changes for Build #540 [omalley] HIVE-5425 Provide a configuration option to control the default stripe size for ORC. (omalley reviewed by gunther) [omalley] Revert HIVE-5583 since it broke the build. [hashutosh] HIVE-5583 : Implement support for IN (list-of-constants) filter in vectorized mode (Eric Hanson via Ashutosh Chauhan) [brock] HIVE-5355 - JDBC support for decimal precision/scale Changes for Build #541 [hashutosh] HIVE-5583 : Implement support for IN (list-of-constants) filter in vectorized mode (Eric Hanson via Ashutosh Chauhan) [brock] HIVE-5773 - Fix build due to conflict between HIVE-5711 and HIVE-5713 [brock] HIVE-5711 - Fix eclipse:eclipse maven goal (Carl Steinbach via Brock Noland) [brock] HIVE-5752 - log4j properties appear to have been lost in maven upgrade (Sergey Shelukhin via Brock Noland) [brock] HIVE-5713 - Verify versions of libraries post maven merge (Brock Noland reviewed by Gunther Hagleitner) [brock] HIVE-5765 - Beeline throws NPE when -e option is used (Szehon Ho via Brock Noland) [xuefu] HIVE-5726: The DecimalTypeInfo instance associated with a decimal constant is not in line with the precision/scale of the constant (reviewed by Brock) [xuefu] HIVE-5655: Hive incorrecly handles divide-by-zero case (reviewed by Edward and Brock) [xuefu] 
HIVE-5191: Add char data type (Jason via Xuefu) Changes for Build #542 [brock] HIVE-5351 - Secure-Socket-Layer (SSL) support for HiveServer2 (Prasad Mujumdar via Brock Noland) Changes for Build #543 [brock] HIVE-5780 - Add the missing declaration of HIVE_CLI_SERVICE_PROTOCOL_V4 in TCLIService.thrift (Prasad Mujumdar via Brock Noland) Changes for Build #544 [gunther] HIVE-5601: NPE in ORC's PPD when using select * from table with where predicate (Prasanth J via Owen O'Malley and Gunther Hagleitner) [gunther] HIVE-5562: Provide stripe level column statistics in ORC (Patch by Prasanth J, reviewed by Owen O'Malley, committed by Gunther Hagleitner) [hashutosh] HIVE-3777 : add a property in the partition to figure out if stats are accurate (Ashutosh Chauhan via Thejas Nair) Changes for Build #545 [hashutosh] HIVE-5691 : Intermediate columns are incorrectly initialized for partitioned tables. (Jitendra Nath Pandey via Gunther Hagleitner) [hashutosh] HIVE-5779 : Subquery in where clause with distinct fails with mapjoin turned on with serialization error. (Ashutosh Chauhan via Harish Butani) [gunther] HIVE-5632 (partial): Adding test data to data/files to enable
Hive-trunk-h0.21 - Build # 2464 - Still Failing
Changes for Build #2434 [hashutosh] HIVE-3959 : Update Partition Statistics in Metastore Layer (Ashutosh Chauhan, Bhushan Mandhani, Gang Tim Liu via Thejas Nair) Changes for Build #2435 [hashutosh] HIVE-5503 : TopN optimization in VectorReduceSink (Sergey Shelukhin via Ashutosh Chauhan) [brock] HIVE-5695 - PTest2 fix shutdown, duplicate runs, and add client retry [brock] HIVE-5708 - PTest2 should trim long logs when posting to jira Changes for Build #2436 [thejas] HIVE-5547 : webhcat pig job submission should ship hive tar if -usehcatalog is specified (Eugene Koifman via Thejas Nair) [thejas] HIVE-5715 : HS2 should not start a session for every command (Gunther Hagleitner via Thejas Nair) Changes for Build #2437 Changes for Build #2438 [brock] HIVE-5740: Tar files should extract to the directory of the same name minus tar.gz (Brock Noland reviewed by Xuefu Zhang) [brock] HIVE-5611: Add assembly (i.e.) tar creation to pom (Szehon Ho via Brock Noland) [brock] HIVE-5707: Validate values for ConfVar (Navis via Brock Noland) [brock] HIVE-5721: Incremental build is disabled by MCOMPILER-209 (Navis via Brock Noland) Changes for Build #2439 [brock] HIVE-5354 - Decimal precision/scale support in ORC file (Xuefu Zhang via Brock Noland) [brock] HIVE-4523 - round() function with specified decimal places not consistent with mysql (Xuefu Zhang via Brock Noland) [thejas] HIVE-5542 : Webhcat is failing to run ddl command on a secure cluster (Sushanth Sowmyan via Thejas Nair) Changes for Build #2440 [brock] HIVE-5730: Beeline throws non-terminal NPE upon starting, after mavenization (Szehon Ho reviewed by Navis) Changes for Build #2441 [omalley] HIVE-5425 Provide a configuration option to control the default stripe size for ORC. (omalley reviewed by gunther) [omalley] Revert HIVE-5583 since it broke the build. 
[hashutosh] HIVE-5583 : Implement support for IN (list-of-constants) filter in vectorized mode (Eric Hanson via Ashutosh Chauhan) [brock] HIVE-5355 - JDBC support for decimal precision/scale Changes for Build #2443 [brock] HIVE-5351 - Secure-Socket-Layer (SSL) support for HiveServer2 (Prasad Mujumdar via Brock Noland) [hashutosh] HIVE-5583 : Implement support for IN (list-of-constants) filter in vectorized mode (Eric Hanson via Ashutosh Chauhan) [brock] HIVE-5773 - Fix build due to conflict between HIVE-5711 and HIVE-5713 [brock] HIVE-5711 - Fix eclipse:eclipse maven goal (Carl Steinbach via Brock Noland) [brock] HIVE-5752 - log4j properties appear to have been lost in maven upgrade (Sergey Shelukhin via Brock Noland) [brock] HIVE-5713 - Verify versions of libraries post maven merge (Brock Noland reviewed by Gunther Hagleitner) [brock] HIVE-5765 - Beeline throws NPE when -e option is used (Szehon Ho via Brock Noland) [xuefu] HIVE-5726: The DecimalTypeInfo instance associated with a decimal constant is not in line with the precision/scale of the constant (reviewed by Brock) [xuefu] HIVE-5655: Hive incorrecly handles divide-by-zero case (reviewed by Edward and Brock) [xuefu] HIVE-5191: Add char data type (Jason via Xuefu) Changes for Build #2444 [brock] HIVE-5780 - Add the missing declaration of HIVE_CLI_SERVICE_PROTOCOL_V4 in TCLIService.thrift (Prasad Mujumdar via Brock Noland) Changes for Build #2445 [gunther] HIVE-5601: NPE in ORC's PPD when using select * from table with where predicate (Prasanth J via Owen O'Malley and Gunther Hagleitner) [gunther] HIVE-5562: Provide stripe level column statistics in ORC (Patch by Prasanth J, reviewed by Owen O'Malley, committed by Gunther Hagleitner) [hashutosh] HIVE-3777 : add a property in the partition to figure out if stats are accurate (Ashutosh Chauhan via Thejas Nair) Changes for Build #2446 [hashutosh] HIVE-5691 : Intermediate columns are incorrectly initialized for partitioned tables. 
(Jitendra Nath Pandey via Gunther Hagleitner) [hashutosh] HIVE-5779 : Subquery in where clause with distinct fails with mapjoin turned on with serialization error. (Ashutosh Chauhan via Harish Butani) [gunther] HIVE-5632 (partial): Adding test data to data/files to enable pre-commit tests to run. (Prasanth J via Gunther Hagleitner) Changes for Build #2447 [cws] HIVE-5786: Remove HadoopShims methods that were needed for pre-Hadoop 0.20 (Jason Dere via cws) [thejas] HIVE-5229 : Better thread management for HiveServer2 async threads (Vaibhav Gumashta via Thejas Nair) [gunther] HIVE-5745: TestHiveLogging is failing (at least on mac) (Gunther Hagleitner, reviewed by Ashutosh Chauhan) Changes for Build #2448 [hashutosh] HIVE-5699 : Add unit test for vectorized BETWEEN for timestamp inputs (Eric Hanson via Ashutosh Chauhan) [hashutosh] HIVE-5767 : in SemanticAnalyzer#doPhase1, handling for TOK_UNION falls thru into TOK_INSERT (Sergey Shelukhin via Ashutosh Chauhan) [hashutosh] HIVE-5657 : TopN produces incorrect results with count(distinct) (Sergey Shelukhin via Ashutosh Chauhan) Changes for Build #2450
[jira] [Commented] (HIVE-5849) Improve the stats of operators based on heuristics in the absence of any column statistics
[ https://issues.apache.org/jira/browse/HIVE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13829832#comment-13829832 ] Hive QA commented on HIVE-5849: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12615235/HIVE-5849.6.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 4680 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_partition_skip_default {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/396/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/396/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12615235 Improve the stats of operators based on heuristics in the absence of any column statistics -- Key: HIVE-5849 URL: https://issues.apache.org/jira/browse/HIVE-5849 Project: Hive Issue Type: Sub-task Components: Query Processor, Statistics Reporter: Prasanth J Assignee: Prasanth J Fix For: 0.13.0 Attachments: HIVE-5849.1.patch.txt, HIVE-5849.2.patch.txt, HIVE-5849.3.patch, HIVE-5849.3.patch.txt, HIVE-5849.4.javaonly.patch, HIVE-5849.5.patch, HIVE-5849.6.patch In the absence of any column statistics, an operator simply uses the statistics from its parents. It is useful to apply some heuristics to update the basic statistics (number of rows and data size) in that case. This will be the worst-case scenario. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5827) Incorrect location of logs for failed tests.
[ https://issues.apache.org/jira/browse/HIVE-5827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13829926#comment-13829926 ] Hive QA commented on HIVE-5827: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12615264/HIVE-5827.2.patch {color:green}SUCCESS:{color} +1 4680 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/397/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/397/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12615264 Incorrect location of logs for failed tests. - Key: HIVE-5827 URL: https://issues.apache.org/jira/browse/HIVE-5827 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5827.1.patch, HIVE-5827.2.patch Extending HIVE-5790 to fix other tests. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5871) Use multiple-characters as field delimiter
Rui Li created HIVE-5871: Summary: Use multiple-characters as field delimiter Key: HIVE-5871 URL: https://issues.apache.org/jira/browse/HIVE-5871 Project: Hive Issue Type: Improvement Components: Contrib Affects Versions: 0.12.0 Reporter: Rui Li Add a new SerDe named MultiDelimitSerDe. With MultiDelimitSerDe, users can specify a multiple-character field delimiter when creating tables. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5871) Use multiple-characters as field delimiter
[ https://issues.apache.org/jira/browse/HIVE-5871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-5871: - Attachment: HIVE-5871.patch This implementation mainly relies on LazySimpleSerDe for serialization and deserialization. I added some methods to LazyStruct to parse a row delimited by a multiple-character string. Another difference from LazySimpleSerDe is that MultiDelimitSerDe doesn't use Base64 to encode binary fields in serialization, because the encoded string may interfere with the delimiter. I also modified LazyBinary, so that when it deserializes a binary field and is unable to Base64 decode the field, it just keeps the data unchanged. A simple use case is as follows: create table test (id string, hivearray array<binary>, hivemap map<string,int>) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' WITH SERDEPROPERTIES ("field.delimited"="[,]", "collection.delimited"=":", "mapkey.delimited"="@"); where field.delimited is the multiple-char field delimiter, collection.delimited is the delimiter for collection items, and mapkey.delimited is the delimiter for keys and values in maps. We currently don't support multiple-char delimiters for the latter two. Use multiple-characters as field delimiter -- Key: HIVE-5871 URL: https://issues.apache.org/jira/browse/HIVE-5871 Project: Hive Issue Type: Improvement Components: Contrib Affects Versions: 0.12.0 Reporter: Rui Li Attachments: HIVE-5871.patch Add a new SerDe named MultiDelimitSerDe. With MultiDelimitSerDe, users can specify a multiple-character field delimiter when creating tables. -- This message was sent by Atlassian JIRA (v6.1#6144)
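As a rough illustration of the row layout described in this update, the Python sketch below splits one text row using a multiple-character field delimiter plus single-character collection and map-key delimiters. It is not the MultiDelimitSerDe implementation from the patch; the function name `parse_row`, the sample row, and the simplified column types (string, list of strings, map of string to int) are assumptions made for the example.

```python
def parse_row(line, field_delim="[,]", coll_delim=":", mapkey_delim="@"):
    """Split a text row into (id, list, dict) using the three delimiters."""
    # str.split treats a multi-character separator as one literal token.
    id_field, array_field, map_field = line.split(field_delim)
    items = array_field.split(coll_delim) if array_field else []
    mapping = {}
    if map_field:
        for entry in map_field.split(coll_delim):
            # Each map entry looks like key@value.
            key, _, value = entry.partition(mapkey_delim)
            mapping[key] = int(value)
    return id_field, items, mapping


# Hypothetical sample row: id, then array items, then map entries.
row = "1[,]a:b:c[,]x@1:y@2"
print(parse_row(row))
```

The sketch also shows why Base64 output could clash with a delimiter: any encoded field that happened to contain the delimiter text would be split in the wrong place.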
[jira] [Created] (HIVE-5872) Make UDAFs such as GenericUDAFSum report accurate precision/scale for decimal types
Xuefu Zhang created HIVE-5872: - Summary: Make UDAFs such as GenericUDAFSum report accurate precision/scale for decimal types Key: HIVE-5872 URL: https://issues.apache.org/jira/browse/HIVE-5872 Project: Hive Issue Type: Improvement Components: Types, UDF Affects Versions: 0.12.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Currently, UDAFs still report the system default precision/scale (38, 18) for decimal results. Not only is this coarse, but it can also cause problems in subsequent operators such as division, where the result depends on the precision/scale of the input and can go out of bounds (38, 38). Thus, these UDAFs should correctly report the precision/scale of the result. -- This message was sent by Atlassian JIRA (v6.1#6144)
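To make the idea concrete, here is a small Python sketch of what "reporting an accurate precision/scale" for a sum could look like. The rule used (keep the input scale, widen the precision by a fixed headroom, cap at 38 digits) is purely a hypothetical assumption for illustration and is not taken from the issue or from any patch; `sum_result_type` and `headroom` are made-up names.

```python
MAX_PRECISION = 38  # maximum decimal precision mentioned in the issue


def sum_result_type(precision, scale, headroom=10):
    """Return a (precision, scale) pair for SUM over decimal(precision, scale).

    Hypothetical rule: the scale is unchanged and the precision grows by
    `headroom` digits to leave room for accumulation, capped at the maximum.
    """
    return min(MAX_PRECISION, precision + headroom), scale


# A narrow input column gets a narrow result type instead of (38, 18).
print(sum_result_type(10, 2))
# An input already at the maximum precision stays capped.
print(sum_result_type(38, 18))
```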
import proto buffer file in a python transform script error(Very urgent)!
Hi, I use Hive's TRANSFORM to analyze logs, and some fields of the logs are protocol buffer (pb) encoded. Here is what I did: 1) The Hive SQL is: add file hive_shift_parse.py add file locationShift_pb2.py select transform(log) using 'python hive_shift_parse.py' as sm_datetime,sm_appid,sm_language,sm_iosMaxVersion,sm_iosMinVersion,sm_messageid,sm_logtype,sm_request,sm_response,sm_status,sm_responsetime,sm_ip,sm_province,sm_city,sm_town,sm_day_time from snowman_service_raw; 2) The imports in hive_shift_parse.py are: import urllib2,sys,os,re,datetime,json,time,math import fileinput import base64 import locationShift_pb2 3) locationShift_pb2.py is a pb file. 4) The run fails with the following error: Traceback (most recent call last): File hive_shift_parse.py.py, line 6, in ? import locationShift_pb2 ImportError: No module named locationShift_pb2 I searched on Google but could not solve it. I guess the problem is in loading the protocol buffer (pb) module. Thanks for your help.
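One thing that may be worth checking is whether the directory holding the shipped files is on Python's module search path when the task runs. The sketch below is only a diagnostic idea, not a confirmed fix: it assumes the files added with "add file" end up next to the script or in the task's working directory, and the helper name `add_local_import_paths` is made up for the example.

```python
import os
import sys


def add_local_import_paths(script_file):
    """Put the script's directory and the working directory first on sys.path."""
    candidates = [os.path.dirname(os.path.abspath(script_file)), os.getcwd()]
    for path in reversed(candidates):
        if path not in sys.path:
            sys.path.insert(0, path)
    return candidates


paths = add_local_import_paths(sys.argv[0] if sys.argv[0] else ".")
try:
    import locationShift_pb2  # module name taken from the question
except ImportError:
    # Report where we looked and which files are actually present, so the
    # task's stderr log shows whether the file was shipped at all.
    locationShift_pb2 = None
    sys.stderr.write("searched: %s\n" % (paths,))
    sys.stderr.write("files here: %s\n" % (sorted(os.listdir(os.getcwd())),))
```

If the file does show up in the listing and the import still fails, the next suspects would be a missing protobuf runtime on the worker nodes or a different Python version there than on the client.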
Re: [ANNOUNCE] New Hive Committers - Jitendra Nath Pandey and Eric Hanson
Congrats to both of you.. On Fri, Nov 22, 2013 at 1:26 PM, Lefty Leverenz leftylever...@gmail.com wrote: Congratulations, Jitendra and Eric! The more the merrier. -- Lefty On Thu, Nov 21, 2013 at 6:31 PM, Jarek Jarcec Cecho jar...@apache.org wrote: Congratulations, good job! Jarcec On Thu, Nov 21, 2013 at 03:29:07PM -0800, Carl Steinbach wrote: The Apache Hive PMC has voted to make Jitendra Nath Pandey and Eric Hanson committers on the Apache Hive project. Please join me in congratulating Jitendra and Eric! Thanks. Carl -- _ The information contained in this communication is intended solely for the use of the individual or entity to whom it is addressed and others authorized to receive it. It may contain confidential or legally privileged information. If you are not the intended recipient you are hereby notified that any disclosure, copying, distribution or taking any action in reliance on the contents of this information is strictly prohibited and may be unlawful. If you have received this communication in error, please notify us immediately by responding to this email and then delete it from your system. The firm is neither liable for the proper and complete transmission of the information contained in this communication nor for any delay in its receipt.
Re: [ANNOUNCE] New Hive Committers - Jitendra Nath Pandey and Eric Hanson
Congrats! On Nov 22, 2013, at 2:25 AM, Biswajit Nayak biswajit.na...@inmobi.com wrote: Congrats to both of you.. On Fri, Nov 22, 2013 at 1:26 PM, Lefty Leverenz leftylever...@gmail.com wrote: Congratulations, Jitendra and Eric! The more the merrier. -- Lefty On Thu, Nov 21, 2013 at 6:31 PM, Jarek Jarcec Cecho jar...@apache.org wrote: Congratulations, good job! Jarcec On Thu, Nov 21, 2013 at 03:29:07PM -0800, Carl Steinbach wrote: The Apache Hive PMC has voted to make Jitendra Nath Pandey and Eric Hanson committers on the Apache Hive project. Please join me in congratulating Jitendra and Eric! Thanks. Carl -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
hcat tests with MVN
Hi, I've noticed a couple of problems. 1) Running mvn test from hcatalog/ runs the tests under the core and hcatalog-pig-adapter submodules but not those of any other modules (webhcat/java-client, webhcat/svr, etc.). However, if I cd to the appropriate submodule and run mvn test, the tests are run. 2) mvn surefire-report:report from hcatalog generates .html files with test results in ./target/site/surefire-report.html of each submodule, but not a single .html that includes the results for all tests. (hcatalog/target/site/surefire-report.html is generated but contains 0 tests.) Does anyone have suggestions on how to fix these? Thanks, Eugene
[jira] [Updated] (HIVE-5833) Remove versions from child module dependencies
[ https://issues.apache.org/jira/browse/HIVE-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated HIVE-5833: - Status: Patch Available (was: Open) Remove versions from child module dependencies -- Key: HIVE-5833 URL: https://issues.apache.org/jira/browse/HIVE-5833 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Attachments: HIVE-5833.2.patch, HIVE-5833.patch HIVE-5741 moved all dependencies to the plugin management section of the parent pom, therefore we can remove {noformat}<version>${dep.version}</version>{noformat} from all dependencies in child modules. -- This message was sent by Atlassian JIRA (v6.1#6144)
Re: [ANNOUNCE] New Hive Committers - Jitendra Nath Pandey and Eric Hanson
Congrats to both of you! On Fri, Nov 22, 2013 at 9:34 AM, Jason Dere jd...@hortonworks.com wrote: Congrats! On Nov 22, 2013, at 2:25 AM, Biswajit Nayak biswajit.na...@inmobi.com wrote: Congrats to both of you.. On Fri, Nov 22, 2013 at 1:26 PM, Lefty Leverenz leftylever...@gmail.com wrote: Congratulations, Jitendra and Eric! The more the merrier. -- Lefty On Thu, Nov 21, 2013 at 6:31 PM, Jarek Jarcec Cecho jar...@apache.org wrote: Congratulations, good job! Jarcec On Thu, Nov 21, 2013 at 03:29:07PM -0800, Carl Steinbach wrote: The Apache Hive PMC has voted to make Jitendra Nath Pandey and Eric Hanson committers on the Apache Hive project. Please join me in congratulating Jitendra and Eric! Thanks. Carl
Re: Review Request 15718: HIVE-5614: Subquery support: allow subquery expressions in having clause
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15718/#review29298 --- ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java https://reviews.apache.org/r/15718/#comment56454 It would be good to add a comment about the various fields in the Conjunct class. ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java https://reviews.apache.org/r/15718/#comment56455 Can this constructor be package protected instead of public? ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java https://reviews.apache.org/r/15718/#comment56456 protected instead of public? ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java https://reviews.apache.org/r/15718/#comment56457 It would be good to add a comment on how the behavior of ConjunctAnalyzer changes when forHavingClause = true instead of false. ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java https://reviews.apache.org/r/15718/#comment56458 Does this exception need to be propagated up the stack? At the least, we should have a LOG.warn() message here. ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java https://reviews.apache.org/r/15718/#comment56452 It would be good to add a comment here along the lines of: there could be a subq in the having clause; if so, we need to generate the subq plan followed by a semi-join. ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java https://reviews.apache.org/r/15718/#comment56459 It would be good to add a comment on how this new boolean changes the behavior of this method. ql/src/test/queries/clientpositive/subquery_in_having.q https://reviews.apache.org/r/15718/#comment56453 It would be good to add a test which has a subq in both the where clause and the having clause. ql/src/test/queries/clientpositive/subquery_in_having.q https://reviews.apache.org/r/15718/#comment56448 Same comment w.r.t. map-join on. Also, if we support the over clause in subq, it would be good to have a test for that. 
ql/src/test/queries/clientpositive/subquery_notexists_having.q https://reviews.apache.org/r/15718/#comment56449 It would be good to add a negative test where the subq and the outer query both use the same table alias. It seems in such cases we may generate incorrect results, so we should disable those. ql/src/test/results/clientpositive/subquery_in_having.q.out https://reviews.apache.org/r/15718/#comment56450 In this plan, we are first computing outq, then subq, and then doing a left semi-join on the result sets of those two. As we discussed, an efficient way to do this is to push the filter conditions in subq to the outer query to cut down the output generated by outq. Though, I am not sure whether it's better to do it in the optimizer phase via a Transformer or right here. Either way, I think that's an optimization which we can do as a follow-up. ql/src/test/results/clientpositive/subquery_notexists_having.q.out https://reviews.apache.org/r/15718/#comment56451 The first expression in this filter is redundant. That's not strictly required. However, since there is active work going on for the constant folding optimization, this may get optimized away via that optimization. Either way, this can be done in a follow-up. - Ashutosh Chauhan On Nov. 20, 2013, 6:04 p.m., Harish Butani wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15718/ --- (Updated Nov. 20, 2013, 6:04 p.m.) Review request for hive and Ashutosh Chauhan. Bugs: HIVE-5614 https://issues.apache.org/jira/browse/HIVE-5614 Repository: hive-git Description --- support for subquery predicates in having clause. 
SubTask of HIVE-784 Diffs - ql/src/java/org/apache/hadoop/hive/ql/parse/QB.java fa111cc ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java 3e8215d ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 7979873 ql/src/test/queries/clientpositive/subquery_exists_having.q PRE-CREATION ql/src/test/queries/clientpositive/subquery_in_having.q PRE-CREATION ql/src/test/queries/clientpositive/subquery_notexists_having.q PRE-CREATION ql/src/test/queries/clientpositive/subquery_notin_having.q PRE-CREATION ql/src/test/results/clientpositive/subquery_exists_having.q.out PRE-CREATION ql/src/test/results/clientpositive/subquery_in_having.q.out PRE-CREATION ql/src/test/results/clientpositive/subquery_multiinsert.q.out 8dfb485 ql/src/test/results/clientpositive/subquery_notexists_having.q.out PRE-CREATION ql/src/test/results/clientpositive/subquery_notin_having.q.out PRE-CREATION Diff: https://reviews.apache.org/r/15718/diff/ Testing --- added new tests:
[jira] [Updated] (HIVE-5614) Subquery support: allow subquery expressions in having clause
[ https://issues.apache.org/jira/browse/HIVE-5614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-5614: --- Status: Open (was: Patch Available) Some comments on RB. Subquery support: allow subquery expressions in having clause - Key: HIVE-5614 URL: https://issues.apache.org/jira/browse/HIVE-5614 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-5614.1.patch, HIVE-5614.2.patch, HIVE-5614.3.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4518) Counter Strike: Operation Operator
[ https://issues.apache.org/jira/browse/HIVE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-4518: - Attachment: HIVE-4518.11.patch patch v11 - counter names don't need to be configurable. Also rebase with trunk Counter Strike: Operation Operator -- Key: HIVE-4518 URL: https://issues.apache.org/jira/browse/HIVE-4518 Project: Hive Issue Type: Improvement Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-4518.1.patch, HIVE-4518.10.patch, HIVE-4518.11.patch, HIVE-4518.2.patch, HIVE-4518.3.patch, HIVE-4518.4.patch, HIVE-4518.5.patch, HIVE-4518.6.patch.txt, HIVE-4518.7.patch, HIVE-4518.8.patch, HIVE-4518.9.patch Queries of the form: from foo insert overwrite table bar partition (p) select ... insert overwrite table bar partition (p) select ... insert overwrite table bar partition (p) select ... generate a huge number of counters. The reason is that task.progress is turned on for dynamic partitioning queries. The counters not only make queries slower than necessary (by up to 50%), but you will also eventually run out of them. That's because we're wrapping them in enum values to comply with hadoop 0.17. The real reason we turn task.progress on is that we need the CREATED_FILES and FATAL counters to ensure dynamic partitioning queries don't go haywire. The counters have counter-intuitive names like C1 through C1000 and don't seem really useful by themselves. With hadoop 20+ you don't need to wrap the counters anymore; each operator can simply create and increment counters. That should simplify the code a lot. -- This message was sent by Atlassian JIRA (v6.1#6144)
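The last point, each operator creating and incrementing its own named counters instead of drawing from a fixed pool of enum slots, can be sketched in a few lines of Python. This is only a toy model of the idea and not code from the patch or the Hadoop counters API; the `Operator` class, the `FS_3` operator name, and the dotted naming scheme are assumptions made for the example.

```python
from collections import Counter


class Operator:
    """Toy operator that owns readable, per-operator counter names."""

    def __init__(self, name, counters):
        self.name = name
        self.counters = counters  # counter store shared by the whole task

    def increment(self, counter, amount=1):
        # The counter comes into existence on first use, so there is no
        # fixed pool of slots (C1 through C1000) that can run out.
        self.counters["%s.%s" % (self.name, counter)] += amount


task_counters = Counter()
fs_op = Operator("FS_3", task_counters)  # hypothetical operator name
fs_op.increment("CREATED_FILES")
fs_op.increment("CREATED_FILES")
print(dict(task_counters))
```

Only the counters that are actually incremented exist, and their names say which operator produced them.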
[jira] [Commented] (HIVE-5817) column name to index mapping in VectorizationContext is broken
[ https://issues.apache.org/jira/browse/HIVE-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830247#comment-13830247 ] Remus Rusanu commented on HIVE-5817: I think the only real problem operator is JOIN. It is not necessarily 'one VC (VectorizationContext) per operator' but more like 'one VC per query region', where a query region is defined by boundaries between different VS requirements (basically different result shapes). An operator like JOIN is one that clearly introduces a boundary, and the interesting part is that it needs two vectorization contexts: one for its input(s) and one for its output. So it would be more along the lines that during vectorization each operator takes a VC (for its input, provided by its parent operator) and gives out a VC for its output, for its child operators to consume. Most operators would give out the same VC they get as input (i.e. they do not change shape). And there is serialization too, which is handled separately (as properties added to the Map). I'll try to come up with actual code over this weekend. column name to index mapping in VectorizationContext is broken -- Key: HIVE-5817 URL: https://issues.apache.org/jira/browse/HIVE-5817 Project: Hive Issue Type: Bug Components: Vectorization Reporter: Sergey Shelukhin Assignee: Remus Rusanu Priority: Critical Attachments: HIVE-5817-uniquecols.broken.patch, HIVE-5817.00-broken.patch Columns coming from different operators may have the same internal names (_colNN). There exists a query in the form {{select b.cb, a.ca from a JOIN b ON ... JOIN x ON ...;}} (distilled from a more complex query), which runs ok w/o vectorization. With vectorization, it will run ok for most ca, but for some ca it will fail (or can probably return incorrect results). 
That is because when building column-to-VRG-index map in VectorizationContext, internal column name for ca that the first map join operator adds to the mapping may be the same as internal name for cb that the 2nd one tries to add. 2nd VMJ doesn't add it (see code in ctor), and when it's time for it to output stuff, it retrieves wrong index from the map by name, and then wrong vector from VRG. -- This message was sent by Atlassian JIRA (v6.1#6144)
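The collision described in HIVE-5817 can be sketched in a few lines. This is a minimal illustration, not Hive's VectorizationContext code: the class `SharedNameMap`, the helper `per_operator_maps`, the operator labels and the vector indexes are all hypothetical, chosen only to show how a shared name-to-index map hands the second join the first join's slot, and how keeping a separate map per operator (roughly the "one VC per output" idea from the comment) avoids it.

```python
class SharedNameMap:
    """Toy stand-in for a single column-name-to-index map shared by all operators."""

    def __init__(self):
        self.index_by_name = {}

    def add_output_column(self, name, index):
        # Models the behaviour described in the report: if the name is
        # already present, the new index is silently dropped.
        self.index_by_name.setdefault(name, index)

    def lookup(self, name):
        return self.index_by_name[name]


def per_operator_maps(outputs):
    """Alternative: each operator keeps its own map, so names cannot collide."""
    return {op: dict(cols) for op, cols in outputs.items()}


# Hypothetical layout: both joins emit an internal column named "_col0".
outputs = {
    "join1": {"_col0": 2},  # a.ca lives in vector 2
    "join2": {"_col0": 5},  # b.cb lives in vector 5
}

shared = SharedNameMap()
for op in ("join1", "join2"):
    for name, idx in outputs[op].items():
        shared.add_output_column(name, idx)

wrong_index = shared.lookup("_col0")  # join2 ends up reading join1's slot
right_index = per_operator_maps(outputs)["join2"]["_col0"]
print(wrong_index, right_index)
```

With the shared map the second join resolves `_col0` to the first join's vector, which matches the "wrong index, then wrong vector" symptom in the description.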
[jira] [Assigned] (HIVE-5817) column name to index mapping in VectorizationContext is broken
[ https://issues.apache.org/jira/browse/HIVE-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu reassigned HIVE-5817: -- Assignee: Remus Rusanu (was: Sergey Shelukhin) column name to index mapping in VectorizationContext is broken -- Key: HIVE-5817 URL: https://issues.apache.org/jira/browse/HIVE-5817 Project: Hive Issue Type: Bug Components: Vectorization Reporter: Sergey Shelukhin Assignee: Remus Rusanu Priority: Critical Attachments: HIVE-5817-uniquecols.broken.patch, HIVE-5817.00-broken.patch Columns coming from different operators may have the same internal names (_colNN). There exists a query in the form {{select b.cb, a.ca from a JOIN b ON ... JOIN x ON ...;}} (distilled from a more complex query), which runs ok w/o vectorization. With vectorization, it will run ok for most ca, but for some ca it will fail (or can probably return incorrect results). That is because when building column-to-VRG-index map in VectorizationContext, internal column name for ca that the first map join operator adds to the mapping may be the same as internal name for cb that the 2nd one tries to add. 2nd VMJ doesn't add it (see code in ctor), and when it's time for it to output stuff, it retrieves wrong index from the map by name, and then wrong vector from VRG. -- This message was sent by Atlassian JIRA (v6.1#6144)
Re: hcat tests with MVN
To answer the 2nd question: mvn surefire-report:report -Daggregate=true (or report-only if tests already ran) On Fri, Nov 22, 2013 at 10:12 AM, Eugene Koifman ekoif...@hortonworks.com wrote: Hi, I've noticed a couple of problems. 1) running mvn tests from hcatalog/ runs the tests under core and hcatalog-pig-adapter submodules but not any of the other modules (webhcat/java-client, webhcat/svr, etc). Though if I cd to the appropriate submodule and run mvn test the tests are run. 2) mvn surefire-report:report from hcatalog generates .html files with test results in the ./target/site/surefire-report.html of each submodule, but not a single .html that includes results for all tests. (hcatalog/target/site/surefire-report.html is generated but contains 0 tests) Does anyone have suggestions on how to fix these? Thanks, Eugene
[jira] [Updated] (HIVE-5870) Move TestJDBCDriver2.testNewConnectionConfiguration to TestJDBCWithMiniHS2
[ https://issues.apache.org/jira/browse/HIVE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-5870: Attachment: HIVE-5870.patch Attaching the patch. Move TestJDBCDriver2.testNewConnectionConfiguration to TestJDBCWithMiniHS2 -- Key: HIVE-5870 URL: https://issues.apache.org/jira/browse/HIVE-5870 Project: Hive Issue Type: Bug Components: Tests Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-5870.patch TestJDBCDriver2.testNewConnectionConfiguration() attempts to start a Hiveserver2 instance in the test. This can cause issues as creating HiveServer2 needs correct environment/path. This test should be moved to TestJdbcWithMiniHS2, which uses MiniHS2. MiniHS2 is for this purpose (setting all the environment properly before starting HiveServer2 instance). -- This message was sent by Atlassian JIRA (v6.1#6144)
Review Request 15797: HIVE-5870 - Move TestJDBCDriver2.testNewConnectionConfiguration to TestJDBCWithMiniHS2
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15797/ --- Review request for hive. Bugs: HIVE-5870 https://issues.apache.org/jira/browse/HIVE-5870 Repository: hive-git Description --- TestJDBCDriver2.testNewConnectionConfiguration() attempts to start a HiveServer2 instance and test connecting to it. This can cause issues as creating HiveServer2 needs the correct environment. This test should be moved to TestJdbcWithMiniHS2, which uses MiniHS2. MiniHS2 is for this purpose, as it sets up all the environment properly before starting the HiveServer2 instance. This test now runs the same commands against the MiniHS2. In the course of refactoring, also changed TestJdbcWithMiniHS2's MiniHS2 creation from @Before to @BeforeClass (i.e., once per test class rather than once per test method), as calling init() multiple times on the HiveMetastore causes strange errors from DataNucleus/embedded Derby. Also, it is more efficient. Diffs - itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcDriver2.java 7b1c9da itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java 6c25736 Diff: https://reviews.apache.org/r/15797/diff/ Testing --- Ran affected unit tests. Thanks, Szehon Ho
[jira] [Updated] (HIVE-5849) Improve the stats of operators based on heuristics in the absence of any column statistics
[ https://issues.apache.org/jira/browse/HIVE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-5849: - Attachment: HIVE-5849.7.patch Removed changes to dynamic_partition_skip_default.q which caused the test failure. Improve the stats of operators based on heuristics in the absence of any column statistics -- Key: HIVE-5849 URL: https://issues.apache.org/jira/browse/HIVE-5849 Project: Hive Issue Type: Sub-task Components: Query Processor, Statistics Reporter: Prasanth J Assignee: Prasanth J Fix For: 0.13.0 Attachments: HIVE-5849.1.patch.txt, HIVE-5849.2.patch.txt, HIVE-5849.3.patch, HIVE-5849.3.patch.txt, HIVE-5849.4.javaonly.patch, HIVE-5849.5.patch, HIVE-5849.6.patch, HIVE-5849.7.patch In the absence of any column statistics, operators will simply use the statistics from their parents. It is useful to apply some heuristics to update basic statistics (number of rows and data size) in the absence of any column statistics. This will be the worst-case scenario. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5870) Move TestJDBCDriver2.testNewConnectionConfiguration to TestJDBCWithMiniHS2
[ https://issues.apache.org/jira/browse/HIVE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-5870: Affects Version/s: 0.13.0 Status: Patch Available (was: Open) Move TestJDBCDriver2.testNewConnectionConfiguration to TestJDBCWithMiniHS2 -- Key: HIVE-5870 URL: https://issues.apache.org/jira/browse/HIVE-5870 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.13.0 Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-5870.patch TestJDBCDriver2.testNewConnectionConfiguration() attempts to start a Hiveserver2 instance in the test. This can cause issues as creating HiveServer2 needs correct environment/path. This test should be moved to TestJdbcWithMiniHS2, which uses MiniHS2. MiniHS2 is for this purpose (setting all the environment properly before starting HiveServer2 instance). -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Assigned] (HIVE-3181) getDatabaseMajor/Minor version does not return values
[ https://issues.apache.org/jira/browse/HIVE-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho reassigned HIVE-3181: --- Assignee: Szehon Ho getDatabaseMajor/Minor version does not return values - Key: HIVE-3181 URL: https://issues.apache.org/jira/browse/HIVE-3181 Project: Hive Issue Type: Improvement Components: JDBC Reporter: N Campbell Assignee: Szehon Ho Fix For: 0.8.1 This is really a sub-issue of HIVE-3174 (which is a lot of properties) but given that the driver will return databaseProductVersion it makes no sense to not have implemented these as well. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5866) Hive divide operator generates wrong results in certain cases
[ https://issues.apache.org/jira/browse/HIVE-5866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-5866: -- Attachment: HIVE-5866.1.patch Hive divide operator generates wrong results in certain cases - Key: HIVE-5866 URL: https://issues.apache.org/jira/browse/HIVE-5866 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-5866.1.patch, HIVE-5866.patch The current GenericUDFOPDivide seems to have a bug. The following query generates a NULL result. {code} hive> select 4BD / 25BD from test limit 1; ... Total MapReduce CPU Time Spent: 890 msec OK NULL Time taken: 7.901 seconds, Fetched: 1 row(s) {code} The correct result should be 0.16 in this query. -- This message was sent by Atlassian JIRA (v6.1#6144)
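As a quick sanity check on the expected value in HIVE-5866, exact decimal arithmetic does give 0.16 for 4 / 25. The snippet below uses Python's standard `decimal` module purely as an independent calculator; it says nothing about how Hive's decimal type or GenericUDFOPDivide is implemented.

```python
from decimal import Decimal

# 4BD / 25BD in the report are decimal literals; mirror them with Decimal.
quotient = Decimal(4) / Decimal(25)
print(quotient)
```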
[jira] [Commented] (HIVE-5866) Hive divide operator generates wrong results in certain cases
[ https://issues.apache.org/jira/browse/HIVE-5866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830292#comment-13830292 ] Hive QA commented on HIVE-5866: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12615378/HIVE-5866.1.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 4681 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_short_regress {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/400/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/400/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12615378 Hive divide operator generates wrong results in certain cases - Key: HIVE-5866 URL: https://issues.apache.org/jira/browse/HIVE-5866 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-5866.1.patch, HIVE-5866.patch The current GenericUDFOPDivide seems to have a bug. The following query generates a NULL result. {code} hive> select 4BD / 25BD from test limit 1; ... Total MapReduce CPU Time Spent: 890 msec OK NULL Time taken: 7.901 seconds, Fetched: 1 row(s) {code} The correct result should be 0.16 in this query. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5873) SubQuery: In subquery Count Bug
Harish Butani created HIVE-5873: --- Summary: SubQuery: In subquery Count Bug Key: HIVE-5873 URL: https://issues.apache.org/jira/browse/HIVE-5873 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Harish Butani This is from the Optimization of Nested SQL Queries Revisited paper: http://dl.acm.org/citation.cfm?id=38723 Consider Part table having:
{noformat}
PNum  OrderOnHand
3     6
10    1
8     0
{noformat}
Supply table having:
{noformat}
PNum  Qty
3     4
3     2
10    1
{noformat}
The query:
{noformat}
select pnum from parts p where orderOnHand in (select count(*) from supply s where s.pnum = p.pnum )
{noformat}
should return the row with PNum=8. But a transformation to a semi-join would eliminate this row, as there are no rows in the supply table with PNum=8. As shown in the paper, the solution is to transform to:
{noformat}
select pnum from parts p semijoin (select p1.pnum, count(*) as c from (select distinct pnum from parts) p1 join supply s where s.pnum = p1.pnum ) sq on p.pnum = sq.pnum and p.orderOnHand = sq.c
{noformat}
The additional distinct query within the SubQuery is to handle duplicates in the outer query on the joining columns. -- This message was sent by Atlassian JIRA (v6.1#6144)
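The count bug in HIVE-5873 can be reproduced outside SQL with a small simulation. This is a sketch under stated assumptions: the sample rows are read as (PNum, OrderOnHand) pairs (3, 6), (10, 1), (8, 0) and (PNum, Qty) pairs (3, 4), (3, 2), (10, 1); the function names are hypothetical; and `outer_join_rewrite` defaults a missing count to zero in the style of an outer join, which is one way to preserve empty groups rather than a transcription of the rewrite quoted in the issue.

```python
from collections import Counter

# Sample rows, assuming the tables read as (PNum, OrderOnHand)
# and (PNum, Qty) pairs.
parts = [(3, 6), (10, 1), (8, 0)]
supply = [(3, 4), (3, 2), (10, 1)]


def correlated_in(parts, supply):
    """Reference semantics: evaluate count(*) once per outer row."""
    result = []
    for pnum, on_hand in parts:
        count = sum(1 for s_pnum, _ in supply if s_pnum == pnum)
        if on_hand == count:
            result.append(pnum)
    return result


def naive_semijoin(parts, supply):
    """Group supply first, then semi-join: parts with no supply rows vanish."""
    counts = Counter(s_pnum for s_pnum, _ in supply)
    return [p for p, on_hand in parts if p in counts and counts[p] == on_hand]


def outer_join_rewrite(parts, supply):
    """Keep every distinct outer pnum and default its count to zero."""
    counts = Counter(s_pnum for s_pnum, _ in supply)
    per_pnum = {p: counts.get(p, 0) for p in {p for p, _ in parts}}
    return [p for p, on_hand in parts if per_pnum[p] == on_hand]


expected = correlated_in(parts, supply)
lossy = naive_semijoin(parts, supply)
fixed = outer_join_rewrite(parts, supply)
print(expected, lossy, fixed)
```

Under this reading of the data, PNum=10 matches in all three variants; the row that separates them is PNum=8, which has an empty group in supply and therefore only survives when a count of zero is produced for it.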
Hive-trunk-hadoop2 - Build # 564 - Still Failing
Changes for Build #533 [hashutosh] HIVE-3959 : Update Partition Statistics in Metastore Layer (Ashutosh Chauhan, Bhushan Mandhani, Gang Tim Liu via Thejas Nair) Changes for Build #534 [hashutosh] HIVE-5503 : TopN optimization in VectorReduceSink (Sergey Shelukhin via Ashutosh Chauhan) [brock] HIVE-5695 - PTest2 fix shutdown, duplicate runs, and add client retry [brock] HIVE-5708 - PTest2 should trim long logs when posting to jira Changes for Build #535 [thejas] HIVE-5547 : webhcat pig job submission should ship hive tar if -usehcatalog is specified (Eugene Koifman via Thejas Nair) [thejas] HIVE-5715 : HS2 should not start a session for every command (Gunther Hagleitner via Thejas Nair) Changes for Build #536 Changes for Build #537 [brock] HIVE-5740: Tar files should extract to the directory of the same name minus tar.gz (Brock Noland reviewed by Xuefu Zhang) [brock] HIVE-5611: Add assembly (i.e.) tar creation to pom (Szehon Ho via Brock Noland) [brock] HIVE-5707: Validate values for ConfVar (Navis via Brock Noland) [brock] HIVE-5721: Incremental build is disabled by MCOMPILER-209 (Navis via Brock Noland) Changes for Build #538 [brock] HIVE-5354 - Decimal precision/scale support in ORC file (Xuefu Zhang via Brock Noland) [brock] HIVE-4523 - round() function with specified decimal places not consistent with mysql (Xuefu Zhang via Brock Noland) [thejas] HIVE-5542 : Webhcat is failing to run ddl command on a secure cluster (Sushanth Sowmyan via Thejas Nair) Changes for Build #539 [brock] HIVE-5730: Beeline throws non-terminal NPE upon starting, after mavenization (Szehon Ho reviewed by Navis) Changes for Build #540 [omalley] HIVE-5425 Provide a configuration option to control the default stripe size for ORC. (omalley reviewed by gunther) [omalley] Revert HIVE-5583 since it broke the build. 
[hashutosh] HIVE-5583 : Implement support for IN (list-of-constants) filter in vectorized mode (Eric Hanson via Ashutosh Chauhan) [brock] HIVE-5355 - JDBC support for decimal precision/scale Changes for Build #541 [hashutosh] HIVE-5583 : Implement support for IN (list-of-constants) filter in vectorized mode (Eric Hanson via Ashutosh Chauhan) [brock] HIVE-5773 - Fix build due to conflict between HIVE-5711 and HIVE-5713 [brock] HIVE-5711 - Fix eclipse:eclipse maven goal (Carl Steinbach via Brock Noland) [brock] HIVE-5752 - log4j properties appear to have been lost in maven upgrade (Sergey Shelukhin via Brock Noland) [brock] HIVE-5713 - Verify versions of libraries post maven merge (Brock Noland reviewed by Gunther Hagleitner) [brock] HIVE-5765 - Beeline throws NPE when -e option is used (Szehon Ho via Brock Noland) [xuefu] HIVE-5726: The DecimalTypeInfo instance associated with a decimal constant is not in line with the precision/scale of the constant (reviewed by Brock) [xuefu] HIVE-5655: Hive incorrecly handles divide-by-zero case (reviewed by Edward and Brock) [xuefu] HIVE-5191: Add char data type (Jason via Xuefu) Changes for Build #542 [brock] HIVE-5351 - Secure-Socket-Layer (SSL) support for HiveServer2 (Prasad Mujumdar via Brock Noland) Changes for Build #543 [brock] HIVE-5780 - Add the missing declaration of HIVE_CLI_SERVICE_PROTOCOL_V4 in TCLIService.thrift (Prasad Mujumdar via Brock Noland) Changes for Build #544 [gunther] HIVE-5601: NPE in ORC's PPD when using select * from table with where predicate (Prasanth J via Owen O'Malley and Gunther Hagleitner) [gunther] HIVE-5562: Provide stripe level column statistics in ORC (Patch by Prasanth J, reviewed by Owen O'Malley, committed by Gunther Hagleitner) [hashutosh] HIVE-3777 : add a property in the partition to figure out if stats are accurate (Ashutosh Chauhan via Thejas Nair) Changes for Build #545 [hashutosh] HIVE-5691 : Intermediate columns are incorrectly initialized for partitioned tables. 
(Jitendra Nath Pandey via Gunther Hagleitner) [hashutosh] HIVE-5779 : Subquery in where clause with distinct fails with mapjoin turned on with serialization error. (Ashutosh Chauhan via Harish Butani) [gunther] HIVE-5632 (partial): Adding test data to data/files to enable pre-commit tests to run. (Prasanth J via Gunther Hagleitner) Changes for Build #546 [cws] HIVE-5786: Remove HadoopShims methods that were needed for pre-Hadoop 0.20 (Jason Dere via cws) [thejas] HIVE-5229 : Better thread management for HiveServer2 async threads (Vaibhav Gumashta via Thejas Nair) [gunther] HIVE-5745: TestHiveLogging is failing (at least on mac) (Gunther Hagleitner, reviewed by Ashutosh Chauhan) Changes for Build #547 [hashutosh] HIVE-5699 : Add unit test for vectorized BETWEEN for timestamp inputs (Eric Hanson via Ashutosh Chauhan) [hashutosh] HIVE-5767 : in SemanticAnalyzer#doPhase1, handling for TOK_UNION falls thru into TOK_INSERT (Sergey Shelukhin via Ashutosh Chauhan) [hashutosh] HIVE-5657 : TopN produces incorrect results with count(distinct) (Sergey Shelukhin via Ashutosh Chauhan) Changes for Build
[jira] [Assigned] (HIVE-5758) Implement vectorized support for NOT IN filter
[ https://issues.apache.org/jira/browse/HIVE-5758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson reassigned HIVE-5758: - Assignee: Eric Hanson Implement vectorized support for NOT IN filter -- Key: HIVE-5758 URL: https://issues.apache.org/jira/browse/HIVE-5758 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Implement full, end-to-end support for NOT IN in vectorized mode, including new VectorExpression class(es), VectorizationContext translation to a VectorExpression, and unit tests for these, as well as end-to-end ad hoc testing. An end-to-end .q test is recommended. -- This message was sent by Atlassian JIRA (v6.1#6144)
Hive-trunk-h0.21 - Build # 2465 - Still Failing
Changes for Build #2434 [hashutosh] HIVE-3959 : Update Partition Statistics in Metastore Layer (Ashutosh Chauhan, Bhushan Mandhani, Gang Tim Liu via Thejas Nair) Changes for Build #2435 [hashutosh] HIVE-5503 : TopN optimization in VectorReduceSink (Sergey Shelukhin via Ashutosh Chauhan) [brock] HIVE-5695 - PTest2 fix shutdown, duplicate runs, and add client retry [brock] HIVE-5708 - PTest2 should trim long logs when posting to jira Changes for Build #2436 [thejas] HIVE-5547 : webhcat pig job submission should ship hive tar if -usehcatalog is specified (Eugene Koifman via Thejas Nair) [thejas] HIVE-5715 : HS2 should not start a session for every command (Gunther Hagleitner via Thejas Nair) Changes for Build #2437 Changes for Build #2438 [brock] HIVE-5740: Tar files should extract to the directory of the same name minus tar.gz (Brock Noland reviewed by Xuefu Zhang) [brock] HIVE-5611: Add assembly (i.e.) tar creation to pom (Szehon Ho via Brock Noland) [brock] HIVE-5707: Validate values for ConfVar (Navis via Brock Noland) [brock] HIVE-5721: Incremental build is disabled by MCOMPILER-209 (Navis via Brock Noland) Changes for Build #2439 [brock] HIVE-5354 - Decimal precision/scale support in ORC file (Xuefu Zhang via Brock Noland) [brock] HIVE-4523 - round() function with specified decimal places not consistent with mysql (Xuefu Zhang via Brock Noland) [thejas] HIVE-5542 : Webhcat is failing to run ddl command on a secure cluster (Sushanth Sowmyan via Thejas Nair) Changes for Build #2440 [brock] HIVE-5730: Beeline throws non-terminal NPE upon starting, after mavenization (Szehon Ho reviewed by Navis) Changes for Build #2441 [omalley] HIVE-5425 Provide a configuration option to control the default stripe size for ORC. (omalley reviewed by gunther) [omalley] Revert HIVE-5583 since it broke the build. 
[hashutosh] HIVE-5583 : Implement support for IN (list-of-constants) filter in vectorized mode (Eric Hanson via Ashutosh Chauhan) [brock] HIVE-5355 - JDBC support for decimal precision/scale Changes for Build #2443 [brock] HIVE-5351 - Secure-Socket-Layer (SSL) support for HiveServer2 (Prasad Mujumdar via Brock Noland) [hashutosh] HIVE-5583 : Implement support for IN (list-of-constants) filter in vectorized mode (Eric Hanson via Ashutosh Chauhan) [brock] HIVE-5773 - Fix build due to conflict between HIVE-5711 and HIVE-5713 [brock] HIVE-5711 - Fix eclipse:eclipse maven goal (Carl Steinbach via Brock Noland) [brock] HIVE-5752 - log4j properties appear to have been lost in maven upgrade (Sergey Shelukhin via Brock Noland) [brock] HIVE-5713 - Verify versions of libraries post maven merge (Brock Noland reviewed by Gunther Hagleitner) [brock] HIVE-5765 - Beeline throws NPE when -e option is used (Szehon Ho via Brock Noland) [xuefu] HIVE-5726: The DecimalTypeInfo instance associated with a decimal constant is not in line with the precision/scale of the constant (reviewed by Brock) [xuefu] HIVE-5655: Hive incorrecly handles divide-by-zero case (reviewed by Edward and Brock) [xuefu] HIVE-5191: Add char data type (Jason via Xuefu) Changes for Build #2444 [brock] HIVE-5780 - Add the missing declaration of HIVE_CLI_SERVICE_PROTOCOL_V4 in TCLIService.thrift (Prasad Mujumdar via Brock Noland) Changes for Build #2445 [gunther] HIVE-5601: NPE in ORC's PPD when using select * from table with where predicate (Prasanth J via Owen O'Malley and Gunther Hagleitner) [gunther] HIVE-5562: Provide stripe level column statistics in ORC (Patch by Prasanth J, reviewed by Owen O'Malley, committed by Gunther Hagleitner) [hashutosh] HIVE-3777 : add a property in the partition to figure out if stats are accurate (Ashutosh Chauhan via Thejas Nair) Changes for Build #2446 [hashutosh] HIVE-5691 : Intermediate columns are incorrectly initialized for partitioned tables. 
(Jitendra Nath Pandey via Gunther Hagleitner) [hashutosh] HIVE-5779 : Subquery in where clause with distinct fails with mapjoin turned on with serialization error. (Ashutosh Chauhan via Harish Butani) [gunther] HIVE-5632 (partial): Adding test data to data/files to enable pre-commit tests to run. (Prasanth J via Gunther Hagleitner) Changes for Build #2447 [cws] HIVE-5786: Remove HadoopShims methods that were needed for pre-Hadoop 0.20 (Jason Dere via cws) [thejas] HIVE-5229 : Better thread management for HiveServer2 async threads (Vaibhav Gumashta via Thejas Nair) [gunther] HIVE-5745: TestHiveLogging is failing (at least on mac) (Gunther Hagleitner, reviewed by Ashutosh Chauhan) Changes for Build #2448 [hashutosh] HIVE-5699 : Add unit test for vectorized BETWEEN for timestamp inputs (Eric Hanson via Ashutosh Chauhan) [hashutosh] HIVE-5767 : in SemanticAnalyzer#doPhase1, handling for TOK_UNION falls thru into TOK_INSERT (Sergey Shelukhin via Ashutosh Chauhan) [hashutosh] HIVE-5657 : TopN produces incorrect results with count(distinct) (Sergey Shelukhin via Ashutosh Chauhan) Changes for Build #2450
[jira] [Updated] (HIVE-5866) Hive divide operator generates wrong results in certain cases
[ https://issues.apache.org/jira/browse/HIVE-5866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-5866: -- Attachment: HIVE-5866.2.patch Patch #2 fixed the failed test cases. Hive divide operator generates wrong results in certain cases - Key: HIVE-5866 URL: https://issues.apache.org/jira/browse/HIVE-5866 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-5866.1.patch, HIVE-5866.2.patch, HIVE-5866.patch The current GenericUDFOPDivide seems to have a bug. The following query generates a NULL result. {code} hive> select 4BD / 25BD from test limit 1; ... Total MapReduce CPU Time Spent: 890 msec OK NULL Time taken: 7.901 seconds, Fetched: 1 row(s) {code} The correct result should be 0.16 in this query. -- This message was sent by Atlassian JIRA (v6.1#6144)
Review Request 15804: HIVE-5866: Hive divide operator generates wrong results in certain cases
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15804/ --- Review request for hive. Bugs: HIVE-5866 https://issues.apache.org/jira/browse/HIVE-5866 Repository: hive-git Description --- Fixed the problem. Added a unit test. Corrected the output of a few q tests. Diffs - ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseNumeric.java a1015e9 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPDivide.java 0b902e9 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPMinus.java 538c07e ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPMod.java 472e1dd ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPMultiply.java 2e8d364 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPPlus.java 35f639e ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFPosMod.java 6b18303 ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFOPDivide.java 581c1a8 ql/src/test/results/clientpositive/decimal_precision.q.out 2ee3578 ql/src/test/results/clientpositive/decimal_udf.q.out ed5bc65 ql/src/test/results/clientpositive/ql_rewrite_gbtoidx.q.out 83787ee ql/src/test/results/clientpositive/vectorization_5.q.out 54aad90 ql/src/test/results/clientpositive/vectorization_short_regress.q.out c9296e1 Diff: https://reviews.apache.org/r/15804/diff/ Testing --- Thanks, Xuefu Zhang
[jira] [Commented] (HIVE-5866) Hive divide operator generates wrong results in certain cases
[ https://issues.apache.org/jira/browse/HIVE-5866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830342#comment-13830342 ] Xuefu Zhang commented on HIVE-5866: --- An issue regarding UDAFs was identified and JIRA HIVE-5872 is logged. Hive divide operator generates wrong results in certain cases - Key: HIVE-5866 URL: https://issues.apache.org/jira/browse/HIVE-5866 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-5866.1.patch, HIVE-5866.2.patch, HIVE-5866.patch The current GenericUDFOPDivide seems to have a bug. The following query generates a NULL result. {code} hive> select 4BD / 25BD from test limit 1; ... Total MapReduce CPU Time Spent: 890 msec OK NULL Time taken: 7.901 seconds, Fetched: 1 row(s) {code} The correct result should be 0.16 in this query. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-4485) beeline prints null as empty strings
[ https://issues.apache.org/jira/browse/HIVE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830354#comment-13830354 ] Thejas M Nair commented on HIVE-4485: - bq. Hive 0.11.0 was released with the current behavior (nulls printed as NULL) [~cwsteinbach] I want to clarify that this is about beeline, and beeline currently prints nulls as empty strings. I agree that switching this would be a backward incompatible change. But I think it is important to distinguish between empty strings and null, since otherwise the two are indistinguishable in the output. I think this is a general problem - Hive has undesirable defaults in some cases and changing those would break backward compatibility. I think we should give the users (especially new users) the option of using the more sensible but backward incompatible configuration defaults. I will open a new jira for that. beeline prints null as empty strings Key: HIVE-4485 URL: https://issues.apache.org/jira/browse/HIVE-4485 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-4485.1.patch, HIVE-4485.2.patch beeline is printing nulls as empty strings. This is inconsistent with hive cli and other databases, which print null as the string NULL. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5810) create a function add_date as exists in mysql
[ https://issues.apache.org/jira/browse/HIVE-5810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830355#comment-13830355 ] Anandha L Ranganathan commented on HIVE-5810: - looking into that. create a function add_date as exists in mysql Key: HIVE-5810 URL: https://issues.apache.org/jira/browse/HIVE-5810 Project: Hive Issue Type: Improvement Reporter: Anandha L Ranganathan Assignee: Anandha L Ranganathan Attachments: HIVE-5810.patch Original Estimate: 40h Remaining Estimate: 40h MySQL has ADDDATE(date,INTERVAL expr unit). Similarly in Hive we can have add_date(date,unit,expr), where unit is DAY/Month/Year. For example, add_date('2013-11-09','DAY',2) will return 2013-11-11. add_date('2013-11-09','Month',2) will return 2014-01-09. add_date('2013-11-09','Year',2) will return 2015-11-09. -- This message was sent by Atlassian JIRA (v6.1#6144)
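The proposed add_date(date, unit, expr) semantics can be sketched as below. This is an illustration only, not the attached patch: the Python function merely borrows the proposed name, and clamping the day of month for short target months (for example Jan 31 plus one month) is an assumption, since the proposal does not say what should happen there.

```python
import calendar
from datetime import date, timedelta


def add_date(start, unit, amount):
    """Add `amount` DAY/MONTH/YEAR units to an ISO-formatted date string."""
    d = date.fromisoformat(start)
    unit = unit.upper()
    if unit == "DAY":
        return (d + timedelta(days=amount)).isoformat()
    if unit == "MONTH":
        months = d.month - 1 + amount
    elif unit == "YEAR":
        months = d.month - 1 + 12 * amount
    else:
        raise ValueError(f"unsupported unit: {unit}")
    year = d.year + months // 12
    month = months % 12 + 1
    # Assumption: clamp the day when the target month is shorter.
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day).isoformat()


print(add_date("2013-11-09", "DAY", 2))
print(add_date("2013-11-09", "Month", 2))
print(add_date("2013-11-09", "Year", 2))
```

Treating YEAR as twelve months keeps the month and year cases on one code path, so Feb 29 plus one year falls back to Feb 28 through the same clamp.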
[jira] [Updated] (HIVE-3815) hive table rename fails if filesystem cache is disabled
[ https://issues.apache.org/jira/browse/HIVE-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-3815: Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Thanks for the review Navis. Patch committed to trunk. hive table rename fails if filesystem cache is disabled --- Key: HIVE-3815 URL: https://issues.apache.org/jira/browse/HIVE-3815 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.13.0 Attachments: HIVE-3815.1.patch
If fs.<filesystem>.impl.disable.cache (e.g. fs.hdfs.impl.disable.cache) is set to true, then table rename fails. The exception that gets thrown (though not logged!) is
{quote}
Caused by: InvalidOperationException(message:table new location hdfs://host1:8020/apps/hive/warehouse/t2 is on a different file system than the old location hdfs://host1:8020/apps/hive/warehouse/t1. This operation is not supported)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_result$alter_table_resultStandardScheme.read(ThriftHiveMetastore.java:28825)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_result$alter_table_resultStandardScheme.read(ThriftHiveMetastore.java:28811)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_result.read(ThriftHiveMetastore.java:28753)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_alter_table(ThriftHiveMetastore.java:977)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.alter_table(ThriftHiveMetastore.java:962)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table(HiveMetaStoreClient.java:208)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:74)
at $Proxy7.alter_table(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:373)
... 18 more
{quote}
-- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5874) SubQuery: better error handling when SQ and Outer Query has the same table alias
Harish Butani created HIVE-5874: --- Summary: SubQuery: better error handling when SQ and Outer Query has the same table alias Key: HIVE-5874 URL: https://issues.apache.org/jira/browse/HIVE-5874 Project: Hive Issue Type: Bug Reporter: Harish Butani Priority: Minor The following query {noformat} select * from src where key in (select key from src where src.key '1') {noformat} gives the following message: {noformat} SemanticException [Error 10249]: Line 1:58 Unsupported SubQuery Expression ''1'': SubQuery expression refers to Outer query expressions only. {noformat} The user, however, is attempting to express an uncorrelated subquery. The ambiguity arises because we attempt to resolve references against the Outer Query first. This is an implementation detail; see the Sub Query spec for details. For now it is better to disallow such SubQueries. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5839) BytesRefArrayWritable compareTo violates contract
[ https://issues.apache.org/jira/browse/HIVE-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-5839: -- Attachment: HIVE-5839.patch Reloaded the same patch to rerun the test. BytesRefArrayWritable compareTo violates contract - Key: HIVE-5839 URL: https://issues.apache.org/jira/browse/HIVE-5839 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.11.0, 0.12.0 Reporter: Ian Robertson Assignee: Xuefu Zhang Attachments: HIVE-5839.patch, HIVE-5839.patch BytesRefArrayWritable's compareTo violates the compareTo contract from java.lang.Comparable. Specifically: * The implementor must ensure sgn(x.compareTo( y )) == -sgn(y.compareTo( x )) for all x and y. The compareTo implementation on BytesRefArrayWritable does a proper comparison of the sizes of the two instances. However, if the sizes are the same, it proceeds to check whether both arrays have the same contents. If not, it returns 1. This means that if x and y are two BytesRefArrayWritable instances with the same size, but different contents, then x.compareTo( y ) == 1 and y.compareTo( x ) == 1. Additionally, the comparison of contents is order agnostic. This seems wrong, since order of entries should matter. It is also very inefficient, running at O(n^2), where n is the number of entries. -- This message was sent by Atlassian JIRA (v6.1#6144)
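To make the contract violation in HIVE-5839 concrete, here is a minimal sketch of an order-sensitive, antisymmetric comparison that runs in time linear in the total number of bytes. It is not the attached patch and does not use Hive's classes: `ByteArrayList` and `compareBytes` are hypothetical names invented for illustration, and treating bytes as unsigned is an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a writable that holds an ordered list of byte arrays. */
final class ByteArrayList implements Comparable<ByteArrayList> {
    private final List<byte[]> entries;

    ByteArrayList(byte[]... entries) {
        this.entries = Arrays.asList(entries);
    }

    @Override
    public int compareTo(ByteArrayList other) {
        // Compare sizes first, as the report says the existing code already does.
        int bySize = Integer.compare(entries.size(), other.entries.size());
        if (bySize != 0) {
            return bySize;
        }
        // Same size: walk the entries in order; the first differing entry decides.
        for (int i = 0; i < entries.size(); i++) {
            int byEntry = compareBytes(entries.get(i), other.entries.get(i));
            if (byEntry != 0) {
                return byEntry;
            }
        }
        return 0;
    }

    /** Lexicographic comparison of two byte arrays, treating bytes as unsigned. */
    private static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    }
}
```

Because every step delegates to `Integer.compare`, swapping the operands flips the sign of the result, which is what sgn(x.compareTo(y)) == -sgn(y.compareTo(x)) requires.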
[jira] [Commented] (HIVE-5866) Hive divide operator generates wrong results in certain cases
[ https://issues.apache.org/jira/browse/HIVE-5866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830376#comment-13830376 ] Hive QA commented on HIVE-5866: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12615390/HIVE-5866.2.patch {color:green}SUCCESS:{color} +1 4681 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/401/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/401/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12615390 Hive divide operator generates wrong results in certain cases - Key: HIVE-5866 URL: https://issues.apache.org/jira/browse/HIVE-5866 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-5866.1.patch, HIVE-5866.2.patch, HIVE-5866.patch The current GenericUDFOPDivide seems to have a bug. The following query generates a NULL result. {code} hive> select 4BD / 25BD from test limit 1; ... Total MapReduce CPU Time Spent: 890 msec OK NULL Time taken: 7.901 seconds, Fetched: 1 row(s) {code} The correct result should be 0.16 in this query. -- This message was sent by Atlassian JIRA (v6.1#6144)
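The expected value in HIVE-5866 can be checked with plain `java.math.BigDecimal` arithmetic. The sketch below only illustrates the arithmetic; it is not the code of GenericUDFOPDivide, and the thread does not state what caused the NULL. The class name `DecimalDivideSketch`, the explicit `scale` parameter, and returning `null` for a zero divisor are all assumptions made for this example.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class DecimalDivideSketch {
    /**
     * Divides a by b, rounding half-up to the requested scale.
     * Returns null for a zero divisor (a choice made for this sketch).
     */
    static BigDecimal divide(BigDecimal a, BigDecimal b, int scale) {
        if (b.signum() == 0) {
            return null;
        }
        // Passing a scale and rounding mode avoids ArithmeticException
        // for quotients with a non-terminating decimal expansion.
        return a.divide(b, scale, RoundingMode.HALF_UP);
    }
}
```

With a scale of 2, `divide(new BigDecimal("4"), new BigDecimal("25"), 2)` compares equal to 0.16, the value the report expects.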
[jira] [Commented] (HIVE-5866) Hive divide operator generates wrong results in certain cases
[ https://issues.apache.org/jira/browse/HIVE-5866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830383#comment-13830383 ] Jason Dere commented on HIVE-5866: -- What exactly was causing the null result in this case? I'll try to take a look at the patch in a bit.
[jira] [Created] (HIVE-5875) task : collect list of hive configuration params whose default should change
Thejas M Nair created HIVE-5875: --- Summary: task : collect list of hive configuration params whose default should change Key: HIVE-5875 URL: https://issues.apache.org/jira/browse/HIVE-5875 Project: Hive Issue Type: Task Reporter: Thejas M Nair Assignee: Thejas M Nair The immediate motivation for this was the ticket HIVE-4485. Beeline prints NULLs as empty strings. This is not a desirable behavior, but if we fix it, it breaks backward compatibility. Still, we should not be burdening all users with mistakes of the past, especially the users who are new to Hive. As Hadoop and Hive adoption increases, the proportion of 'new' users will continue to increase. We need a way to let users choose between backward-compatible behavior and more sensible behavior. How this is implemented can be discussed in a separate jira. The purpose of this *Task* jira is just to collect a list of config flags whose current default is not the desirable one. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5875) task : collect list of hive configuration params whose default should change
[ https://issues.apache.org/jira/browse/HIVE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830393#comment-13830393 ] Edward Capriolo commented on HIVE-5875: --- hive.mapred.mode=strict hive.cli.print.header=true autocreate.schema=false
[jira] [Commented] (HIVE-5875) task : collect list of hive configuration params whose default should change
[ https://issues.apache.org/jira/browse/HIVE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830405#comment-13830405 ] Thejas M Nair commented on HIVE-5875: - hive.enforce.bucketing = true . (When somebody creates a table with bucketing , I don't see any reason why they won't want bucketing to be on by default).
[jira] [Commented] (HIVE-5839) BytesRefArrayWritable compareTo violates contract
[ https://issues.apache.org/jira/browse/HIVE-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830416#comment-13830416 ] Hive QA commented on HIVE-5839: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12615394/HIVE-5839.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 4652 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.ql.io.TestRCFile.testWriteAndPartialRead {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/402/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/402/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12615394
[jira] [Commented] (HIVE-5817) column name to index mapping in VectorizationContext is broken
[ https://issues.apache.org/jira/browse/HIVE-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830430#comment-13830430 ] Ashutosh Chauhan commented on HIVE-5817: [~ehans] I tried your query. With that I can repro this on a 1-node cluster, but when I use this query in .q file and run mvn test, test actually passes. Any idea why? column name to index mapping in VectorizationContext is broken -- Key: HIVE-5817 URL: https://issues.apache.org/jira/browse/HIVE-5817 Project: Hive Issue Type: Bug Components: Vectorization Reporter: Sergey Shelukhin Assignee: Remus Rusanu Priority: Critical Attachments: HIVE-5817-uniquecols.broken.patch, HIVE-5817.00-broken.patch Columns coming from different operators may have the same internal names (_colNN). There exists a query in the form {{select b.cb, a.ca from a JOIN b ON ... JOIN x ON ...;}} (distilled from a more complex query), which runs ok w/o vectorization. With vectorization, it will run ok for most ca, but for some ca it will fail (or can probably return incorrect results). That is because when building column-to-VRG-index map in VectorizationContext, internal column name for ca that the first map join operator adds to the mapping may be the same as internal name for cb that the 2nd one tries to add. 2nd VMJ doesn't add it (see code in ctor), and when it's time for it to output stuff, it retrieves wrong index from the map by name, and then wrong vector from VRG. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5833) Remove versions from child module dependencies
[ https://issues.apache.org/jira/browse/HIVE-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830442#comment-13830442 ] Hive QA commented on HIVE-5833: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12614944/HIVE-5833.2.patch {color:green}SUCCESS:{color} +1 4680 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/403/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/403/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12614944 Remove versions from child module dependencies -- Key: HIVE-5833 URL: https://issues.apache.org/jira/browse/HIVE-5833 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Attachments: HIVE-5833.2.patch, HIVE-5833.patch HIVE-5741 moved all dependencies to the plugin management section of the parent pom therefore we can remove {noformat}<version>${dep.version}</version>{noformat} from all dependencies in child modules. -- This message was sent by Atlassian JIRA (v6.1#6144)
Re: Review Request 15718: HIVE-5614: Subquery support: allow subquery expressions in having clause
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15718/
---
(Updated Nov. 23, 2013, 12:14 a.m.)
Review request for hive and Ashutosh Chauhan.
Changes
---
Thanks for the feedback. attempted to address all the feedback.
Bugs: HIVE-5614 https://issues.apache.org/jira/browse/HIVE-5614
Repository: hive-git
Description
---
support for subquery predicates in having clause. SubTask of HIVE-784
Diffs (updated)
-
ql/src/java/org/apache/hadoop/hive/ql/parse/QB.java fa111cc
ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java 3e8215d
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 7979873
ql/src/test/queries/clientpositive/subquery_exists_having.q PRE-CREATION
ql/src/test/queries/clientpositive/subquery_in_having.q PRE-CREATION
ql/src/test/queries/clientpositive/subquery_notexists_having.q PRE-CREATION
ql/src/test/queries/clientpositive/subquery_notin_having.q PRE-CREATION
ql/src/test/results/clientpositive/subquery_exists_having.q.out PRE-CREATION
ql/src/test/results/clientpositive/subquery_in_having.q.out PRE-CREATION
ql/src/test/results/clientpositive/subquery_multiinsert.q.out 8dfb485
ql/src/test/results/clientpositive/subquery_notexists_having.q.out PRE-CREATION
ql/src/test/results/clientpositive/subquery_notin_having.q.out PRE-CREATION
Diff: https://reviews.apache.org/r/15718/diff/
Testing
---
added new tests: subquery_in_having.q, subquery_notin_having.q, subquery_exists_having.q, subquery_notexists_having.q
Thanks, Harish Butani
[jira] [Updated] (HIVE-5614) Subquery support: allow subquery expressions in having clause
[ https://issues.apache.org/jira/browse/HIVE-5614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-5614: Attachment: HIVE-5614.4.patch Subquery support: allow subquery expressions in having clause - Key: HIVE-5614 URL: https://issues.apache.org/jira/browse/HIVE-5614 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-5614.1.patch, HIVE-5614.2.patch, HIVE-5614.3.patch, HIVE-5614.4.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
Re: Review Request 15718: HIVE-5614: Subquery support: allow subquery expressions in having clause
> On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
> ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java, line 131
> https://reviews.apache.org/r/15718/diff/1/?file=388900#file388900line131
> It will be good to add a comment about the various fields in the Conjunct class.

done

> On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
> ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java, line 138
> https://reviews.apache.org/r/15718/diff/1/?file=388900#file388900line138
> Can this constructor be package protected instead of public?

done

> On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
> ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java, line 203
> https://reviews.apache.org/r/15718/diff/1/?file=388900#file388900line203
> It will be good to add a comment on how the behavior of ConjunctAnalyzer changes when forHavingClause = true instead of false.

done

> On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
> ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java, line 298
> https://reviews.apache.org/r/15718/diff/1/?file=388900#file388900line298
> Does this exception need to be propagated up the stack? At the least, we should have a LOG.warn() message here.

this is not an error. This is only a check if the expression is in the OuterQuery RR

> On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
> ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java, line 1916
> https://reviews.apache.org/r/15718/diff/1/?file=388901#file388901line1916
> It will be good to add a comment here along the lines of: there could be a subq in the having clause; if so, we need to generate the subq plan followed by a semi-join.

done

> On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
> ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java, line 1926
> https://reviews.apache.org/r/15718/diff/1/?file=388901#file388901line1926
> It will be good to add a comment on how this new boolean changes the behavior of this method.

done

> On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
> ql/src/test/queries/clientpositive/subquery_in_having.q, line 25
> https://reviews.apache.org/r/15718/diff/1/?file=388903#file388903line25
> It will be good to add a test which has a subq in both the where clause and the having clause.

done

> On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
> ql/src/test/queries/clientpositive/subquery_in_having.q, line 63
> https://reviews.apache.org/r/15718/diff/1/?file=388903#file388903line63
> Same comment w.r.t. map-join on. Also, if we support the over clause in subq, it will be good to have a test for that.

done

> On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
> ql/src/test/queries/clientpositive/subquery_notexists_having.q, line 46
> https://reviews.apache.org/r/15718/diff/1/?file=388904#file388904line46
> It will be good to add a negative test where the subq and the outer query both use the same table alias. It seems in such cases we may generate incorrect results, so we should disable those.

this is not just for having. Added a jira HIVE-5874 to give a better error for this case.

> On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
> ql/src/test/results/clientpositive/subquery_in_having.q.out, line 57
> https://reviews.apache.org/r/15718/diff/1/?file=388907#file388907line57
> In this plan, we are first computing outq, then subq, and then doing a left semi-join on the result sets of those two. As we discussed, an efficient way to do this is to push filter conditions in subq to the outer query to cut down the output generated by outq. Though, I am not sure whether it's better to do it in the optimizer phase via a Transformer or right here. Either way, I think that's an optimization which we can do as a follow-up.

agreed

> On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
> ql/src/test/results/clientpositive/subquery_notexists_having.q.out, line 149
> https://reviews.apache.org/r/15718/diff/1/?file=388909#file388909line149
> The first expression in this filter is redundant. That's not strictly required. However, since there is active work going on for constant folding optimization, this may get optimized away via that optimization. Either way, this can be done in a follow-up.

(1=1) is added as a placeholder. Yes it should be removed.

- Harish
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15718/#review29298
---
On Nov. 23, 2013, 12:14 a.m., Harish Butani wrote:
> ---
> This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15718/
> ---
> (Updated Nov. 23, 2013, 12:14 a.m.)
> Review request for hive and Ashutosh Chauhan.
> Bugs: HIVE-5614
[jira] [Updated] (HIVE-5614) Subquery support: allow subquery expressions in having clause
[ https://issues.apache.org/jira/browse/HIVE-5614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-5614: Status: Patch Available (was: Open) Subquery support: allow subquery expressions in having clause - Key: HIVE-5614 URL: https://issues.apache.org/jira/browse/HIVE-5614 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-5614.1.patch, HIVE-5614.2.patch, HIVE-5614.3.patch, HIVE-5614.4.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5614) Subquery support: allow subquery expressions in having clause
[ https://issues.apache.org/jira/browse/HIVE-5614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830446#comment-13830446 ] Harish Butani commented on HIVE-5614: - update based on feedback from [~ashutoshc]. Subquery support: allow subquery expressions in having clause - Key: HIVE-5614 URL: https://issues.apache.org/jira/browse/HIVE-5614 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-5614.1.patch, HIVE-5614.2.patch, HIVE-5614.3.patch, HIVE-5614.4.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5876) Split elimination in ORC breaks for partitioned tables
Prasanth J created HIVE-5876: Summary: Split elimination in ORC breaks for partitioned tables Key: HIVE-5876 URL: https://issues.apache.org/jira/browse/HIVE-5876 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J HIVE-5632 eliminates ORC stripes that do not satisfy the SARG condition from split computation. A SARG expression can also refer to partition columns, but a partition column will not be contained in the column names list in the ORC file. This was causing an ArrayIndexOutOfBoundsException in the split elimination logic when used with partitioned tables. The fix is to ignore evaluation of partition column expressions in split elimination. -- This message was sent by Atlassian JIRA (v6.1#6144)
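The failure mode described in HIVE-5876 can be sketched in a few lines: looking up a partition column in the file's column list finds nothing, and using the resulting index blindly goes out of bounds. The sketch below is an illustration under assumptions, not the ORC reader's code or the attached patch; `SargColumnFilterSketch`, `mapToFileColumns`, and `evaluableColumns` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class SargColumnFilterSketch {
    /**
     * Returns the file-column index for each predicate column, or -1 when the
     * column is not stored in the file (for example, a partition column).
     */
    static int[] mapToFileColumns(List<String> predicateColumns, List<String> fileColumns) {
        int[] indexes = new int[predicateColumns.size()];
        for (int i = 0; i < indexes.length; i++) {
            indexes[i] = fileColumns.indexOf(predicateColumns.get(i)); // -1 if absent
        }
        return indexes;
    }

    /** Keeps only the predicate columns that can be evaluated against file statistics. */
    static List<String> evaluableColumns(List<String> predicateColumns, List<String> fileColumns) {
        int[] indexes = mapToFileColumns(predicateColumns, fileColumns);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < indexes.length; i++) {
            if (indexes[i] >= 0) { // skip partition columns instead of indexing with -1
                result.add(predicateColumns.get(i));
            }
        }
        return result;
    }
}
```

Skipping a column this way means that column's predicate simply does not participate in split elimination; rows are still filtered later by normal predicate evaluation.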
[jira] [Created] (HIVE-5877) Implement vectorized support for IN as boolean-valued expression
Eric Hanson created HIVE-5877: - Summary: Implement vectorized support for IN as boolean-valued expression Key: HIVE-5877 URL: https://issues.apache.org/jira/browse/HIVE-5877 Project: Hive Issue Type: Sub-task Components: Query Processor Reporter: Eric Hanson Implement support for IN as a Boolean-valued expression, e.g. select col1 IN (1, 2, 3) from T; or select col1 from T where NOT (col1 IN (1, 2, 3)); This will also automatically add support for NOT IN because NOT IN is automatically transformed into NOT ( ... IN ... ) by the parser. -- This message was sent by Atlassian JIRA (v6.1#6144)
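As a rough picture of what a Boolean-valued IN means in HIVE-5877, the sketch below evaluates IN over a whole column at once and produces one boolean per row, so that NOT IN falls out as a negation of that result. It is plain Java over arrays, not Hive's VectorExpression API; the class and method names are hypothetical, and NULL handling is deliberately left out.

```java
import java.util.HashSet;
import java.util.Set;

final class InExpressionSketch {
    /** Evaluates "col IN (values)" for every row and returns one boolean per row. */
    static boolean[] in(long[] column, long... values) {
        Set<Long> lookup = new HashSet<>();
        for (long v : values) {
            lookup.add(v);
        }
        boolean[] out = new boolean[column.length];
        for (int i = 0; i < column.length; i++) {
            out[i] = lookup.contains(column[i]);
        }
        return out;
    }

    /** Element-wise NOT, so NOT IN can be expressed as not(in(...)). */
    static boolean[] not(boolean[] input) {
        boolean[] out = new boolean[input.length];
        for (int i = 0; i < input.length; i++) {
            out[i] = !input[i];
        }
        return out;
    }
}
```

For a column holding 1, 5, 3, `in(column, 1, 2, 3)` gives true, false, true, and wrapping it in `not(...)` gives the NOT IN result.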
[jira] [Commented] (HIVE-5758) Implement vectorized support for NOT IN filter
[ https://issues.apache.org/jira/browse/HIVE-5758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830452#comment-13830452 ] Eric Hanson commented on HIVE-5758: --- It turns out that the parser transforms col NOT IN (list) to NOT (col IN (list)). So when support for IN as a Boolean expression is added, this should just work. Implement vectorized support for NOT IN filter -- Key: HIVE-5758 URL: https://issues.apache.org/jira/browse/HIVE-5758 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Implement full, end-to-end support for NOT IN in vectorized mode, including new VectorExpression class(es), VectorizationContext translation to a VectorExpression, and unit tests for these, as well as end-to-end ad hoc testing. An end-to-end .q test is recommended. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5876) Split elimination in ORC breaks for partitioned tables
[ https://issues.apache.org/jira/browse/HIVE-5876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-5876: - Attachment: HIVE-5876.1.patch Split elimination in ORC breaks for partitioned tables -- Key: HIVE-5876 URL: https://issues.apache.org/jira/browse/HIVE-5876 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-5876.1.patch HIVE-5632 eliminates ORC stripes that do not satisfy the SARG condition from split computation. A SARG expression can also refer to partition columns, but a partition column will not be contained in the column names list in the ORC file. This was causing an ArrayIndexOutOfBoundsException in the split elimination logic when used with partitioned tables. The fix is to ignore evaluation of partition column expressions in split elimination. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5817) column name to index mapping in VectorizationContext is broken
[ https://issues.apache.org/jira/browse/HIVE-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830459#comment-13830459 ] Ashutosh Chauhan commented on HIVE-5817: Aah.. I missed one config which is required to repro in .q file {{ set hive.auto.convert.join=true; }} Thanks [~ehans] for test-case.
[jira] [Commented] (HIVE-5817) column name to index mapping in VectorizationContext is broken
[ https://issues.apache.org/jira/browse/HIVE-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830465#comment-13830465 ] Ashutosh Chauhan commented on HIVE-5817: [~rusanu] Another approach I am mulling over is to prepend the table alias to the column name in that map. That way, keys in the map will be unique across different tables, so they won't collide. The change required is that all the callers also need to prepend the table alias, but since all of them have an ExprNodeDesc, which carries the table alias, this should work out fine. What do you think?
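The alias-prefix idea floated for HIVE-5817 can be sketched as a map keyed by "alias.internalName" instead of the bare internal name, so that `_col0` from one table no longer shadows `_col0` from another. This is only an illustration of the proposal; it is not VectorizationContext, and `ColumnIndexMapSketch` with its `add` and `indexOf` methods is a hypothetical construct.

```java
import java.util.HashMap;
import java.util.Map;

final class ColumnIndexMapSketch {
    private final Map<String, Integer> indexByName = new HashMap<>();
    private int nextIndex = 0;

    /** Registers a column under "alias.internalName" and returns its index. */
    int add(String tableAlias, String internalName) {
        String key = tableAlias + "." + internalName;
        Integer existing = indexByName.get(key);
        if (existing != null) {
            return existing; // same table and name: reuse the index
        }
        indexByName.put(key, nextIndex);
        return nextIndex++;
    }

    /** Looks up a column; callers must pass the same alias they registered with. */
    int indexOf(String tableAlias, String internalName) {
        Integer index = indexByName.get(tableAlias + "." + internalName);
        if (index == null) {
            throw new IllegalArgumentException("unknown column: " + tableAlias + "." + internalName);
        }
        return index;
    }
}
```

With bare names as keys, the second `_col0` would be dropped and lookups would return the first column's index; with qualified keys, `a._col0` and `b._col0` get distinct indexes.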
[jira] [Commented] (HIVE-5614) Subquery support: allow subquery expressions in having clause
[ https://issues.apache.org/jira/browse/HIVE-5614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830471#comment-13830471 ] Ashutosh Chauhan commented on HIVE-5614: +1 Subquery support: allow subquery expressions in having clause - Key: HIVE-5614 URL: https://issues.apache.org/jira/browse/HIVE-5614 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-5614.1.patch, HIVE-5614.2.patch, HIVE-5614.3.patch, HIVE-5614.4.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5849) Improve the stats of operators based on heuristics in the absence of any column statistics
[ https://issues.apache.org/jira/browse/HIVE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-5849: - Attachment: (was: HIVE-5849.7.patch) Improve the stats of operators based on heuristics in the absence of any column statistics -- Key: HIVE-5849 URL: https://issues.apache.org/jira/browse/HIVE-5849 Project: Hive Issue Type: Sub-task Components: Query Processor, Statistics Reporter: Prasanth J Assignee: Prasanth J Fix For: 0.13.0 Attachments: HIVE-5849.1.patch.txt, HIVE-5849.2.patch.txt, HIVE-5849.3.patch, HIVE-5849.3.patch.txt, HIVE-5849.4.javaonly.patch, HIVE-5849.5.patch, HIVE-5849.6.patch In the absence of any column statistics, operators will simply use the statistics from their parents. It is useful to apply some heuristics to update basic statistics (number of rows and data size) in the absence of any column statistics. This will be the worst-case scenario. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5849) Improve the stats of operators based on heuristics in the absence of any column statistics
[ https://issues.apache.org/jira/browse/HIVE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-5849: - Attachment: HIVE-5849.7.patch Reuploading patch for the precommit test to pick up. Improve the stats of operators based on heuristics in the absence of any column statistics -- Key: HIVE-5849 URL: https://issues.apache.org/jira/browse/HIVE-5849 Project: Hive Issue Type: Sub-task Components: Query Processor, Statistics Reporter: Prasanth J Assignee: Prasanth J Fix For: 0.13.0 Attachments: HIVE-5849.1.patch.txt, HIVE-5849.2.patch.txt, HIVE-5849.3.patch, HIVE-5849.3.patch.txt, HIVE-5849.4.javaonly.patch, HIVE-5849.5.patch, HIVE-5849.6.patch, HIVE-5849.7.patch In the absence of any column statistics, operators will simply use the statistics from their parents. It is useful to apply some heuristics to update basic statistics (number of rows and data size) in the absence of any column statistics. This will be the worst-case scenario. -- This message was sent by Atlassian JIRA (v6.1#6144)
Review Request 15816: HIVE-3181 getDatabaseMajor/Minor version does not return values
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15816/
---

Review request for hive.

Bugs: HIVE-3181
    https://issues.apache.org/jira/browse/HIVE-3181

Repository: hive-git

Description
---
This will parse the database version to determine the major and minor versions.

Diffs
-
  itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcDriver2.java 7b1c9da
  jdbc/src/java/org/apache/hive/jdbc/HiveDatabaseMetaData.java c447d44
  jdbc/src/java/org/apache/hive/jdbc/Utils.java 4d75d98

Diff: https://reviews.apache.org/r/15816/diff/

Testing
---

Thanks,
Szehon Ho
[jira] [Commented] (HIVE-4518) Counter Strike: Operation Operator
[ https://issues.apache.org/jira/browse/HIVE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830497#comment-13830497 ] Hive QA commented on HIVE-4518: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12615372/HIVE-4518.11.patch {color:green}SUCCESS:{color} +1 4679 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/404/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/404/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12615372 Counter Strike: Operation Operator -- Key: HIVE-4518 URL: https://issues.apache.org/jira/browse/HIVE-4518 Project: Hive Issue Type: Improvement Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-4518.1.patch, HIVE-4518.10.patch, HIVE-4518.11.patch, HIVE-4518.2.patch, HIVE-4518.3.patch, HIVE-4518.4.patch, HIVE-4518.5.patch, HIVE-4518.6.patch.txt, HIVE-4518.7.patch, HIVE-4518.8.patch, HIVE-4518.9.patch Queries of the form: from foo insert overwrite table bar partition (p) select ... insert overwrite table bar partition (p) select ... insert overwrite table bar partition (p) select ... Generate a huge amount of counters. The reason is that task.progress is turned on for dynamic partitioning queries. The counters not only make queries slower than necessary (up to 50%) you will also eventually run out. That's because we're wrapping them in enum values to comply with hadoop 0.17. The real reason we turn task.progress on is that we need CREATED_FILES and FATAL counters to ensure dynamic partitioning queries don't go haywire. 
The counters have counter-intuitive names like C1 through C1000 and are not very useful by themselves. With Hadoop 0.20+ you no longer need to wrap the counters; each operator can simply create and increment its own counters. That should simplify the code a lot. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-3181) getDatabaseMajor/Minor version does not return values
[ https://issues.apache.org/jira/browse/HIVE-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-3181: Status: Open (was: Patch Available) getDatabaseMajor/Minor version does not return values - Key: HIVE-3181 URL: https://issues.apache.org/jira/browse/HIVE-3181 Project: Hive Issue Type: Improvement Components: JDBC Affects Versions: 0.8.1 Reporter: N Campbell Assignee: Szehon Ho Attachments: HIVE-3181.patch This is really a sub-issue of HIVE-3174 (which is a lot of properties) but given that the driver will return databaseProductVersion it makes no sense to not have implemented these as well. -- This message was sent by Atlassian JIRA (v6.1#6144)
Re: Review Request 15816: HIVE-3181 getDatabaseMajor/Minor version does not return values
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15816/
---

(Updated Nov. 23, 2013, 1:40 a.m.)

Review request for hive.

Changes
---
Made a small optimization to lazily cache the db version number. This reduces the number of RPC calls to the server when more than one getXXVersion() method is called.

Bugs: HIVE-3181
    https://issues.apache.org/jira/browse/HIVE-3181

Repository: hive-git

Description
---
This will parse the database version to determine the major and minor versions.

Diffs (updated)
-
  itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcDriver2.java 7b1c9da
  jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java ef39573
  jdbc/src/java/org/apache/hive/jdbc/HiveDatabaseMetaData.java c447d44
  jdbc/src/java/org/apache/hive/jdbc/Utils.java 4d75d98

Diff: https://reviews.apache.org/r/15816/diff/

Testing
---

Thanks,
Szehon Ho
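The lazy caching described in this update can be sketched roughly as follows. This is not the code from the patch: the class name, the `Supplier` standing in for the server RPC, and the `fetchCount` field are all hypothetical, and the version-string format ("major.minor.patch" with an optional suffix) is an assumption.

```java
import java.util.function.Supplier;

// Hypothetical sketch: fetch the version string once, then parse major/minor from the cache.
class LazyVersionSketch {
    private final Supplier<String> versionFetcher; // stands in for the RPC to the server
    private String cachedVersion;                  // null until the first lookup
    int fetchCount = 0;                            // counts simulated RPC calls

    LazyVersionSketch(Supplier<String> versionFetcher) {
        this.versionFetcher = versionFetcher;
    }

    private synchronized String version() {
        if (cachedVersion == null) {
            cachedVersion = versionFetcher.get();
            fetchCount++;
        }
        return cachedVersion;
    }

    int getDatabaseMajorVersion() { return part(0); }

    int getDatabaseMinorVersion() { return part(1); }

    // Returns the numeric prefix of the index-th dot-separated component, or 0 if absent.
    private int part(int index) {
        String[] parts = version().split("\\.");
        if (index >= parts.length) {
            return 0;
        }
        String digits = parts[index].replaceAll("[^0-9].*$", ""); // "13-SNAPSHOT" -> "13"
        return digits.isEmpty() ? 0 : Integer.parseInt(digits);
    }

    public static void main(String[] args) {
        LazyVersionSketch meta = new LazyVersionSketch(() -> "0.13.0-SNAPSHOT");
        System.out.println(meta.getDatabaseMajorVersion() + "." + meta.getDatabaseMinorVersion());
        System.out.println("simulated RPC calls: " + meta.fetchCount);
    }
}
```

Both getters go through the same cached string, so calling them in any order costs a single simulated round trip.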
[jira] [Updated] (HIVE-3181) getDatabaseMajor/Minor version does not return values
[ https://issues.apache.org/jira/browse/HIVE-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-3181: Attachment: HIVE-3181.2.patch Adding a small optimization to this logic to reduce potential number of RPC calls. getDatabaseMajor/Minor version does not return values - Key: HIVE-3181 URL: https://issues.apache.org/jira/browse/HIVE-3181 Project: Hive Issue Type: Improvement Components: JDBC Affects Versions: 0.8.1 Reporter: N Campbell Assignee: Szehon Ho Attachments: HIVE-3181.2.patch, HIVE-3181.patch This is really a sub-issue of HIVE-3174 (which is a lot of properties) but given that the driver will return databaseProductVersion it makes no sense to not have implemented these as well. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-3181) getDatabaseMajor/Minor version does not return values
[ https://issues.apache.org/jira/browse/HIVE-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-3181: Status: Patch Available (was: Open) getDatabaseMajor/Minor version does not return values - Key: HIVE-3181 URL: https://issues.apache.org/jira/browse/HIVE-3181 Project: Hive Issue Type: Improvement Components: JDBC Affects Versions: 0.8.1 Reporter: N Campbell Assignee: Szehon Ho Attachments: HIVE-3181.2.patch, HIVE-3181.patch This is really a sub-issue of HIVE-3174 (which is a lot of properties) but given that the driver will return databaseProductVersion it makes no sense to not have implemented these as well. -- This message was sent by Atlassian JIRA (v6.1#6144)
Re: Review Request 15804: HIVE-5866: Hive divide operator generates wrong results in certain cases
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15804/#review29330
---

ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPDivide.java
https://reviews.apache.org/r/15804/#comment56517

So I think the issue here is that when (integer_prec + scale) > max_precision, we prioritize keeping the scale at the expense of the integer portion of the result type. Looks like the SQL Server precision/scale rules mention that it does not let the scale eat into the integer portion of the result type - it goes the other way and will reduce the scale to allow the total precision to fit within max_precision. This might be a better rule to follow than prioritizing the scale value, at least for the purposes of determining the return type.

- Jason Dere

On Nov. 22, 2013, 9:42 p.m., Xuefu Zhang wrote:

---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15804/
---

(Updated Nov. 22, 2013, 9:42 p.m.)

Review request for hive.

Bugs: HIVE-5866
    https://issues.apache.org/jira/browse/HIVE-5866

Repository: hive-git

Description
---
Fixed the problem. Added a unit test. Corrected the output of a few q tests.
Diffs - ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseNumeric.java a1015e9 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPDivide.java 0b902e9 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPMinus.java 538c07e ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPMod.java 472e1dd ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPMultiply.java 2e8d364 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPPlus.java 35f639e ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFPosMod.java 6b18303 ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFOPDivide.java 581c1a8 ql/src/test/results/clientpositive/decimal_precision.q.out 2ee3578 ql/src/test/results/clientpositive/decimal_udf.q.out ed5bc65 ql/src/test/results/clientpositive/ql_rewrite_gbtoidx.q.out 83787ee ql/src/test/results/clientpositive/vectorization_5.q.out 54aad90 ql/src/test/results/clientpositive/vectorization_short_regress.q.out c9296e1 Diff: https://reviews.apache.org/r/15804/diff/ Testing --- Thanks, Xuefu Zhang
[jira] [Commented] (HIVE-5866) Hive divide operator generates wrong results in certain cases
[ https://issues.apache.org/jira/browse/HIVE-5866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830511#comment-13830511 ]

Jason Dere commented on HIVE-5866:
--
Ok, I understand the issue now - the result type was decimal(7,6), which leaves only 1 integer digit, so trying to cast both operands to decimal(7,6) would result in null for 24. Changes look good, left a comment on RB.

Hive divide operator generates wrong results in certain cases
-
Key: HIVE-5866
URL: https://issues.apache.org/jira/browse/HIVE-5866
Project: Hive
Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
Attachments: HIVE-5866.1.patch, HIVE-5866.2.patch, HIVE-5866.patch

The current GenericUDFOPDivide seems to have a bug. The following query generates a NULL result.
{code}
hive> select 4BD / 25BD from test limit 1;
...
Total MapReduce CPU Time Spent: 890 msec
OK
NULL
Time taken: 7.901 seconds, Fetched: 1 row(s)
{code}
The correct result should be 0.16 in this query.

-- This message was sent by Atlassian JIRA (v6.1#6144)
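The overflow described in this comment can be reproduced with plain `java.math.BigDecimal`. The sketch below is only an illustration, not Hive's implementation: `enforce` is a hypothetical helper that mimics a cast to decimal(precision, scale) returning null on overflow, and it uses the operands 4 and 25 from the query in the issue description.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical sketch of why casting an operand to decimal(7,6) can yield null.
class DecimalFitSketch {
    // Rescales value to `scale`; returns null when the integer part needs more
    // than (precision - scale) digits, mimicking a cast that overflows.
    static BigDecimal enforce(BigDecimal value, int precision, int scale) {
        BigDecimal scaled = value.setScale(scale, RoundingMode.HALF_UP);
        int integerDigits = scaled.precision() - scaled.scale();
        return integerDigits > precision - scale ? null : scaled;
    }

    public static void main(String[] args) {
        // decimal(7,6) leaves room for exactly one integer digit.
        System.out.println(enforce(new BigDecimal("4"), 7, 6));   // fits
        System.out.println(enforce(new BigDecimal("25"), 7, 6));  // two integer digits: null
        // Dividing the unconverted operands gives the expected quotient.
        System.out.println(new BigDecimal("4").divide(new BigDecimal("25")));
    }
}
```

If either operand becomes null after such a cast, the division result is null as well, which matches the NULL shown in the report.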
[jira] [Commented] (HIVE-5849) Improve the stats of operators based on heuristics in the absence of any column statistics
[ https://issues.apache.org/jira/browse/HIVE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830520#comment-13830520 ] Hive QA commented on HIVE-5849: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12615428/HIVE-5849.7.patch {color:green}SUCCESS:{color} +1 4680 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/406/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/406/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12615428 Improve the stats of operators based on heuristics in the absence of any column statistics -- Key: HIVE-5849 URL: https://issues.apache.org/jira/browse/HIVE-5849 Project: Hive Issue Type: Sub-task Components: Query Processor, Statistics Reporter: Prasanth J Assignee: Prasanth J Fix For: 0.13.0 Attachments: HIVE-5849.1.patch.txt, HIVE-5849.2.patch.txt, HIVE-5849.3.patch, HIVE-5849.3.patch.txt, HIVE-5849.4.javaonly.patch, HIVE-5849.5.patch, HIVE-5849.6.patch, HIVE-5849.7.patch In the absence of any column statistics, operators will simply use the statistics from its parents. It is useful to apply some heuristics to update basic statistics (number of rows and data size) in the absence of any column statistics. This will be worst case scenario. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5870) Move TestJDBCDriver2.testNewConnectionConfiguration to TestJDBCWithMiniHS2
[ https://issues.apache.org/jira/browse/HIVE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830532#comment-13830532 ] Hive QA commented on HIVE-5870: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12615375/HIVE-5870.patch {color:green}SUCCESS:{color} +1 4680 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/407/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/407/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12615375 Move TestJDBCDriver2.testNewConnectionConfiguration to TestJDBCWithMiniHS2 -- Key: HIVE-5870 URL: https://issues.apache.org/jira/browse/HIVE-5870 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.13.0 Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-5870.patch TestJDBCDriver2.testNewConnectionConfiguration() attempts to start a Hiveserver2 instance in the test. This can cause issues as creating HiveServer2 needs correct environment/path. This test should be moved to TestJdbcWithMiniHS2, which uses MiniHS2. MiniHS2 is for this purpose (setting all the environment properly before starting HiveServer2 instance). -- This message was sent by Atlassian JIRA (v6.1#6144)
Hive-trunk-hadoop2 - Build # 565 - Still Failing
Changes for Build #535 [thejas] HIVE-5547 : webhcat pig job submission should ship hive tar if -usehcatalog is specified (Eugene Koifman via Thejas Nair) [thejas] HIVE-5715 : HS2 should not start a session for every command (Gunther Hagleitner via Thejas Nair) Changes for Build #536 Changes for Build #537 [brock] HIVE-5740: Tar files should extract to the directory of the same name minus tar.gz (Brock Noland reviewed by Xuefu Zhang) [brock] HIVE-5611: Add assembly (i.e.) tar creation to pom (Szehon Ho via Brock Noland) [brock] HIVE-5707: Validate values for ConfVar (Navis via Brock Noland) [brock] HIVE-5721: Incremental build is disabled by MCOMPILER-209 (Navis via Brock Noland) Changes for Build #538 [brock] HIVE-5354 - Decimal precision/scale support in ORC file (Xuefu Zhang via Brock Noland) [brock] HIVE-4523 - round() function with specified decimal places not consistent with mysql (Xuefu Zhang via Brock Noland) [thejas] HIVE-5542 : Webhcat is failing to run ddl command on a secure cluster (Sushanth Sowmyan via Thejas Nair) Changes for Build #539 [brock] HIVE-5730: Beeline throws non-terminal NPE upon starting, after mavenization (Szehon Ho reviewed by Navis) Changes for Build #540 [omalley] HIVE-5425 Provide a configuration option to control the default stripe size for ORC. (omalley reviewed by gunther) [omalley] Revert HIVE-5583 since it broke the build. 
[hashutosh] HIVE-5583 : Implement support for IN (list-of-constants) filter in vectorized mode (Eric Hanson via Ashutosh Chauhan) [brock] HIVE-5355 - JDBC support for decimal precision/scale Changes for Build #541 [hashutosh] HIVE-5583 : Implement support for IN (list-of-constants) filter in vectorized mode (Eric Hanson via Ashutosh Chauhan) [brock] HIVE-5773 - Fix build due to conflict between HIVE-5711 and HIVE-5713 [brock] HIVE-5711 - Fix eclipse:eclipse maven goal (Carl Steinbach via Brock Noland) [brock] HIVE-5752 - log4j properties appear to have been lost in maven upgrade (Sergey Shelukhin via Brock Noland) [brock] HIVE-5713 - Verify versions of libraries post maven merge (Brock Noland reviewed by Gunther Hagleitner) [brock] HIVE-5765 - Beeline throws NPE when -e option is used (Szehon Ho via Brock Noland) [xuefu] HIVE-5726: The DecimalTypeInfo instance associated with a decimal constant is not in line with the precision/scale of the constant (reviewed by Brock) [xuefu] HIVE-5655: Hive incorrecly handles divide-by-zero case (reviewed by Edward and Brock) [xuefu] HIVE-5191: Add char data type (Jason via Xuefu) Changes for Build #542 [brock] HIVE-5351 - Secure-Socket-Layer (SSL) support for HiveServer2 (Prasad Mujumdar via Brock Noland) Changes for Build #543 [brock] HIVE-5780 - Add the missing declaration of HIVE_CLI_SERVICE_PROTOCOL_V4 in TCLIService.thrift (Prasad Mujumdar via Brock Noland) Changes for Build #544 [gunther] HIVE-5601: NPE in ORC's PPD when using select * from table with where predicate (Prasanth J via Owen O'Malley and Gunther Hagleitner) [gunther] HIVE-5562: Provide stripe level column statistics in ORC (Patch by Prasanth J, reviewed by Owen O'Malley, committed by Gunther Hagleitner) [hashutosh] HIVE-3777 : add a property in the partition to figure out if stats are accurate (Ashutosh Chauhan via Thejas Nair) Changes for Build #545 [hashutosh] HIVE-5691 : Intermediate columns are incorrectly initialized for partitioned tables. 
(Jitendra Nath Pandey via Gunther Hagleitner) [hashutosh] HIVE-5779 : Subquery in where clause with distinct fails with mapjoin turned on with serialization error. (Ashutosh Chauhan via Harish Butani) [gunther] HIVE-5632 (partial): Adding test data to data/files to enable pre-commit tests to run. (Prasanth J via Gunther Hagleitner) Changes for Build #546 [cws] HIVE-5786: Remove HadoopShims methods that were needed for pre-Hadoop 0.20 (Jason Dere via cws) [thejas] HIVE-5229 : Better thread management for HiveServer2 async threads (Vaibhav Gumashta via Thejas Nair) [gunther] HIVE-5745: TestHiveLogging is failing (at least on mac) (Gunther Hagleitner, reviewed by Ashutosh Chauhan) Changes for Build #547 [hashutosh] HIVE-5699 : Add unit test for vectorized BETWEEN for timestamp inputs (Eric Hanson via Ashutosh Chauhan) [hashutosh] HIVE-5767 : in SemanticAnalyzer#doPhase1, handling for TOK_UNION falls thru into TOK_INSERT (Sergey Shelukhin via Ashutosh Chauhan) [hashutosh] HIVE-5657 : TopN produces incorrect results with count(distinct) (Sergey Shelukhin via Ashutosh Chauhan) Changes for Build #549 [hashutosh] HIVE-5753 : Remove collector from Operator base class (Mohammad Islam via Ashutosh Chauhan) [hashutosh] HIVE-5737 : Provide StructObjectInspector for UDTFs rather than ObjectInspect[] (Navis via Ashutosh Chauhan) [hashutosh] HIVE-5790 : maven test build failure shows wrong error message (Mohammad Islam via Ashutosh Chauhan) [hashutosh] HIVE-5722 : Skip generating vectorization code if possible (Navis via Brock
Hive-trunk-h0.21 - Build # 2466 - Still Failing
Changes for Build #2436 [thejas] HIVE-5547 : webhcat pig job submission should ship hive tar if -usehcatalog is specified (Eugene Koifman via Thejas Nair) [thejas] HIVE-5715 : HS2 should not start a session for every command (Gunther Hagleitner via Thejas Nair) Changes for Build #2437 Changes for Build #2438 [brock] HIVE-5740: Tar files should extract to the directory of the same name minus tar.gz (Brock Noland reviewed by Xuefu Zhang) [brock] HIVE-5611: Add assembly (i.e.) tar creation to pom (Szehon Ho via Brock Noland) [brock] HIVE-5707: Validate values for ConfVar (Navis via Brock Noland) [brock] HIVE-5721: Incremental build is disabled by MCOMPILER-209 (Navis via Brock Noland) Changes for Build #2439 [brock] HIVE-5354 - Decimal precision/scale support in ORC file (Xuefu Zhang via Brock Noland) [brock] HIVE-4523 - round() function with specified decimal places not consistent with mysql (Xuefu Zhang via Brock Noland) [thejas] HIVE-5542 : Webhcat is failing to run ddl command on a secure cluster (Sushanth Sowmyan via Thejas Nair) Changes for Build #2440 [brock] HIVE-5730: Beeline throws non-terminal NPE upon starting, after mavenization (Szehon Ho reviewed by Navis) Changes for Build #2441 [omalley] HIVE-5425 Provide a configuration option to control the default stripe size for ORC. (omalley reviewed by gunther) [omalley] Revert HIVE-5583 since it broke the build. 
[hashutosh] HIVE-5583 : Implement support for IN (list-of-constants) filter in vectorized mode (Eric Hanson via Ashutosh Chauhan) [brock] HIVE-5355 - JDBC support for decimal precision/scale Changes for Build #2443 [brock] HIVE-5351 - Secure-Socket-Layer (SSL) support for HiveServer2 (Prasad Mujumdar via Brock Noland) [hashutosh] HIVE-5583 : Implement support for IN (list-of-constants) filter in vectorized mode (Eric Hanson via Ashutosh Chauhan) [brock] HIVE-5773 - Fix build due to conflict between HIVE-5711 and HIVE-5713 [brock] HIVE-5711 - Fix eclipse:eclipse maven goal (Carl Steinbach via Brock Noland) [brock] HIVE-5752 - log4j properties appear to have been lost in maven upgrade (Sergey Shelukhin via Brock Noland) [brock] HIVE-5713 - Verify versions of libraries post maven merge (Brock Noland reviewed by Gunther Hagleitner) [brock] HIVE-5765 - Beeline throws NPE when -e option is used (Szehon Ho via Brock Noland) [xuefu] HIVE-5726: The DecimalTypeInfo instance associated with a decimal constant is not in line with the precision/scale of the constant (reviewed by Brock) [xuefu] HIVE-5655: Hive incorrecly handles divide-by-zero case (reviewed by Edward and Brock) [xuefu] HIVE-5191: Add char data type (Jason via Xuefu) Changes for Build #2444 [brock] HIVE-5780 - Add the missing declaration of HIVE_CLI_SERVICE_PROTOCOL_V4 in TCLIService.thrift (Prasad Mujumdar via Brock Noland) Changes for Build #2445 [gunther] HIVE-5601: NPE in ORC's PPD when using select * from table with where predicate (Prasanth J via Owen O'Malley and Gunther Hagleitner) [gunther] HIVE-5562: Provide stripe level column statistics in ORC (Patch by Prasanth J, reviewed by Owen O'Malley, committed by Gunther Hagleitner) [hashutosh] HIVE-3777 : add a property in the partition to figure out if stats are accurate (Ashutosh Chauhan via Thejas Nair) Changes for Build #2446 [hashutosh] HIVE-5691 : Intermediate columns are incorrectly initialized for partitioned tables. 
(Jitendra Nath Pandey via Gunther Hagleitner) [hashutosh] HIVE-5779 : Subquery in where clause with distinct fails with mapjoin turned on with serialization error. (Ashutosh Chauhan via Harish Butani) [gunther] HIVE-5632 (partial): Adding test data to data/files to enable pre-commit tests to run. (Prasanth J via Gunther Hagleitner) Changes for Build #2447 [cws] HIVE-5786: Remove HadoopShims methods that were needed for pre-Hadoop 0.20 (Jason Dere via cws) [thejas] HIVE-5229 : Better thread management for HiveServer2 async threads (Vaibhav Gumashta via Thejas Nair) [gunther] HIVE-5745: TestHiveLogging is failing (at least on mac) (Gunther Hagleitner, reviewed by Ashutosh Chauhan) Changes for Build #2448 [hashutosh] HIVE-5699 : Add unit test for vectorized BETWEEN for timestamp inputs (Eric Hanson via Ashutosh Chauhan) [hashutosh] HIVE-5767 : in SemanticAnalyzer#doPhase1, handling for TOK_UNION falls thru into TOK_INSERT (Sergey Shelukhin via Ashutosh Chauhan) [hashutosh] HIVE-5657 : TopN produces incorrect results with count(distinct) (Sergey Shelukhin via Ashutosh Chauhan) Changes for Build #2450 [hashutosh] HIVE-5683 : JDBC support for char (Jason Dere via Xuefu Zhang) [hashutosh] HIVE-5626 : enable metastore direct SQL for drop/similar queries (Sergey Shelukhin via Ashutosh Chauhan) [hashutosh] HIVE-5700 : enforce single date format for partition column storage (Sergey Shelukhin via Ashutosh Chauhan) [hashutosh] HIVE-5753 : Remove collector from Operator base class (Mohammad Islam via Ashutosh Chauhan) [hashutosh] HIVE-5737 :
[jira] [Updated] (HIVE-5876) Split elimination in ORC breaks for partitioned tables
[ https://issues.apache.org/jira/browse/HIVE-5876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth J updated HIVE-5876:
-
Status: Patch Available (was: Open)

Split elimination in ORC breaks for partitioned tables
--
Key: HIVE-5876
URL: https://issues.apache.org/jira/browse/HIVE-5876
Project: Hive
Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Prasanth J
Assignee: Prasanth J
Labels: orcfile
Attachments: HIVE-5876.1.patch

HIVE-5632 eliminates ORC stripes that do not satisfy the SARG condition from split computation. A SARG expression can also refer to partition columns, but a partition column is not contained in the column names list of the ORC file. This was causing an ArrayIndexOutOfBoundsException in the split elimination logic when used with partitioned tables. The fix is to skip evaluation of partition column expressions in split elimination.

-- This message was sent by Atlassian JIRA (v6.1#6144)
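The failure mode and the fix described above can be illustrated with a small stand-alone sketch. Nothing here is ORC's actual API: the method name, the per-column min/max arrays, and the equality predicate are hypothetical simplifications. The point is only that a column missing from the file's column list must be skipped rather than used as an array index.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of stripe elimination that tolerates partition columns.
class SplitEliminationSketch {
    // Returns true when the stripe provably contains no row with column == literal.
    static boolean canEliminate(List<String> fileColumns, long[] min, long[] max,
                                String column, long literal) {
        int idx = fileColumns.indexOf(column);
        if (idx < 0) {
            // Partition column: not stored in the file, so there are no stripe
            // statistics to test against. Keep the stripe instead of indexing with -1.
            return false;
        }
        return literal < min[idx] || literal > max[idx];
    }

    public static void main(String[] args) {
        List<String> columns = Arrays.asList("id", "amount");
        long[] min = {1, 10};
        long[] max = {100, 50};
        System.out.println(canEliminate(columns, min, max, "id", 500));      // outside [1, 100]
        System.out.println(canEliminate(columns, min, max, "ds", 20131122)); // partition column
    }
}
```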
[jira] [Updated] (HIVE-5839) BytesRefArrayWritable compareTo violates contract
[ https://issues.apache.org/jira/browse/HIVE-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated HIVE-5839:
--
Attachment: HIVE-5839.1.patch

Patch #1 fixed the test failure and added a new test case.

BytesRefArrayWritable compareTo violates contract
-
Key: HIVE-5839
URL: https://issues.apache.org/jira/browse/HIVE-5839
Project: Hive
Issue Type: Bug
Components: Serializers/Deserializers
Affects Versions: 0.11.0, 0.12.0
Reporter: Ian Robertson
Assignee: Xuefu Zhang
Attachments: HIVE-5839.1.patch, HIVE-5839.patch, HIVE-5839.patch

BytesRefArrayWritable's compareTo violates the compareTo contract from java.lang.Comparable. Specifically:
* The implementor must ensure sgn(x.compareTo( y )) == -sgn(y.compareTo( x )) for all x and y.

The compareTo implementation on BytesRefArrayWritable does a proper comparison of the sizes of the two instances. However, if the sizes are the same, it proceeds to check whether both arrays have the same content. If not, it returns 1. This means that if x and y are two BytesRefArrayWritable instances with the same size but different contents, then x.compareTo( y ) == 1 and y.compareTo( x ) == 1.

Additionally, the comparison of contents is order agnostic. This seems wrong, since the order of entries should matter. It is also very inefficient, running in O(n^2), where n is the number of entries.

-- This message was sent by Atlassian JIRA (v6.1#6144)
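One way to satisfy the contract quoted above is an order-sensitive, entry-by-entry comparison. The sketch below is not the attached patch: it uses plain `byte[][]` in place of BytesRefArrayWritable, and the class and method names are hypothetical. It compares sizes first, then compares entries in order, which is antisymmetric and linear in the number of entries.

```java
// Hypothetical sketch of a contract-respecting comparison over arrays of byte arrays.
class OrderedBytesComparison {
    // Compares by entry count first, then entry by entry in order.
    static int compare(byte[][] x, byte[][] y) {
        int sizeCmp = Integer.compare(x.length, y.length);
        if (sizeCmp != 0) {
            return sizeCmp;
        }
        for (int i = 0; i < x.length; i++) {
            int c = compareBytes(x[i], y[i]);
            if (c != 0) {
                return c;
            }
        }
        return 0;
    }

    // Lexicographic comparison of unsigned bytes; a shorter prefix sorts first.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int c = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        byte[][] x = {{1}, {2}};
        byte[][] y = {{2}, {1}};
        // Same size, different contents: the two results now have opposite signs.
        System.out.println(Integer.signum(compare(x, y)) + " " + Integer.signum(compare(y, x)));
    }
}
```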
Review Request 15820: HIVE-5839: BytesRefArrayWritable compareTo violates contract
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15820/
---

Review request for hive.

Bugs: HIVE-5839
    https://issues.apache.org/jira/browse/HIVE-5839

Repository: hive-git

Description
---
Modified compareTo to comply with the contract.

Diffs
-
  ql/src/test/org/apache/hadoop/hive/ql/io/TestRCFile.java d2b06ec
  serde/src/java/org/apache/hadoop/hive/serde2/columnar/BytesRefArrayWritable.java 712064e
  serde/src/test/org/apache/hadoop/hive/serde2/columnar/TestBytesRefArrayWritable.java PRE-CREATION

Diff: https://reviews.apache.org/r/15820/diff/

Testing
---
Added a new test case. Fixed an old test case.

Thanks,
Xuefu Zhang
Re: [ANNOUNCE] New Hive Committers - Jitendra Nath Pandey and Eric Hanson
Congrats guys!

On Fri, Nov 22, 2013 at 11:54 PM, Vikram Dixit vik...@hortonworks.com wrote:
Congrats to both of you!

On Fri, Nov 22, 2013 at 9:34 AM, Jason Dere jd...@hortonworks.com wrote:
Congrats!

On Nov 22, 2013, at 2:25 AM, Biswajit Nayak biswajit.na...@inmobi.com wrote:
Congrats to both of you..

On Fri, Nov 22, 2013 at 1:26 PM, Lefty Leverenz leftylever...@gmail.com wrote:
Congratulations, Jitendra and Eric! The more the merrier.
-- Lefty

On Thu, Nov 21, 2013 at 6:31 PM, Jarek Jarcec Cecho jar...@apache.org wrote:
Congratulations, good job!
Jarcec

On Thu, Nov 21, 2013 at 03:29:07PM -0800, Carl Steinbach wrote:
The Apache Hive PMC has voted to make Jitendra Nath Pandey and Eric Hanson committers on the Apache Hive project. Please join me in congratulating Jitendra and Eric!
Thanks.
Carl

_
The information contained in this communication is intended solely for the use of the individual or entity to whom it is addressed and others authorized to receive it. It may contain confidential or legally privileged information. If you are not the intended recipient you are hereby notified that any disclosure, copying, distribution or taking any action in reliance on the contents of this information is strictly prohibited and may be unlawful. If you have received this communication in error, please notify us immediately by responding to this email and then delete it from your system. The firm is neither liable for the proper and complete transmission of the information contained in this communication nor for any delay in its receipt.

--
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited.
If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
[jira] [Commented] (HIVE-5230) Better error reporting by async threads in HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830570#comment-13830570 ]

Vaibhav Gumashta commented on HIVE-5230:

[~cwsteinbach] [~thejas] If there is any more feedback that you have, I can look into it. Thanks!

Better error reporting by async threads in HiveServer2
--
Key: HIVE-5230
URL: https://issues.apache.org/jira/browse/HIVE-5230
Project: Hive
Issue Type: Improvement
Components: HiveServer2
Affects Versions: 0.12.0, 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
Fix For: 0.13.0
Attachments: HIVE-5230.1.patch, HIVE-5230.1.patch, HIVE-5230.2.patch, HIVE-5230.3.patch, HIVE-5230.4.patch, HIVE-5230.6.patch, HIVE-5230.7.patch, HIVE-5230.8.patch

[HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support for async execution in HS2. When a background thread gets an error, currently the client can only poll for the operation state, and the error with its stacktrace is logged. However, it would be useful to provide a richer error response, as the Thrift API does with TStatus (which is constructed while building a Thrift response object).

-- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5217) Add long polling to asynchronous execution in HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830573#comment-13830573 ]

Vaibhav Gumashta commented on HIVE-5217:

[~cwsteinbach] Look forward to your feedback on this one. Thanks!

Add long polling to asynchronous execution in HiveServer2
-
Key: HIVE-5217
URL: https://issues.apache.org/jira/browse/HIVE-5217
Project: Hive
Issue Type: Improvement
Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
Fix For: 0.13.0
Attachments: HIVE-5217.2.patch, HIVE-5217.3.patch, HIVE-5217.4.patch, HIVE-5217.D12801.2.patch, HIVE-5217.D12801.3.patch, HIVE-5217.D12801.4.patch, HIVE-5217.D12801.5.patch, HIVE-5217.D12801.6.patch

[HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support for async execution in HS2. The client gets an operation handle which it can poll to check on the operation status. However, the polling frequency is left entirely to the client, which can be resource inefficient. Long polling will solve this by blocking the client request to check the operation status for a configurable amount of time (a new HS2 config) if the data is not available, but responding immediately if the data is available.

-- This message was sent by Atlassian JIRA (v6.1#6144)
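The long-polling behaviour described above (block up to a configured timeout, answer immediately once the operation has completed) can be sketched with the standard `java.util.concurrent` classes. This is an illustration under assumptions rather than the HS2 implementation: the class name, the status strings, and the use of a `Future` to represent the background operation are all hypothetical.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch of a status check that long-polls on a background operation.
class LongPollSketch {
    // Waits at most timeoutMs for the operation to finish, then reports its state.
    static String pollStatus(Future<?> operation, long timeoutMs) {
        try {
            operation.get(timeoutMs, TimeUnit.MILLISECONDS);
            return "FINISHED";                  // completed within the wait window
        } catch (TimeoutException e) {
            return "RUNNING";                   // still running after the timeout
        } catch (ExecutionException e) {
            return "ERROR";                     // the background work failed
        } catch (CancellationException e) {
            return "CANCELED";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return "RUNNING";
        }
    }

    public static void main(String[] args) {
        CompletableFuture<String> pending = new CompletableFuture<>();
        System.out.println(pollStatus(pending, 50));   // nothing completes it: times out
        pending.complete("rows ready");
        System.out.println(pollStatus(pending, 50));   // already done: returns at once
    }
}
```

Compared with a client-side loop that polls at a fixed interval, the server-side wait means a finished operation is reported without delay while an unfinished one costs one request per timeout window.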
Re: Review Request 15804: HIVE-5866: Hive divide operator generates wrong results in certain cases
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15804/#review29332
---

ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPDivide.java
https://reviews.apache.org/r/15804/#comment56518

I don't think I follow. Could you give an example of how the implementation here differs from SQL Server's?

The fractional part of a decimal number is as important as the integer part in applications where the decimal type is required. Otherwise, double might be better. Thus, a number of a given decimal type needs to comply with that type's precision/scale. I don't think we should store the number 456.78 in a decimal(6,4), a point on which we have already concluded the discussion.

- Xuefu Zhang

On Nov. 22, 2013, 9:42 p.m., Xuefu Zhang wrote:

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15804/
---

(Updated Nov. 22, 2013, 9:42 p.m.)

Review request for hive.

Bugs: HIVE-5866
https://issues.apache.org/jira/browse/HIVE-5866

Repository: hive-git

Description
---
Fixed the problem. Added a unit test. Corrected the output of a few q tests.

Diffs
---
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseNumeric.java a1015e9
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPDivide.java 0b902e9
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPMinus.java 538c07e
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPMod.java 472e1dd
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPMultiply.java 2e8d364
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPPlus.java 35f639e
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFPosMod.java 6b18303
ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFOPDivide.java 581c1a8
ql/src/test/results/clientpositive/decimal_precision.q.out 2ee3578
ql/src/test/results/clientpositive/decimal_udf.q.out ed5bc65
ql/src/test/results/clientpositive/ql_rewrite_gbtoidx.q.out 83787ee
ql/src/test/results/clientpositive/vectorization_5.q.out 54aad90
ql/src/test/results/clientpositive/vectorization_short_regress.q.out c9296e1

Diff: https://reviews.apache.org/r/15804/diff/

Testing
---

Thanks,
Xuefu Zhang
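To make the precision/scale point in the review comment concrete, here is a small sketch of checking whether a value fits a decimal(precision, scale) type without losing digits. It assumes the usual SQL reading (at most `scale` fractional digits and `precision - scale` integer digits); the class and method names are hypothetical, and it does not claim to reproduce Hive's own enforcement logic.

```java
import java.math.BigDecimal;

/** Hypothetical helper illustrating decimal(precision, scale) fit; not Hive code. */
final class DecimalFitSketch {
    /** True if value can be held by decimal(precision, scale) without dropping digits. */
    static boolean fits(BigDecimal value, int precision, int scale) {
        if (value.signum() == 0) {
            return true; // zero fits any decimal type
        }
        BigDecimal v = value.stripTrailingZeros();
        // Digits after the decimal point (a negative scale means none).
        int fractionDigits = Math.max(v.scale(), 0);
        // Digits before the decimal point.
        int integerDigits = Math.max(v.precision() - v.scale(), 0);
        return fractionDigits <= scale && integerDigits <= precision - scale;
    }
}
```

Under this reading, 456.78 has three integer digits, while decimal(6,4) leaves room for only two, so `fits(new BigDecimal("456.78"), 6, 4)` is false, whereas the same value fits decimal(5,2).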
[jira] [Commented] (HIVE-5614) Subquery support: allow subquery expressions in having clause
[ https://issues.apache.org/jira/browse/HIVE-5614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830583#comment-13830583 ]

Hive QA commented on HIVE-5614:

{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12615417/HIVE-5614.4.patch

{color:green}SUCCESS:{color} +1 4684 tests passed

Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/410/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/410/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12615417

Subquery support: allow subquery expressions in having clause

Key: HIVE-5614
URL: https://issues.apache.org/jira/browse/HIVE-5614
Project: Hive
Issue Type: New Feature
Components: Query Processor
Reporter: Harish Butani
Assignee: Harish Butani
Attachments: HIVE-5614.1.patch, HIVE-5614.2.patch, HIVE-5614.3.patch, HIVE-5614.4.patch

-- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5839) BytesRefArrayWritable compareTo violates contract
[ https://issues.apache.org/jira/browse/HIVE-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830598#comment-13830598 ]

Hive QA commented on HIVE-5839:

{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12615447/HIVE-5839.1.patch

{color:green}SUCCESS:{color} +1 4681 tests passed

Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/411/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/411/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12615447

BytesRefArrayWritable compareTo violates contract

Key: HIVE-5839
URL: https://issues.apache.org/jira/browse/HIVE-5839
Project: Hive
Issue Type: Bug
Components: Serializers/Deserializers
Affects Versions: 0.11.0, 0.12.0
Reporter: Ian Robertson
Assignee: Xuefu Zhang
Attachments: HIVE-5839.1.patch, HIVE-5839.patch, HIVE-5839.patch

BytesRefArrayWritable's compareTo violates the compareTo contract from java.lang.Comparable. Specifically:
* The implementor must ensure sgn(x.compareTo(y)) == -sgn(y.compareTo(x)) for all x and y.

The compareTo implementation on BytesRefArrayWritable does a proper comparison of the sizes of the two instances. However, if the sizes are the same, it proceeds to check whether both arrays have the same contents. If not, it returns 1. This means that if x and y are two BytesRefArrayWritable instances with the same size but different contents, then x.compareTo(y) == 1 and y.compareTo(x) == 1.

Additionally, the comparison of contents is order agnostic. This seems wrong, since the order of entries should matter. It is also very inefficient, running in O(n^2), where n is the number of entries.
-- This message was sent by Atlassian JIRA (v6.1#6144)
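For reference, a compareTo that satisfies the antisymmetry rule quoted in HIVE-5839 can be sketched as below. `ByteArrayList` is a hypothetical stand-in for an array-of-byte-arrays type, not Hive's BytesRefArrayWritable, and this is not the attached patch; it only illustrates comparing sizes first and then entries position by position, treating bytes as unsigned (an assumption).

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for an ordered list of byte arrays; not Hive's class. */
final class ByteArrayList implements Comparable<ByteArrayList> {
    private final List<byte[]> entries;

    ByteArrayList(byte[]... entries) {
        this.entries = Arrays.asList(entries);
    }

    @Override
    public int compareTo(ByteArrayList other) {
        // Compare sizes first, as the issue says the existing code already does.
        int bySize = Integer.compare(entries.size(), other.entries.size());
        if (bySize != 0) {
            return bySize;
        }
        // Same size: compare entry by entry, in order, in a single pass.
        for (int i = 0; i < entries.size(); i++) {
            int c = compareBytes(entries.get(i), other.entries.get(i));
            if (c != 0) {
                return c;
            }
        }
        return 0;
    }

    /** Lexicographic comparison of two byte arrays, treating bytes as unsigned. */
    private static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int c = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(a.length, b.length);
    }
}
```

Because every step is a symmetric comparison, swapping the operands flips the sign of the result, and the order of entries now affects the outcome.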
[jira] [Commented] (HIVE-3181) getDatabaseMajor/Minor version does not return values
[ https://issues.apache.org/jira/browse/HIVE-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830610#comment-13830610 ]

Hive QA commented on HIVE-3181:

{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12615431/HIVE-3181.2.patch

{color:green}SUCCESS:{color} +1 4680 tests passed

Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/413/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/413/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12615431

getDatabaseMajor/Minor version does not return values

Key: HIVE-3181
URL: https://issues.apache.org/jira/browse/HIVE-3181
Project: Hive
Issue Type: Improvement
Components: JDBC
Affects Versions: 0.8.1
Reporter: N Campbell
Assignee: Szehon Ho
Attachments: HIVE-3181.2.patch, HIVE-3181.patch

This is really a sub-issue of HIVE-3174 (which covers a lot of properties), but given that the driver already returns databaseProductVersion, it makes no sense not to implement these as well.

-- This message was sent by Atlassian JIRA (v6.1#6144)
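As an illustration of the point in HIVE-3181, a driver that already reports a product version string could derive the values for `getDatabaseMajorVersion()` and `getDatabaseMinorVersion()` (both defined on `java.sql.DatabaseMetaData`) from it. The helper below is a hypothetical sketch assuming a dotted version string such as "0.13.0"; it is not necessarily how the attached patch works.

```java
/** Hypothetical helper; assumes a dotted version string such as "0.13.0". */
final class VersionPartsSketch {
    /** Returns {major, minor} parsed from the first two dotted components. */
    static int[] majorMinor(String productVersion) {
        String[] parts = productVersion.trim().split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("expected major.minor in: " + productVersion);
        }
        return new int[] { leadingInt(parts[0]), leadingInt(parts[1]) };
    }

    /** Parses the leading run of digits, ignoring suffixes such as "-SNAPSHOT". */
    private static int leadingInt(String s) {
        int end = 0;
        while (end < s.length() && Character.isDigit(s.charAt(end))) {
            end++;
        }
        if (end == 0) {
            throw new IllegalArgumentException("no leading digits in: " + s);
        }
        return Integer.parseInt(s.substring(0, end));
    }
}
```

A metadata implementation could then return the first element as the major version and the second as the minor version.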