Hive-trunk-hadoop2 - Build # 563 - Still Failing

2013-11-22 Thread Apache Jenkins Server
Changes for Build #531
[thejas] HIVE-5483 : use metastore statistics to optimize max/min/etc. queries 
(Ashutosh Chauhan via Thejas Nair)

[daijy] HIVE-5510: [WebHCat] GET job/queue return wrong job information

[brock] HIVE-5610 - Merge maven branch into trunk (delete ant)

[brock] HIVE-5610 - Merge maven branch into trunk (maven rollforward)

[brock] HIVE-5610 - Merge maven branch into trunk (patch)

[hashutosh] HIVE-5693 : Rewrite some tests to reduce test time (Navis via 
Ashutosh Chauhan)

[hashutosh] HIVE-5582 : Implement BETWEEN filter in vectorized mode (Eric 
Hanson via Ashutosh Chauhan)

[hashutosh] HIVE-5556 : Pushdown join conditions (Harish Butani via Ashutosh 
Chauhan)


Changes for Build #532
[brock] HIVE-5716 - Fix broken tests after maven merge (1) (Brock Noland 
reviewed by Thejas M Nair and Ashutosh Chauhan)


Changes for Build #533
[hashutosh] HIVE-3959 : Update Partition Statistics in Metastore Layer 
(Ashutosh Chauhan, Bhushan Mandhani, Gang Tim Liu via Thejas Nair)


Changes for Build #534
[hashutosh] HIVE-5503 : TopN optimization in VectorReduceSink (Sergey Shelukhin 
via Ashutosh Chauhan)

[brock] HIVE-5695 - PTest2 fix shutdown, duplicate runs, and add client retry

[brock] HIVE-5708 - PTest2 should trim long logs when posting to jira


Changes for Build #535
[thejas] HIVE-5547 : webhcat pig job submission should ship hive tar if 
-usehcatalog is specified (Eugene Koifman via Thejas Nair)

[thejas] HIVE-5715 : HS2 should not start a session for every command 
(Gunther Hagleitner via Thejas Nair)


Changes for Build #536

Changes for Build #537
[brock] HIVE-5740: Tar files should extract to the directory of the same name 
minus tar.gz (Brock Noland reviewed by Xuefu Zhang)

[brock] HIVE-5611: Add assembly (i.e.) tar creation to pom (Szehon Ho via Brock 
Noland)

[brock] HIVE-5707: Validate values for ConfVar (Navis via Brock Noland)

[brock] HIVE-5721: Incremental build is disabled by MCOMPILER-209 (Navis via 
Brock Noland)


Changes for Build #538
[brock] HIVE-5354 - Decimal precision/scale support in ORC file (Xuefu Zhang 
via Brock Noland)

[brock] HIVE-4523 - round() function with specified decimal places not 
consistent with mysql (Xuefu Zhang via Brock Noland)

[thejas] HIVE-5542 : Webhcat is failing to run ddl command on a secure cluster 
(Sushanth Sowmyan via Thejas Nair)


Changes for Build #539
[brock] HIVE-5730: Beeline throws non-terminal NPE upon starting, after 
mavenization (Szehon Ho reviewed by Navis)


Changes for Build #540
[omalley] HIVE-5425 Provide a configuration option to control the default stripe
size for ORC. (omalley reviewed by gunther)

[omalley] Revert HIVE-5583 since it broke the build.

[hashutosh] HIVE-5583 : Implement support for IN (list-of-constants) filter in 
vectorized mode (Eric Hanson via Ashutosh Chauhan)

[brock] HIVE-5355 - JDBC support for decimal precision/scale


Changes for Build #541
[hashutosh] HIVE-5583 : Implement support for IN (list-of-constants) filter in 
vectorized mode (Eric Hanson via Ashutosh Chauhan)

[brock] HIVE-5773 - Fix build due to conflict between HIVE-5711 and HIVE-5713

[brock] HIVE-5711 - Fix eclipse:eclipse maven goal (Carl Steinbach via Brock 
Noland)

[brock] HIVE-5752 - log4j properties appear to have been lost in maven upgrade 
(Sergey Shelukhin via Brock Noland)

[brock] HIVE-5713 - Verify versions of libraries post maven merge (Brock Noland 
reviewed by Gunther Hagleitner)

[brock] HIVE-5765 - Beeline throws NPE when -e option is used (Szehon Ho via 
Brock Noland)

[xuefu] HIVE-5726: The DecimalTypeInfo instance associated with a decimal 
constant is not in line with the precision/scale of the constant (reviewed by 
Brock)

[xuefu] HIVE-5655: Hive incorrecly handles divide-by-zero case (reviewed by 
Edward and Brock)

[xuefu] HIVE-5191: Add char data type (Jason via Xuefu)


Changes for Build #542
[brock] HIVE-5351 - Secure-Socket-Layer (SSL) support for HiveServer2 (Prasad 
Mujumdar via Brock Noland)


Changes for Build #543
[brock] HIVE-5780 - Add the missing declaration of HIVE_CLI_SERVICE_PROTOCOL_V4 
in TCLIService.thrift (Prasad Mujumdar via Brock Noland)


Changes for Build #544
[gunther] HIVE-5601: NPE in ORC's PPD when using select * from table with where 
predicate (Prasanth J via Owen O'Malley and Gunther Hagleitner)

[gunther] HIVE-5562: Provide stripe level column statistics in ORC (Patch by 
Prasanth J, reviewed by Owen O'Malley, committed by Gunther Hagleitner)

[hashutosh] HIVE-3777 : add a property in the partition to figure out if stats 
are accurate (Ashutosh Chauhan via Thejas Nair)


Changes for Build #545
[hashutosh] HIVE-5691 : Intermediate columns are incorrectly initialized for 
partitioned tables. (Jitendra Nath Pandey via Gunther Hagleitner)

[hashutosh] HIVE-5779 : Subquery in where clause with distinct fails with 
mapjoin turned on with serialization error. (Ashutosh Chauhan via Harish Butani)

[gunther] HIVE-5632 (partial): Adding test data to data/files to enable 

Hive-trunk-h0.21 - Build # 2464 - Still Failing

2013-11-22 Thread Apache Jenkins Server
Changes for Build #2434
[hashutosh] HIVE-3959 : Update Partition Statistics in Metastore Layer 
(Ashutosh Chauhan, Bhushan Mandhani, Gang Tim Liu via Thejas Nair)


Changes for Build #2435
[hashutosh] HIVE-5503 : TopN optimization in VectorReduceSink (Sergey Shelukhin 
via Ashutosh Chauhan)

[brock] HIVE-5695 - PTest2 fix shutdown, duplicate runs, and add client retry

[brock] HIVE-5708 - PTest2 should trim long logs when posting to jira


Changes for Build #2436
[thejas] HIVE-5547 : webhcat pig job submission should ship hive tar if 
-usehcatalog is specified (Eugene Koifman via Thejas Nair)

[thejas] HIVE-5715 : HS2 should not start a session for every command 
(Gunther Hagleitner via Thejas Nair)


Changes for Build #2437

Changes for Build #2438
[brock] HIVE-5740: Tar files should extract to the directory of the same name 
minus tar.gz (Brock Noland reviewed by Xuefu Zhang)

[brock] HIVE-5611: Add assembly (i.e.) tar creation to pom (Szehon Ho via Brock 
Noland)

[brock] HIVE-5707: Validate values for ConfVar (Navis via Brock Noland)

[brock] HIVE-5721: Incremental build is disabled by MCOMPILER-209 (Navis via 
Brock Noland)


Changes for Build #2439
[brock] HIVE-5354 - Decimal precision/scale support in ORC file (Xuefu Zhang 
via Brock Noland)

[brock] HIVE-4523 - round() function with specified decimal places not 
consistent with mysql (Xuefu Zhang via Brock Noland)

[thejas] HIVE-5542 : Webhcat is failing to run ddl command on a secure cluster 
(Sushanth Sowmyan via Thejas Nair)


Changes for Build #2440
[brock] HIVE-5730: Beeline throws non-terminal NPE upon starting, after 
mavenization (Szehon Ho reviewed by Navis)


Changes for Build #2441
[omalley] HIVE-5425 Provide a configuration option to control the default stripe
size for ORC. (omalley reviewed by gunther)

[omalley] Revert HIVE-5583 since it broke the build.

[hashutosh] HIVE-5583 : Implement support for IN (list-of-constants) filter in 
vectorized mode (Eric Hanson via Ashutosh Chauhan)

[brock] HIVE-5355 - JDBC support for decimal precision/scale


Changes for Build #2443
[brock] HIVE-5351 - Secure-Socket-Layer (SSL) support for HiveServer2 (Prasad 
Mujumdar via Brock Noland)

[hashutosh] HIVE-5583 : Implement support for IN (list-of-constants) filter in 
vectorized mode (Eric Hanson via Ashutosh Chauhan)

[brock] HIVE-5773 - Fix build due to conflict between HIVE-5711 and HIVE-5713

[brock] HIVE-5711 - Fix eclipse:eclipse maven goal (Carl Steinbach via Brock 
Noland)

[brock] HIVE-5752 - log4j properties appear to have been lost in maven upgrade 
(Sergey Shelukhin via Brock Noland)

[brock] HIVE-5713 - Verify versions of libraries post maven merge (Brock Noland 
reviewed by Gunther Hagleitner)

[brock] HIVE-5765 - Beeline throws NPE when -e option is used (Szehon Ho via 
Brock Noland)

[xuefu] HIVE-5726: The DecimalTypeInfo instance associated with a decimal 
constant is not in line with the precision/scale of the constant (reviewed by 
Brock)

[xuefu] HIVE-5655: Hive incorrecly handles divide-by-zero case (reviewed by 
Edward and Brock)

[xuefu] HIVE-5191: Add char data type (Jason via Xuefu)


Changes for Build #2444
[brock] HIVE-5780 - Add the missing declaration of HIVE_CLI_SERVICE_PROTOCOL_V4 
in TCLIService.thrift (Prasad Mujumdar via Brock Noland)


Changes for Build #2445
[gunther] HIVE-5601: NPE in ORC's PPD when using select * from table with where 
predicate (Prasanth J via Owen O'Malley and Gunther Hagleitner)

[gunther] HIVE-5562: Provide stripe level column statistics in ORC (Patch by 
Prasanth J, reviewed by Owen O'Malley, committed by Gunther Hagleitner)

[hashutosh] HIVE-3777 : add a property in the partition to figure out if stats 
are accurate (Ashutosh Chauhan via Thejas Nair)


Changes for Build #2446
[hashutosh] HIVE-5691 : Intermediate columns are incorrectly initialized for 
partitioned tables. (Jitendra Nath Pandey via Gunther Hagleitner)

[hashutosh] HIVE-5779 : Subquery in where clause with distinct fails with 
mapjoin turned on with serialization error. (Ashutosh Chauhan via Harish Butani)

[gunther] HIVE-5632 (partial): Adding test data to data/files to enable 
pre-commit tests to run. (Prasanth J via Gunther Hagleitner)


Changes for Build #2447
[cws] HIVE-5786: Remove HadoopShims methods that were needed for pre-Hadoop 
0.20 (Jason Dere via cws)

[thejas] HIVE-5229 : Better thread management for HiveServer2 async threads 
(Vaibhav Gumashta via Thejas Nair)

[gunther] HIVE-5745: TestHiveLogging is failing (at least on mac) (Gunther 
Hagleitner, reviewed by Ashutosh Chauhan)


Changes for Build #2448
[hashutosh] HIVE-5699 : Add unit test for vectorized BETWEEN for timestamp 
inputs (Eric Hanson via Ashutosh Chauhan)

[hashutosh] HIVE-5767 : in SemanticAnalyzer#doPhase1, handling for TOK_UNION 
falls thru into TOK_INSERT (Sergey Shelukhin via Ashutosh Chauhan)

[hashutosh] HIVE-5657 : TopN produces incorrect results with count(distinct) 
(Sergey Shelukhin via Ashutosh Chauhan)


Changes for Build #2450

[jira] [Commented] (HIVE-5849) Improve the stats of operators based on heuristics in the absence of any column statistics

2013-11-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13829832#comment-13829832
 ] 

Hive QA commented on HIVE-5849:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12615235/HIVE-5849.6.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 4680 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_partition_skip_default
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/396/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/396/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests failed with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12615235

 Improve the stats of operators based on heuristics in the absence of any 
 column statistics
 --

 Key: HIVE-5849
 URL: https://issues.apache.org/jira/browse/HIVE-5849
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor, Statistics
Reporter: Prasanth J
Assignee: Prasanth J
 Fix For: 0.13.0

 Attachments: HIVE-5849.1.patch.txt, HIVE-5849.2.patch.txt, 
 HIVE-5849.3.patch, HIVE-5849.3.patch.txt, HIVE-5849.4.javaonly.patch, 
 HIVE-5849.5.patch, HIVE-5849.6.patch


 In the absence of any column statistics, operators will simply use the 
 statistics from their parents. It is useful to apply some heuristics to update 
 basic statistics (number of rows and data size) in the absence of any column 
 statistics. This will be the worst-case scenario.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-5827) Incorrect location of logs for failed tests.

2013-11-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13829926#comment-13829926
 ] 

Hive QA commented on HIVE-5827:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12615264/HIVE-5827.2.patch

{color:green}SUCCESS:{color} +1 4680 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/397/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/397/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12615264

 Incorrect location of logs for failed tests. 
 -

 Key: HIVE-5827
 URL: https://issues.apache.org/jira/browse/HIVE-5827
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 0.13.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-5827.1.patch, HIVE-5827.2.patch


 Extending HIVE-5790 to fix other tests.





[jira] [Created] (HIVE-5871) Use multiple-characters as field delimiter

2013-11-22 Thread Rui Li (JIRA)
Rui Li created HIVE-5871:


 Summary: Use multiple-characters as field delimiter
 Key: HIVE-5871
 URL: https://issues.apache.org/jira/browse/HIVE-5871
 Project: Hive
  Issue Type: Improvement
  Components: Contrib
Affects Versions: 0.12.0
Reporter: Rui Li


Add a new SerDe named MultiDelimitSerDe. With MultiDelimitSerDe, users can 
specify a multiple-character field delimiter when creating tables.





[jira] [Updated] (HIVE-5871) Use multiple-characters as field delimiter

2013-11-22 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-5871:
-

Attachment: HIVE-5871.patch

This implementation mainly relies on LazySimpleSerDe for serialization and 
deserialization. I added some methods to LazyStruct to parse a row delimited by 
a multiple-character string. Another difference from LazySimpleSerDe is that 
MultiDelimitSerDe doesn't use Base64 to encode binary fields in serialization, 
because the encoded string may interfere with the delimiter. I also modified 
LazyBinary so that when it deserializes a binary field and is unable to 
Base64 decode the field, it just keeps the data unchanged. A simple use case is 
as follows:

create table test (id string, hivearray array<binary>, hivemap map<string,int>) 
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' WITH 
SERDEPROPERTIES 
("field.delimited"="[,]","collection.delimited"=":","mapkey.delimited"="@");

where field.delimited is the multiple-char field delimiter, 
collection.delimited is the delimiter for collection items, and 
mapkey.delimited is the delimiter for keys and values in maps. We currently 
don't support multiple-char values for the latter two delimiters.
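
To illustrate how the three delimiters in the example interact, here is a 
minimal Python sketch. It is not the SerDe's implementation: the function name 
parse_row and the sample row are hypothetical, array items are kept as plain 
strings rather than binary, and it assumes map entries are separated by the 
collection delimiter.

```python
def parse_row(line, field_delim="[,]", collection_delim=":", mapkey_delim="@"):
    """Split one text row into (id, list, dict) using the three delimiters.

    Hypothetical helper for illustration only; it mirrors the table in the
    example (id string, hivearray array, hivemap map<string,int>).
    """
    # The field delimiter may be several characters long.
    row_id, array_text, map_text = line.split(field_delim)

    # Collection items are separated by a single-character delimiter.
    items = array_text.split(collection_delim) if array_text else []

    # Map entries are assumed to be separated by the collection delimiter,
    # and each key is separated from its value by the map-key delimiter.
    mapping = {}
    if map_text:
        for entry in map_text.split(collection_delim):
            key, value = entry.split(mapkey_delim, 1)
            mapping[key] = int(value)
    return row_id, items, mapping


# A made-up row: id "1", three array items, two map entries.
print(parse_row("1[,]a:b:c[,]x@10:y@20"))
```

A row would therefore look like 1[,]a:b:c[,]x@10:y@20 on disk under these 
assumptions.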

 Use multiple-characters as field delimiter
 --

 Key: HIVE-5871
 URL: https://issues.apache.org/jira/browse/HIVE-5871
 Project: Hive
  Issue Type: Improvement
  Components: Contrib
Affects Versions: 0.12.0
Reporter: Rui Li
 Attachments: HIVE-5871.patch


 Add a new SerDe named MultiDelimitSerDe. With MultiDelimitSerDe, users can 
 specify a multiple-character field delimiter when creating tables.





[jira] [Created] (HIVE-5872) Make UDAFs such as GenericUDAFSum report accurate precision/scale for decimal types

2013-11-22 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created HIVE-5872:
-

 Summary: Make UDAFs such as GenericUDAFSum report accurate 
precision/scale for decimal types
 Key: HIVE-5872
 URL: https://issues.apache.org/jira/browse/HIVE-5872
 Project: Hive
  Issue Type: Improvement
  Components: Types, UDF
Affects Versions: 0.12.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


Currently UDAFs still report the system default precision/scale (38, 18) for 
decimal results. Not only is this coarse, it can also cause problems in 
subsequent operators such as division, where the result depends on the 
precision/scale of the input and can go out of bounds (38, 38). Thus, these 
UDAFs should correctly report the precision/scale of the result.





import proto buffer file in a python transform script error(Very urgent)!

2013-11-22 Thread liu jay
Hi all,

  I use Hive's TRANSFORM to analyze logs, and some fields of the logs
are protocol buffer (pb) encoded. Here is what I did:

  1) The Hive SQL is:
add file hive_shift_parse.py;
add file locationShift_pb2.py;
select transform(log) using 'python hive_shift_parse.py' as
sm_datetime,sm_appid,sm_language,sm_iosMaxVersion,sm_iosMinVersion,sm_messageid,sm_logtype,sm_request,sm_response,sm_status,sm_responsetime,sm_ip,sm_province,sm_city,sm_town,sm_day_time
from snowman_service_raw;

  2) The imports in hive_shift_parse.py are as follows:
import urllib2,sys,os,re,datetime,json,time,math
import fileinput
import base64
import locationShift_pb2

  3) locationShift_pb2.py is a protocol buffer (pb) file.

  4) The run fails with the following error:
Traceback (most recent call last):
  File "hive_shift_parse.py.py", line 6, in ?
    import locationShift_pb2
ImportError: No module named locationShift_pb2

I searched on Google but could not solve it. I guess the problem is in
loading the protocol buffer (pb) module.

Thanks for your help.
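
One thing that may be worth trying, offered as an assumption rather than a
diagnosis: files shipped with "add file" are expected to appear in the task's
working directory, but Python may not have that directory on its module search
path if the script itself is reached through a symlink. The sketch below puts
the working directory on sys.path before the import. The helper name
add_working_dir_to_path is hypothetical.

```python
import os
import sys


def add_working_dir_to_path(path_list=None):
    """Prepend the current working directory to the module search path.

    Returns True if the directory was added, False if it was already there.
    The path_list argument exists only so the helper can be tried on a
    plain list; by default it modifies sys.path.
    """
    if path_list is None:
        path_list = sys.path
    cwd = os.getcwd()
    if cwd in path_list:
        return False
    path_list.insert(0, cwd)
    return True


# In the transform script, call this before importing the shipped module.
add_working_dir_to_path()
# import locationShift_pb2  # left commented: the module exists only on the cluster
```

If the import still fails after this, the shipped file is probably not in the
working directory at all, which would point to a different cause.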


Re: [ANNOUNCE] New Hive Committers - Jitendra Nath Pandey and Eric Hanson

2013-11-22 Thread Biswajit Nayak
Congrats to both of you..


On Fri, Nov 22, 2013 at 1:26 PM, Lefty Leverenz leftylever...@gmail.com wrote:

 Congratulations, Jitendra and Eric!  The more the merrier.

 -- Lefty


 On Thu, Nov 21, 2013 at 6:31 PM, Jarek Jarcec Cecho jar...@apache.org wrote:

 Congratulations, good job!

 Jarcec

 On Thu, Nov 21, 2013 at 03:29:07PM -0800, Carl Steinbach wrote:
  The Apache Hive PMC has voted to make Jitendra Nath Pandey and Eric
 Hanson
  committers on the Apache Hive project.
 
  Please join me in congratulating Jitendra and Eric!
 
  Thanks.
 
  Carl




-- 
_
The information contained in this communication is intended solely for the 
use of the individual or entity to whom it is addressed and others 
authorized to receive it. It may contain confidential or legally privileged 
information. If you are not the intended recipient you are hereby notified 
that any disclosure, copying, distribution or taking any action in reliance 
on the contents of this information is strictly prohibited and may be 
unlawful. If you have received this communication in error, please notify 
us immediately by responding to this email and then delete it from your 
system. The firm is neither liable for the proper and complete transmission 
of the information contained in this communication nor for any delay in its 
receipt.


Re: [ANNOUNCE] New Hive Committers - Jitendra Nath Pandey and Eric Hanson

2013-11-22 Thread Jason Dere
Congrats!

On Nov 22, 2013, at 2:25 AM, Biswajit Nayak biswajit.na...@inmobi.com wrote:

 Congrats to both of you.. 
 
 
 On Fri, Nov 22, 2013 at 1:26 PM, Lefty Leverenz leftylever...@gmail.com 
 wrote:
 Congratulations, Jitendra and Eric!  The more the merrier.
 
 -- Lefty
 
 
 On Thu, Nov 21, 2013 at 6:31 PM, Jarek Jarcec Cecho jar...@apache.org wrote:
 Congratulations, good job!
 
 Jarcec
 
 On Thu, Nov 21, 2013 at 03:29:07PM -0800, Carl Steinbach wrote:
  The Apache Hive PMC has voted to make Jitendra Nath Pandey and Eric Hanson
  committers on the Apache Hive project.
 
  Please join me in congratulating Jitendra and Eric!
 
  Thanks.
 
  Carl
 
 
 


-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.


hcat tests with MVN

2013-11-22 Thread Eugene Koifman
Hi,
I've noticed a couple of problems.


1) running mvn tests from hcatalog/ runs the tests under core and
hcatalog-pig-adapter submodules but not any of the other modules
(webhcat/java-client, webhcat/svr, etc).  Though if I cd to the appropriate
submodule and run mvn test, the tests are run.

2) mvn surefire-report:report from hcatalog generates .html files with
test results in the ./target/site/surefire-report.html of each submodule,
but not a single .html that includes results for all tests.
 (hcatalog/target/site/surefire-report.html is generated but contains 0
tests)

Does anyone have suggestions on how to fix these?

Thanks,
Eugene



[jira] [Updated] (HIVE-5833) Remove versions from child module dependencies

2013-11-22 Thread Kousuke Saruta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated HIVE-5833:
-

Status: Patch Available  (was: Open)

 Remove versions from child module dependencies
 --

 Key: HIVE-5833
 URL: https://issues.apache.org/jira/browse/HIVE-5833
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland
 Attachments: HIVE-5833.2.patch, HIVE-5833.patch


 HIVE-5741 moved all dependencies to the plugin management section of the 
 parent pom therefore we can remove 
 {noformat}<version>${dep.version}</version>{noformat} from all dependencies 
 in child modules.





Re: [ANNOUNCE] New Hive Committers - Jitendra Nath Pandey and Eric Hanson

2013-11-22 Thread Vikram Dixit
Congrats to both of you!


On Fri, Nov 22, 2013 at 9:34 AM, Jason Dere jd...@hortonworks.com wrote:

 Congrats!

 On Nov 22, 2013, at 2:25 AM, Biswajit Nayak biswajit.na...@inmobi.com
 wrote:

  Congrats to both of you..
 
 
  On Fri, Nov 22, 2013 at 1:26 PM, Lefty Leverenz leftylever...@gmail.com
 wrote:
  Congratulations, Jitendra and Eric!  The more the merrier.
 
  -- Lefty
 
 
  On Thu, Nov 21, 2013 at 6:31 PM, Jarek Jarcec Cecho jar...@apache.org
 wrote:
  Congratulations, good job!
 
  Jarcec
 
  On Thu, Nov 21, 2013 at 03:29:07PM -0800, Carl Steinbach wrote:
   The Apache Hive PMC has voted to make Jitendra Nath Pandey and Eric
 Hanson
   committers on the Apache Hive project.
  
   Please join me in congratulating Jitendra and Eric!
  
   Thanks.
  
   Carl
 
 
 






Re: Review Request 15718: HIVE-5614: Subquery support: allow subquery expressions in having clause

2013-11-22 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15718/#review29298
---



ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java
https://reviews.apache.org/r/15718/#comment56454

It will be good to add a comment about various fields in Conjunct class.



ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java
https://reviews.apache.org/r/15718/#comment56455

Can this constructor be package protected instead of public ?



ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java
https://reviews.apache.org/r/15718/#comment56456

protected instead of public?



ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java
https://reviews.apache.org/r/15718/#comment56457

It will be good to add a comment on how the behavior of ConjunctAnalyzer 
changes when forHavingClause = true instead of false.



ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java
https://reviews.apache.org/r/15718/#comment56458

Does this exception need to be propagated up the stack? At the least, we 
should have a LOG.warn() message here.



ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
https://reviews.apache.org/r/15718/#comment56452

It will be good to add a comment here along the lines of:
 "there could be a subq in the having clause; if so, we need to generate the 
subq plan followed by a semi-join."



ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
https://reviews.apache.org/r/15718/#comment56459

It will be good to add a comment how this new boolean changes behavior of 
this method.



ql/src/test/queries/clientpositive/subquery_in_having.q
https://reviews.apache.org/r/15718/#comment56453

It will be good to add a test which has a subq in both where clause as well 
as having clause



ql/src/test/queries/clientpositive/subquery_in_having.q
https://reviews.apache.org/r/15718/#comment56448

Same comment w.r.t map-join on. Also, if we support over clause in subq, it 
will be good to have a test for that.



ql/src/test/queries/clientpositive/subquery_notexists_having.q
https://reviews.apache.org/r/15718/#comment56449

It will be good to add a negative test where subq and outer query both use the 
same table alias. It seems in such cases we may generate incorrect results, so 
we should disable those.



ql/src/test/results/clientpositive/subquery_in_having.q.out
https://reviews.apache.org/r/15718/#comment56450

In this plan, we first compute outq, then subq, and then do a left 
semi-join on the result sets of the two. As we discussed, a more efficient way 
is to push the filter conditions in subq into the outer query to cut down the 
output generated by outq. I am not sure, though, whether it's better to do it 
in the optimizer phase via a Transformer or right here. Either way, I think 
that's an optimization we can do as a follow-up.



ql/src/test/results/clientpositive/subquery_notexists_having.q.out
https://reviews.apache.org/r/15718/#comment56451

The first expression in this filter is redundant. That's not strictly required. 
However, since there is active work going on for constant folding 
optimization, this may get optimized away via that optimization. Either way, 
this can be done in a follow-up.


- Ashutosh Chauhan


On Nov. 20, 2013, 6:04 p.m., Harish Butani wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/15718/
 ---
 
 (Updated Nov. 20, 2013, 6:04 p.m.)
 
 
 Review request for hive and Ashutosh Chauhan.
 
 
 Bugs: HIVE-5614
 https://issues.apache.org/jira/browse/HIVE-5614
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 support for subquery predicates in having clause. SubTask of HIVE-784
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/parse/QB.java fa111cc 
   ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java 3e8215d 
   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 7979873 
   ql/src/test/queries/clientpositive/subquery_exists_having.q PRE-CREATION 
   ql/src/test/queries/clientpositive/subquery_in_having.q PRE-CREATION 
   ql/src/test/queries/clientpositive/subquery_notexists_having.q PRE-CREATION 
   ql/src/test/queries/clientpositive/subquery_notin_having.q PRE-CREATION 
   ql/src/test/results/clientpositive/subquery_exists_having.q.out 
 PRE-CREATION 
   ql/src/test/results/clientpositive/subquery_in_having.q.out PRE-CREATION 
   ql/src/test/results/clientpositive/subquery_multiinsert.q.out 8dfb485 
   ql/src/test/results/clientpositive/subquery_notexists_having.q.out 
 PRE-CREATION 
   ql/src/test/results/clientpositive/subquery_notin_having.q.out PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/15718/diff/
 
 
 Testing
 ---
 
 added new tests: 

[jira] [Updated] (HIVE-5614) Subquery support: allow subquery expressions in having clause

2013-11-22 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-5614:
---

Status: Open  (was: Patch Available)

Some comments on RB.

 Subquery support: allow subquery expressions in having clause
 -

 Key: HIVE-5614
 URL: https://issues.apache.org/jira/browse/HIVE-5614
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Harish Butani
Assignee: Harish Butani
 Attachments: HIVE-5614.1.patch, HIVE-5614.2.patch, HIVE-5614.3.patch








[jira] [Updated] (HIVE-4518) Counter Strike: Operation Operator

2013-11-22 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-4518:
-

Attachment: HIVE-4518.11.patch

Patch v11: counter names don't need to be configurable. Also rebased onto trunk.

 Counter Strike: Operation Operator
 --

 Key: HIVE-4518
 URL: https://issues.apache.org/jira/browse/HIVE-4518
 Project: Hive
  Issue Type: Improvement
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-4518.1.patch, HIVE-4518.10.patch, 
 HIVE-4518.11.patch, HIVE-4518.2.patch, HIVE-4518.3.patch, HIVE-4518.4.patch, 
 HIVE-4518.5.patch, HIVE-4518.6.patch.txt, HIVE-4518.7.patch, 
 HIVE-4518.8.patch, HIVE-4518.9.patch


 Queries of the form:
 from foo
 insert overwrite table bar partition (p) select ...
 insert overwrite table bar partition (p) select ...
 insert overwrite table bar partition (p) select ...
 generate a huge number of counters. The reason is that task.progress is 
 turned on for dynamic partitioning queries.
 The counters not only make queries slower than necessary (by up to 50%); you 
 will also eventually run out of them. That's because we're wrapping them in 
 enum values to comply with hadoop 0.17.
 The real reason we turn task.progress on is that we need the CREATED_FILES and 
 FATAL counters to ensure dynamic partitioning queries don't go haywire.
 The counters have counter-intuitive names like C1 through C1000 and don't 
 seem really useful by themselves.
 With hadoop 20+ you don't need to wrap the counters anymore; each operator 
 can simply create and increment counters. That should simplify the code a lot.
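
 As a rough illustration of the last point, the sketch below shows counters 
 that are created on first use under a readable name instead of being drawn 
 from a fixed pool such as C1 through C1000. NamedCounters, its methods and 
 the group names are hypothetical stand-ins, not Hadoop or Hive classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative stand-in for a per-job counter registry; not a Hadoop or Hive class. */
final class NamedCounters {
    private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();

    /** Creates the counter on first use, then adds delta and returns the new value. */
    long increment(String group, String name, long delta) {
        return counters
            .computeIfAbsent(group + "::" + name, k -> new AtomicLong())
            .addAndGet(delta);
    }

    /** Returns the current value, or 0 if the counter was never created. */
    long get(String group, String name) {
        AtomicLong c = counters.get(group + "::" + name);
        return c == null ? 0L : c.get();
    }

    /** Number of distinct counters created so far. */
    int size() {
        return counters.size();
    }

    public static void main(String[] args) {
        NamedCounters job = new NamedCounters();
        // Group names are made up; only the counters an operator touches exist.
        job.increment("FileSink", "CREATED_FILES", 1);
        job.increment("FileSink", "CREATED_FILES", 1);
        System.out.println(job.get("FileSink", "CREATED_FILES"));
        System.out.println(job.size());
    }
}
```

 Only counters that are actually incremented take up space, which is the 
 property the description is after.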





[jira] [Commented] (HIVE-5817) column name to index mapping in VectorizationContext is broken

2013-11-22 Thread Remus Rusanu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830247#comment-13830247
 ] 

Remus Rusanu commented on HIVE-5817:


I think the only real problem operator is JOIN. It is not necessarily ‘one VC 
per operator’ but more like ‘one VC per query region’, where a query region is 
defined by the boundaries between different VS requirements (basically 
different result shapes). An operator like JOIN clearly introduces a boundary, 
and the interesting part is that it needs two vectorization contexts: one for 
its input(s) and one for its output. So it would be more along the lines of: 
during vectorization, each operator takes a VC (for its input, provided by its 
parent operator) and gives out a VC for its output, for its child operators to 
consume. Most operators would give out the same VC they get as input (i.e. they 
do not change shape). And there is serialization too, which is handled 
separately (as properties added to the Map).

I'll try to come up with actual code over this weekend.
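
A minimal sketch of the "VC in, VC out" idea described in the comment, under 
the assumption that a VC can be reduced to a column-name-to-index map. All 
types here (Vc, Op, FilterOp, JoinOp) are hypothetical and simplified; they are 
not the Hive classes and not the eventual patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified model of the proposal; none of these types are Hive classes. */
final class VcPropagationSketch {

    /** Column-name-to-index mapping for one query region. */
    static final class Vc {
        final Map<String, Integer> columnIndex = new LinkedHashMap<>();

        Vc(List<String> columns) {
            for (String c : columns) {
                columnIndex.put(c, columnIndex.size());
            }
        }
    }

    interface Op {
        /** Takes the VC describing this operator's input and returns the VC for its children. */
        Vc vectorize(Vc input);
    }

    /** Shape-preserving operator: hands its input VC straight to its children. */
    static final class FilterOp implements Op {
        public Vc vectorize(Vc input) {
            return input;
        }
    }

    /** Shape-changing operator: introduces a boundary by building a fresh output VC. */
    static final class JoinOp implements Op {
        private final List<String> outputColumns;

        JoinOp(List<String> outputColumns) {
            this.outputColumns = outputColumns;
        }

        public Vc vectorize(Vc input) {
            return new Vc(outputColumns);
        }
    }

    /** Threads a VC through a linear chain of operators, parent to child. */
    static Vc vectorizeChain(Vc scanVc, List<Op> chain) {
        Vc current = scanVc;
        for (Op op : chain) {
            current = op.vectorize(current);
        }
        return current;
    }

    public static void main(String[] args) {
        Vc scan = new Vc(Arrays.asList("_col0", "_col1"));
        List<Op> chain = new ArrayList<>();
        chain.add(new FilterOp());
        chain.add(new JoinOp(Arrays.asList("_col0", "_col1", "_col2")));
        chain.add(new FilterOp());
        System.out.println(vectorizeChain(scan, chain).columnIndex);
    }
}
```

Because the join returns a new Vc, names registered upstream of it cannot leak 
into lookups made downstream of it.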

 column name to index mapping in VectorizationContext is broken
 --

 Key: HIVE-5817
 URL: https://issues.apache.org/jira/browse/HIVE-5817
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Reporter: Sergey Shelukhin
Assignee: Remus Rusanu
Priority: Critical
 Attachments: HIVE-5817-uniquecols.broken.patch, 
 HIVE-5817.00-broken.patch


 Columns coming from different operators may have the same internal names 
 (_colNN). There exists a query in the form {{select b.cb, a.ca from a JOIN 
 b ON ... JOIN x ON ...;}}  (distilled from a more complex query), which runs 
 ok w/o vectorization. With vectorization, it will run ok for most ca, but for 
 some ca it will fail (or can probably return incorrect results). That is 
 because when building column-to-VRG-index map in VectorizationContext, 
 internal column name for ca that the first map join operator adds to the 
 mapping may be the same as internal name for cb that the 2nd one tries to 
 add. 2nd VMJ doesn't add it (see code in ctor), and when it's time for it to 
 output stuff, it retrieves wrong index from the map by name, and then wrong 
 vector from VRG.
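
 The failure mode described above can be reduced to a few lines. This is a 
 hypothetical reduction, not the actual VectorizationContext code: the class, 
 the method name and the indices 3 and 7 are made up, and "add only if 
 absent" is an assumed reading of "2nd VMJ doesn't add it".

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical reduction of the reported bug; not the actual VectorizationContext code. */
final class ColumnMapCollision {
    /** Shared internal-name -> batch-index map, filled by successive operators. */
    private final Map<String, Integer> columnIndex = new HashMap<>();

    /** Adds the mapping only if the name is new; returns the index later lookups will see. */
    int addColumnIfAbsent(String internalName, int batchIndex) {
        Integer existing = columnIndex.putIfAbsent(internalName, batchIndex);
        return existing == null ? batchIndex : existing;
    }

    public static void main(String[] args) {
        ColumnMapCollision ctx = new ColumnMapCollision();
        // First map join registers a.ca under an internal name.
        ctx.addColumnIfAbsent("_col0", 3);
        // Second map join wants b.cb at index 7 but reuses the same internal name.
        int seen = ctx.addColumnIfAbsent("_col0", 7);
        System.out.println("index used for b.cb: " + seen + " (wanted 7)");
    }
}
```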





[jira] [Assigned] (HIVE-5817) column name to index mapping in VectorizationContext is broken

2013-11-22 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu reassigned HIVE-5817:
--

Assignee: Remus Rusanu  (was: Sergey Shelukhin)

 column name to index mapping in VectorizationContext is broken
 --

 Key: HIVE-5817
 URL: https://issues.apache.org/jira/browse/HIVE-5817
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Reporter: Sergey Shelukhin
Assignee: Remus Rusanu
Priority: Critical
 Attachments: HIVE-5817-uniquecols.broken.patch, 
 HIVE-5817.00-broken.patch


 Columns coming from different operators may have the same internal names 
 (_colNN). There exists a query in the form {{select b.cb, a.ca from a JOIN 
 b ON ... JOIN x ON ...;}}  (distilled from a more complex query), which runs 
 ok w/o vectorization. With vectorization, it will run ok for most ca, but for 
 some ca it will fail (or can probably return incorrect results). That is 
 because when building column-to-VRG-index map in VectorizationContext, 
 internal column name for ca that the first map join operator adds to the 
 mapping may be the same as internal name for cb that the 2nd one tries to 
 add. 2nd VMJ doesn't add it (see code in ctor), and when it's time for it to 
 output stuff, it retrieves wrong index from the map by name, and then wrong 
 vector from VRG.





Re: hcat tests with MVN

2013-11-22 Thread Eugene Koifman
To answer the 2nd question:

mvn surefire-report:report -Daggregate=true

(or report-only if tests already ran)



On Fri, Nov 22, 2013 at 10:12 AM, Eugene Koifman
ekoif...@hortonworks.com wrote:

 Hi,
 I've noticed a couple of problems.


 1) running mvn tests from hcatalog/ runs the tests under core and
 hcatalog-pig-adapter submodules but not any of the other modules
 (webhcat/java-client, webhcat/svr, etc).  Though if I cd to the appropriate
 submodule and run mvn test, the tests are run.

 2) mvn surefire-report:report from hcatalog generates .html files with
 test results in the ./target/site/surefire-report.html of each submodule,
 but not a single .html that includes results for all tests.
  (hcatalog/target/site/surefire-report.html is generated but contains 0
 tests)

 Does anyone have suggestions on how to fix these?

 Thanks,
 Eugene





[jira] [Updated] (HIVE-5870) Move TestJDBCDriver2.testNewConnectionConfiguration to TestJDBCWithMiniHS2

2013-11-22 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-5870:


Attachment: HIVE-5870.patch

Attaching the patch.

 Move TestJDBCDriver2.testNewConnectionConfiguration to TestJDBCWithMiniHS2
 --

 Key: HIVE-5870
 URL: https://issues.apache.org/jira/browse/HIVE-5870
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Szehon Ho
Assignee: Szehon Ho
 Attachments: HIVE-5870.patch


 TestJDBCDriver2.testNewConnectionConfiguration() attempts to start a 
 Hiveserver2 instance in the test.
 This can cause issues as creating HiveServer2 needs correct environment/path. 
  This test should be moved to TestJdbcWithMiniHS2, which uses MiniHS2.  
 MiniHS2 is for this purpose (setting all the environment properly before 
 starting HiveServer2 instance).





Review Request 15797: HIVE-5870 - Move TestJDBCDriver2.testNewConnectionConfiguration to TestJDBCWithMiniHS2

2013-11-22 Thread Szehon Ho

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15797/
---

Review request for hive.


Bugs: HIVE-5870
https://issues.apache.org/jira/browse/HIVE-5870


Repository: hive-git


Description
---

TestJDBCDriver2.testNewConnectionConfiguration() attempts to start a 
HiveServer2 instance and test connecting to it.  This can cause issues, as 
creating HiveServer2 needs the correct environment.  This test should be moved 
to TestJdbcWithMiniHS2, which uses MiniHS2.  MiniHS2 is for this purpose, as it 
sets up all the environment properly before starting the HiveServer2 instance.

This test now runs the same commands against the MiniHS2.  In the course of 
refactoring, also changed TestJdbcWithMiniHS2's MiniHS2 creation from @Before 
to @BeforeClass (i.e., once per test class rather than once per test method), 
as calling init() multiple times on the HiveMetastore causes strange errors 
from DataNucleus/embedded Derby.  Also, it is more efficient.


Diffs
-

  itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcDriver2.java 
7b1c9da 
  itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java 
6c25736 

Diff: https://reviews.apache.org/r/15797/diff/


Testing
---

Ran affected unit tests.


Thanks,

Szehon Ho



[jira] [Updated] (HIVE-5849) Improve the stats of operators based on heuristics in the absence of any column statistics

2013-11-22 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-5849:
-

Attachment: HIVE-5849.7.patch

removed changes to dynamic_partition_skip_default.q which caused the test 
failure.

 Improve the stats of operators based on heuristics in the absence of any 
 column statistics
 --

 Key: HIVE-5849
 URL: https://issues.apache.org/jira/browse/HIVE-5849
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor, Statistics
Reporter: Prasanth J
Assignee: Prasanth J
 Fix For: 0.13.0

 Attachments: HIVE-5849.1.patch.txt, HIVE-5849.2.patch.txt, 
 HIVE-5849.3.patch, HIVE-5849.3.patch.txt, HIVE-5849.4.javaonly.patch, 
 HIVE-5849.5.patch, HIVE-5849.6.patch, HIVE-5849.7.patch


 In the absence of any column statistics, operators will simply use the 
 statistics from their parents. It is useful to apply some heuristics to 
 update basic statistics (number of rows and data size) in the absence of any 
 column statistics. This will be the worst-case scenario.





[jira] [Updated] (HIVE-5870) Move TestJDBCDriver2.testNewConnectionConfiguration to TestJDBCWithMiniHS2

2013-11-22 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-5870:


Affects Version/s: 0.13.0
   Status: Patch Available  (was: Open)

 Move TestJDBCDriver2.testNewConnectionConfiguration to TestJDBCWithMiniHS2
 --

 Key: HIVE-5870
 URL: https://issues.apache.org/jira/browse/HIVE-5870
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 0.13.0
Reporter: Szehon Ho
Assignee: Szehon Ho
 Attachments: HIVE-5870.patch


 TestJDBCDriver2.testNewConnectionConfiguration() attempts to start a 
 Hiveserver2 instance in the test.
 This can cause issues as creating HiveServer2 needs correct environment/path. 
  This test should be moved to TestJdbcWithMiniHS2, which uses MiniHS2.  
 MiniHS2 is for this purpose (setting all the environment properly before 
 starting HiveServer2 instance).





[jira] [Assigned] (HIVE-3181) getDatabaseMajor/Minor version does not return values

2013-11-22 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho reassigned HIVE-3181:
---

Assignee: Szehon Ho

 getDatabaseMajor/Minor version does not return values
 -

 Key: HIVE-3181
 URL: https://issues.apache.org/jira/browse/HIVE-3181
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Reporter: N Campbell
Assignee: Szehon Ho
 Fix For: 0.8.1


 This is really a sub-issue of HIVE-3174 (which is a lot of properties) but 
 given that the driver will return databaseProductVersion it makes no sense to 
 not have implemented these as well.





[jira] [Updated] (HIVE-5866) Hive divide operator generates wrong results in certain cases

2013-11-22 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-5866:
--

Attachment: HIVE-5866.1.patch

 Hive divide operator generates wrong results in certain cases
 -

 Key: HIVE-5866
 URL: https://issues.apache.org/jira/browse/HIVE-5866
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-5866.1.patch, HIVE-5866.patch


 The current GenericUDFOPDivide seems to have a bug. The following query 
 generates a NULL result.
 {code}
 hive> select 4BD / 25BD  from test limit 1;
 ...
 Total MapReduce CPU Time Spent: 890 msec
 OK
 NULL
 Time taken: 7.901 seconds, Fetched: 1 row(s)
 {code}
 The correct result should be 0.16 in this query.
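
 The expected value can be checked independently of Hive with plain 
 java.math.BigDecimal. This is only a sanity check of the arithmetic, not a 
 reproduction of GenericUDFOPDivide; the class name and the scale of 18 are 
 arbitrary choices for the sketch, and a zero divisor is not handled.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/** Sanity check of decimal division; unrelated to Hive's own implementation. */
final class DecimalDivideCheck {

    /** Divides with an explicit scale so non-terminating quotients do not throw. */
    static BigDecimal divide(BigDecimal a, BigDecimal b, int scale) {
        return a.divide(b, scale, RoundingMode.HALF_UP).stripTrailingZeros();
    }

    public static void main(String[] args) {
        BigDecimal q = divide(new BigDecimal("4"), new BigDecimal("25"), 18);
        System.out.println(q.toPlainString()); // prints 0.16
    }
}
```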





[jira] [Commented] (HIVE-5866) Hive divide operator generates wrong results in certain cases

2013-11-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830292#comment-13830292
 ] 

Hive QA commented on HIVE-5866:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12615378/HIVE-5866.1.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 4681 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_short_regress
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/400/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/400/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests failed with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12615378

 Hive divide operator generates wrong results in certain cases
 -

 Key: HIVE-5866
 URL: https://issues.apache.org/jira/browse/HIVE-5866
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-5866.1.patch, HIVE-5866.patch


 The current GenericUDFOPDivide seems to have a bug. The following query 
 generates a NULL result.
 {code}
 hive> select 4BD / 25BD  from test limit 1;
 ...
 Total MapReduce CPU Time Spent: 890 msec
 OK
 NULL
 Time taken: 7.901 seconds, Fetched: 1 row(s)
 {code}
 The correct result should be 0.16 in this query.





[jira] [Created] (HIVE-5873) SubQuery: In subquery Count Bug

2013-11-22 Thread Harish Butani (JIRA)
Harish Butani created HIVE-5873:
---

 Summary: SubQuery: In subquery Count Bug
 Key: HIVE-5873
 URL: https://issues.apache.org/jira/browse/HIVE-5873
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Harish Butani


This is from the paper "Optimization of Nested SQL Queries Revisited": 
http://dl.acm.org/citation.cfm?id=38723

Consider Part table having:
{noformat}
PNum  OrderOnHand
----  -----------
3     6
10    1
8     0
{noformat}

Supply table having:
{noformat}
PNum  Qty
3     4
3     2
10    1
{noformat}

The query:
{noformat}
select pnum
from parts p
where orderOnHand
 in (select count(*) from supply s
  where s.pnum = p.pnum
 )
{noformat}

should return the row with PNum=8.
But a transformation to a semi-join would eliminate this row, as there are no 
rows in the supply table with PNum=8.

As shown in the paper, the solution is to transform the query to:
{noformat}
select pnum
from parts p semijoin
(select p1.pnum, count(*) as c
  from (select distinct pnum from parts) p1 join supply s
  where s.pnum = p1.pnum
 ) sq on p.pnum = sq.pnum and p.orderOnHand = sq.c
{noformat}

The additional distinct query within the SubQuery is to handle duplicates in 
the outer query on the joining columns.
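
The difference can be checked on the sample data with a small in-memory 
sketch. It reads the flattened row "101" as PNum 10 with value 1, which is an 
assumption. It also assumes that the rewritten subquery groups by pnum and 
behaves like an outer join, so that parts without supply rows get a count of 
0; the query as posted spells out neither. The class and method names are 
hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** In-memory illustration of the count bug; not Hive code. */
final class CountBugSketch {
    // {pnum, orderOnHand} and {pnum, qty}, taken from the tables in the description.
    static final int[][] PARTS = {{3, 6}, {10, 1}, {8, 0}};
    static final int[][] SUPPLY = {{3, 4}, {3, 2}, {10, 1}};

    /** Reference semantics: evaluate the correlated count(*) once per outer row. */
    static List<Integer> correlated() {
        List<Integer> out = new ArrayList<>();
        for (int[] p : PARTS) {
            int count = 0;
            for (int[] s : SUPPLY) {
                if (s[0] == p[0]) count++;
            }
            if (p[1] == count) out.add(p[0]);
        }
        return out;
    }

    /**
     * Counts supply rows per pnum. With seedZeros every PARTS.pnum starts at 0
     * (outer-join behaviour); without it, pnums that have no supply rows are absent.
     */
    static Map<Integer, Integer> counts(boolean seedZeros) {
        Map<Integer, Integer> c = new LinkedHashMap<>();
        if (seedZeros) {
            for (int[] p : PARTS) c.put(p[0], 0);
        }
        for (int[] s : SUPPLY) {
            if (!seedZeros || c.containsKey(s[0])) c.merge(s[0], 1, Integer::sum);
        }
        return c;
    }

    /** Keeps the parts rows that have a matching (pnum, count) pair in the subquery result. */
    static List<Integer> semiJoin(Map<Integer, Integer> sq) {
        List<Integer> out = new ArrayList<>();
        for (int[] p : PARTS) {
            Integer c = sq.get(p[0]);
            if (c != null && c == p[1]) out.add(p[0]);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println("correlated:         " + correlated());
        System.out.println("plain semi-join:    " + semiJoin(counts(false)));
        System.out.println("zero-seeded counts: " + semiJoin(counts(true)));
    }
}
```

On this data the correlated form and the zero-seeded form both keep PNum=8, 
while the plain semi-join drops it.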





Hive-trunk-hadoop2 - Build # 564 - Still Failing

2013-11-22 Thread Apache Jenkins Server
Changes for Build #533
[hashutosh] HIVE-3959 : Update Partition Statistics in Metastore Layer 
(Ashutosh Chauhan, Bhushan Mandhani, Gang Tim Liu via Thejas Nair)


Changes for Build #534
[hashutosh] HIVE-5503 : TopN optimization in VectorReduceSink (Sergey Shelukhin 
via Ashutosh Chauhan)

[brock] HIVE-5695 - PTest2 fix shutdown, duplicate runs, and add client retry

[brock] HIVE-5708 - PTest2 should trim long logs when posting to jira


Changes for Build #535
[thejas] HIVE-5547 : webhcat pig job submission should ship hive tar if 
-usehcatalog is specified (Eugene Koifman via Thejas Nair)

[thejas] HIVE-5715 : HS2 should not start a session for every command 
(Gunther Hagleitner via Thejas Nair)


Changes for Build #536

Changes for Build #537
[brock] HIVE-5740: Tar files should extract to the directory of the same name 
minus tar.gz (Brock Noland reviewed by Xuefu Zhang)

[brock] HIVE-5611: Add assembly (i.e.) tar creation to pom (Szehon Ho via Brock 
Noland)

[brock] HIVE-5707: Validate values for ConfVar (Navis via Brock Noland)

[brock] HIVE-5721: Incremental build is disabled by MCOMPILER-209 (Navis via 
Brock Noland)


Changes for Build #538
[brock] HIVE-5354 - Decimal precision/scale support in ORC file (Xuefu Zhang 
via Brock Noland)

[brock] HIVE-4523 - round() function with specified decimal places not 
consistent with mysql (Xuefu Zhang via Brock Noland)

[thejas] HIVE-5542 : Webhcat is failing to run ddl command on a secure cluster 
(Sushanth Sowmyan via Thejas Nair)


Changes for Build #539
[brock] HIVE-5730: Beeline throws non-terminal NPE upon starting, after 
mavenization (Szehon Ho reviewed by Navis)


Changes for Build #540
[omalley] HIVE-5425 Provide a configuration option to control the default stripe
size for ORC. (omalley reviewed by gunther)

[omalley] Revert HIVE-5583 since it broke the build.

[hashutosh] HIVE-5583 : Implement support for IN (list-of-constants) filter in 
vectorized mode (Eric Hanson via Ashutosh Chauhan)

[brock] HIVE-5355 - JDBC support for decimal precision/scale


Changes for Build #541
[hashutosh] HIVE-5583 : Implement support for IN (list-of-constants) filter in 
vectorized mode (Eric Hanson via Ashutosh Chauhan)

[brock] HIVE-5773 - Fix build due to conflict between HIVE-5711 and HIVE-5713

[brock] HIVE-5711 - Fix eclipse:eclipse maven goal (Carl Steinbach via Brock 
Noland)

[brock] HIVE-5752 - log4j properties appear to have been lost in maven upgrade 
(Sergey Shelukhin via Brock Noland)

[brock] HIVE-5713 - Verify versions of libraries post maven merge (Brock Noland 
reviewed by Gunther Hagleitner)

[brock] HIVE-5765 - Beeline throws NPE when -e option is used (Szehon Ho via 
Brock Noland)

[xuefu] HIVE-5726: The DecimalTypeInfo instance associated with a decimal 
constant is not in line with the precision/scale of the constant (reviewed by 
Brock)

[xuefu] HIVE-5655: Hive incorrecly handles divide-by-zero case (reviewed by 
Edward and Brock)

[xuefu] HIVE-5191: Add char data type (Jason via Xuefu)


Changes for Build #542
[brock] HIVE-5351 - Secure-Socket-Layer (SSL) support for HiveServer2 (Prasad 
Mujumdar via Brock Noland)


Changes for Build #543
[brock] HIVE-5780 - Add the missing declaration of HIVE_CLI_SERVICE_PROTOCOL_V4 
in TCLIService.thrift (Prasad Mujumdar via Brock Noland)


Changes for Build #544
[gunther] HIVE-5601: NPE in ORC's PPD when using select * from table with where 
predicate (Prasanth J via Owen O'Malley and Gunther Hagleitner)

[gunther] HIVE-5562: Provide stripe level column statistics in ORC (Patch by 
Prasanth J, reviewed by Owen O'Malley, committed by Gunther Hagleitner)

[hashutosh] HIVE-3777 : add a property in the partition to figure out if stats 
are accurate (Ashutosh Chauhan via Thejas Nair)


Changes for Build #545
[hashutosh] HIVE-5691 : Intermediate columns are incorrectly initialized for 
partitioned tables. (Jitendra Nath Pandey via Gunther Hagleitner)

[hashutosh] HIVE-5779 : Subquery in where clause with distinct fails with 
mapjoin turned on with serialization error. (Ashutosh Chauhan via Harish Butani)

[gunther] HIVE-5632 (partial): Adding test data to data/files to enable 
pre-commit tests to run. (Prasanth J via Gunther Hagleitner)


Changes for Build #546
[cws] HIVE-5786: Remove HadoopShims methods that were needed for pre-Hadoop 
0.20 (Jason Dere via cws)

[thejas] HIVE-5229 : Better thread management for HiveServer2 async threads 
(Vaibhav Gumashta via Thejas Nair)

[gunther] HIVE-5745: TestHiveLogging is failing (at least on mac) (Gunther 
Hagleitner, reviewed by Ashutosh Chauhan)


Changes for Build #547
[hashutosh] HIVE-5699 : Add unit test for vectorized BETWEEN for timestamp 
inputs (Eric Hanson via Ashutosh Chauhan)

[hashutosh] HIVE-5767 : in SemanticAnalyzer#doPhase1, handling for TOK_UNION 
falls thru into TOK_INSERT (Sergey Shelukhin via Ashutosh Chauhan)

[hashutosh] HIVE-5657 : TopN produces incorrect results with count(distinct) 
(Sergey Shelukhin via Ashutosh Chauhan)


Changes for Build 

[jira] [Assigned] (HIVE-5758) Implement vectorized support for NOT IN filter

2013-11-22 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson reassigned HIVE-5758:
-

Assignee: Eric Hanson

 Implement vectorized support for NOT IN filter
 --

 Key: HIVE-5758
 URL: https://issues.apache.org/jira/browse/HIVE-5758
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Eric Hanson

 Implement full, end-to-end support for NOT IN in vectorized mode, including 
 new VectorExpression class(es), VectorizationContext translation to a 
 VectorExpression, and unit tests for these, as well as end-to-end ad hoc 
 testing. An end-to-end .q test is recommended.





Hive-trunk-h0.21 - Build # 2465 - Still Failing

2013-11-22 Thread Apache Jenkins Server
Changes for Build #2434
[hashutosh] HIVE-3959 : Update Partition Statistics in Metastore Layer 
(Ashutosh Chauhan, Bhushan Mandhani, Gang Tim Liu via Thejas Nair)


Changes for Build #2435
[hashutosh] HIVE-5503 : TopN optimization in VectorReduceSink (Sergey Shelukhin 
via Ashutosh Chauhan)

[brock] HIVE-5695 - PTest2 fix shutdown, duplicate runs, and add client retry

[brock] HIVE-5708 - PTest2 should trim long logs when posting to jira


Changes for Build #2436
[thejas] HIVE-5547 : webhcat pig job submission should ship hive tar if 
-usehcatalog is specified (Eugene Koifman via Thejas Nair)

[thejas] HIVE-5715 : HS2 should not start a session for every command 
(Gunther Hagleitner via Thejas Nair)


Changes for Build #2437

Changes for Build #2438
[brock] HIVE-5740: Tar files should extract to the directory of the same name 
minus tar.gz (Brock Noland reviewed by Xuefu Zhang)

[brock] HIVE-5611: Add assembly (i.e.) tar creation to pom (Szehon Ho via Brock 
Noland)

[brock] HIVE-5707: Validate values for ConfVar (Navis via Brock Noland)

[brock] HIVE-5721: Incremental build is disabled by MCOMPILER-209 (Navis via 
Brock Noland)


Changes for Build #2439
[brock] HIVE-5354 - Decimal precision/scale support in ORC file (Xuefu Zhang 
via Brock Noland)

[brock] HIVE-4523 - round() function with specified decimal places not 
consistent with mysql (Xuefu Zhang via Brock Noland)

[thejas] HIVE-5542 : Webhcat is failing to run ddl command on a secure cluster 
(Sushanth Sowmyan via Thejas Nair)


Changes for Build #2440
[brock] HIVE-5730: Beeline throws non-terminal NPE upon starting, after 
mavenization (Szehon Ho reviewed by Navis)


Changes for Build #2441
[omalley] HIVE-5425 Provide a configuration option to control the default stripe
size for ORC. (omalley reviewed by gunther)

[omalley] Revert HIVE-5583 since it broke the build.

[hashutosh] HIVE-5583 : Implement support for IN (list-of-constants) filter in 
vectorized mode (Eric Hanson via Ashutosh Chauhan)

[brock] HIVE-5355 - JDBC support for decimal precision/scale


Changes for Build #2443
[brock] HIVE-5351 - Secure-Socket-Layer (SSL) support for HiveServer2 (Prasad 
Mujumdar via Brock Noland)

[hashutosh] HIVE-5583 : Implement support for IN (list-of-constants) filter in 
vectorized mode (Eric Hanson via Ashutosh Chauhan)

[brock] HIVE-5773 - Fix build due to conflict between HIVE-5711 and HIVE-5713

[brock] HIVE-5711 - Fix eclipse:eclipse maven goal (Carl Steinbach via Brock 
Noland)

[brock] HIVE-5752 - log4j properties appear to have been lost in maven upgrade 
(Sergey Shelukhin via Brock Noland)

[brock] HIVE-5713 - Verify versions of libraries post maven merge (Brock Noland 
reviewed by Gunther Hagleitner)

[brock] HIVE-5765 - Beeline throws NPE when -e option is used (Szehon Ho via 
Brock Noland)

[xuefu] HIVE-5726: The DecimalTypeInfo instance associated with a decimal 
constant is not in line with the precision/scale of the constant (reviewed by 
Brock)

[xuefu] HIVE-5655: Hive incorrecly handles divide-by-zero case (reviewed by 
Edward and Brock)

[xuefu] HIVE-5191: Add char data type (Jason via Xuefu)


Changes for Build #2444
[brock] HIVE-5780 - Add the missing declaration of HIVE_CLI_SERVICE_PROTOCOL_V4 
in TCLIService.thrift (Prasad Mujumdar via Brock Noland)


Changes for Build #2445
[gunther] HIVE-5601: NPE in ORC's PPD when using select * from table with where 
predicate (Prasanth J via Owen O'Malley and Gunther Hagleitner)

[gunther] HIVE-5562: Provide stripe level column statistics in ORC (Patch by 
Prasanth J, reviewed by Owen O'Malley, committed by Gunther Hagleitner)

[hashutosh] HIVE-3777 : add a property in the partition to figure out if stats 
are accurate (Ashutosh Chauhan via Thejas Nair)


Changes for Build #2446
[hashutosh] HIVE-5691 : Intermediate columns are incorrectly initialized for 
partitioned tables. (Jitendra Nath Pandey via Gunther Hagleitner)

[hashutosh] HIVE-5779 : Subquery in where clause with distinct fails with 
mapjoin turned on with serialization error. (Ashutosh Chauhan via Harish Butani)

[gunther] HIVE-5632 (partial): Adding test data to data/files to enable 
pre-commit tests to run. (Prasanth J via Gunther Hagleitner)


Changes for Build #2447
[cws] HIVE-5786: Remove HadoopShims methods that were needed for pre-Hadoop 
0.20 (Jason Dere via cws)

[thejas] HIVE-5229 : Better thread management for HiveServer2 async threads 
(Vaibhav Gumashta via Thejas Nair)

[gunther] HIVE-5745: TestHiveLogging is failing (at least on mac) (Gunther 
Hagleitner, reviewed by Ashutosh Chauhan)


Changes for Build #2448
[hashutosh] HIVE-5699 : Add unit test for vectorized BETWEEN for timestamp 
inputs (Eric Hanson via Ashutosh Chauhan)

[hashutosh] HIVE-5767 : in SemanticAnalyzer#doPhase1, handling for TOK_UNION 
falls thru into TOK_INSERT (Sergey Shelukhin via Ashutosh Chauhan)

[hashutosh] HIVE-5657 : TopN produces incorrect results with count(distinct) 
(Sergey Shelukhin via Ashutosh Chauhan)


Changes for Build #2450

[jira] [Updated] (HIVE-5866) Hive divide operator generates wrong results in certain cases

2013-11-22 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-5866:
--

Attachment: HIVE-5866.2.patch

Patch #2 fixed the failed test cases.

 Hive divide operator generates wrong results in certain cases
 -

 Key: HIVE-5866
 URL: https://issues.apache.org/jira/browse/HIVE-5866
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-5866.1.patch, HIVE-5866.2.patch, HIVE-5866.patch


 The current GenericUDFOPDivide seems to have a bug. The following query 
 generates a NULL result.
 {code}
 hive> select 4BD / 25BD  from test limit 1;
 ...
 Total MapReduce CPU Time Spent: 890 msec
 OK
 NULL
 Time taken: 7.901 seconds, Fetched: 1 row(s)
 {code}
 The correct result should be 0.16 in this query.





Review Request 15804: HIVE-5866: Hive divide operator generates wrong results in certain cases

2013-11-22 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15804/
---

Review request for hive.


Bugs: HIVE-5866
https://issues.apache.org/jira/browse/HIVE-5866


Repository: hive-git


Description
---

Fixed the problem. Added a unit test. Corrected the output of a few q tests.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseNumeric.java 
a1015e9 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPDivide.java 
0b902e9 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPMinus.java 
538c07e 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPMod.java 
472e1dd 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPMultiply.java 
2e8d364 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPPlus.java 
35f639e 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFPosMod.java 
6b18303 
  ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFOPDivide.java 
581c1a8 
  ql/src/test/results/clientpositive/decimal_precision.q.out 2ee3578 
  ql/src/test/results/clientpositive/decimal_udf.q.out ed5bc65 
  ql/src/test/results/clientpositive/ql_rewrite_gbtoidx.q.out 83787ee 
  ql/src/test/results/clientpositive/vectorization_5.q.out 54aad90 
  ql/src/test/results/clientpositive/vectorization_short_regress.q.out c9296e1 

Diff: https://reviews.apache.org/r/15804/diff/


Testing
---


Thanks,

Xuefu Zhang



[jira] [Commented] (HIVE-5866) Hive divide operator generates wrong results in certain cases

2013-11-22 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830342#comment-13830342
 ] 

Xuefu Zhang commented on HIVE-5866:
---

An issue regarding UDAFs was identified and JIRA HIVE-5872 is logged.

 Hive divide operator generates wrong results in certain cases
 -

 Key: HIVE-5866
 URL: https://issues.apache.org/jira/browse/HIVE-5866
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-5866.1.patch, HIVE-5866.2.patch, HIVE-5866.patch


 The current GenericUDFOPDivide seems to have a bug. The following query 
 generates a NULL result.
 {code}
 hive> select 4BD / 25BD  from test limit 1;
 ...
 Total MapReduce CPU Time Spent: 890 msec
 OK
 NULL
 Time taken: 7.901 seconds, Fetched: 1 row(s)
 {code}
 The correct result should be 0.16 in this query.





[jira] [Commented] (HIVE-4485) beeline prints null as empty strings

2013-11-22 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830354#comment-13830354
 ] 

Thejas M Nair commented on HIVE-4485:
-

bq. Hive 0.11.0 was released with the current behavior (nulls printed as NULL)
[~cwsteinbach] I want to clarify that this is about beeline, and beeline 
currently prints nulls as empty strings. I agree that switching this would be a 
backward-incompatible change, but I think it is important to distinguish 
between empty strings and null for obvious reasons.

I think this is a general problem: Hive has undesirable defaults in some cases, 
and changing them would break backward compatibility. I think we should give 
users (especially new users) the option of using the more sensible but 
backward-incompatible configuration defaults. I will open a new jira for that.


 beeline prints null as empty strings
 

 Key: HIVE-4485
 URL: https://issues.apache.org/jira/browse/HIVE-4485
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-4485.1.patch, HIVE-4485.2.patch


 beeline is printing nulls as empty strings. 
 This is inconsistent with the hive cli and other databases, which print null 
 as the string NULL.





[jira] [Commented] (HIVE-5810) create a function add_date as exists in mysql

2013-11-22 Thread Anandha L Ranganathan (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-5810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830355#comment-13830355 ]

Anandha L Ranganathan commented on HIVE-5810:
-

looking into that.

 create a function add_date   as exists in mysql 
 

 Key: HIVE-5810
 URL: https://issues.apache.org/jira/browse/HIVE-5810
 Project: Hive
  Issue Type: Improvement
Reporter: Anandha L Ranganathan
Assignee: Anandha L Ranganathan
 Attachments: HIVE-5810.patch

   Original Estimate: 40h
  Remaining Estimate: 40h

 MySQL has ADDDATE(date, INTERVAL expr unit).
 Similarly, in Hive we can have add_date(date, unit, expr), 
 where unit is DAY, Month or Year.
 For example,
 add_date('2013-11-09','DAY',2) will return 2013-11-11.
 add_date('2013-11-09','Month',2) will return 2014-01-09.
 add_date('2013-11-09','Year',2) will return 2015-11-09.
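
The proposed semantics can be sketched in plain Java with java.time. This is only an illustration under assumptions: the class name, the case-insensitive unit matching, and the ISO yyyy-MM-dd input format are not taken from the attached patch.

```java
import java.time.LocalDate;
import java.util.Locale;

// Hypothetical sketch of add_date(date, unit, expr); not the code from HIVE-5810.patch.
final class AddDateSketch {

    // Adds expr units (DAY, MONTH or YEAR, matched case-insensitively) to an ISO date string.
    static String addDate(String date, String unit, int expr) {
        LocalDate d = LocalDate.parse(date); // expects yyyy-MM-dd
        switch (unit.toUpperCase(Locale.ROOT)) {
            case "DAY":
                return d.plusDays(expr).toString();
            case "MONTH":
                return d.plusMonths(expr).toString();
            case "YEAR":
                return d.plusYears(expr).toString();
            default:
                throw new IllegalArgumentException("Unsupported unit: " + unit);
        }
    }

    public static void main(String[] args) {
        System.out.println(addDate("2013-11-09", "DAY", 2));
        System.out.println(addDate("2013-11-09", "Month", 2));
        System.out.println(addDate("2013-11-09", "Year", 2));
    }
}
```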





[jira] [Updated] (HIVE-3815) hive table rename fails if filesystem cache is disabled

2013-11-22 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-3815:


   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Thanks for the review Navis.
Patch committed to trunk.


 hive table rename fails if filesystem cache is disabled
 ---

 Key: HIVE-3815
 URL: https://issues.apache.org/jira/browse/HIVE-3815
 Project: Hive
  Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.13.0

 Attachments: HIVE-3815.1.patch


 If fs.<filesystem>.impl.disable.cache (e.g. fs.hdfs.impl.disable.cache) is set 
 to true, then table rename fails.
 The exception that gets thrown (though not logged!) is 
 {quote}
 Caused by: InvalidOperationException(message:table new location 
 hdfs://host1:8020/apps/hive/warehouse/t2 is on a different file system than 
 the old location hdfs://host1:8020/apps/hive/warehouse/t1. This operation is 
 not supported)
 at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_result$alter_table_resultStandardScheme.read(ThriftHiveMetastore.java:28825)
 at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_result$alter_table_resultStandardScheme.read(ThriftHiveMetastore.java:28811)
 at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_result.read(ThriftHiveMetastore.java:28753)
 at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
 at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_alter_table(ThriftHiveMetastore.java:977)
 at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.alter_table(ThriftHiveMetastore.java:962)
 at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table(HiveMetaStoreClient.java:208)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:74)
 at $Proxy7.alter_table(Unknown Source)
 at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:373)
 ... 18 more
 {quote}
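
One plausible reading of this message, offered as an assumption and not confirmed by anything in this thread, is that the two locations are compared as filesystem objects: with the cache disabled, each lookup yields a distinct instance even for the same hdfs://host1:8020 authority. The sketch below uses a hypothetical FakeFs stand-in (it is not Hadoop's FileSystem class) to contrast a reference comparison with a scheme-and-authority comparison.

```java
import java.net.URI;
import java.util.Objects;

// Hypothetical illustration only; FakeFs is a stand-in, not a Hadoop class.
final class FsCompareSketch {

    static final class FakeFs {
        final URI uri;

        FakeFs(URI uri) {
            this.uri = uri;
        }
    }

    // Simulates a lookup with caching disabled: every call builds a new handle.
    static FakeFs getUncached(String location) {
        URI u = URI.create(location);
        return new FakeFs(URI.create(u.getScheme() + "://" + u.getAuthority()));
    }

    // Reference comparison: false for two uncached handles, even on the same cluster.
    static boolean sameByReference(FakeFs a, FakeFs b) {
        return a == b;
    }

    // Comparison by scheme and authority: true when both point at the same cluster.
    static boolean sameByUri(FakeFs a, FakeFs b) {
        return Objects.equals(a.uri.getScheme(), b.uri.getScheme())
                && Objects.equals(a.uri.getAuthority(), b.uri.getAuthority());
    }

    public static void main(String[] args) {
        FakeFs oldFs = getUncached("hdfs://host1:8020/apps/hive/warehouse/t1");
        FakeFs newFs = getUncached("hdfs://host1:8020/apps/hive/warehouse/t2");
        System.out.println("by reference: " + sameByReference(oldFs, newFs));
        System.out.println("by URI: " + sameByUri(oldFs, newFs));
    }
}
```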





[jira] [Created] (HIVE-5874) SubQuery: better error handling when SQ and Outer Query has the same table alias

2013-11-22 Thread Harish Butani (JIRA)
Harish Butani created HIVE-5874:
---

 Summary: SubQuery: better error handling when SQ and Outer Query 
has the same table alias
 Key: HIVE-5874
 URL: https://issues.apache.org/jira/browse/HIVE-5874
 Project: Hive
  Issue Type: Bug
Reporter: Harish Butani
Priority: Minor


The following query
{noformat}
select * from src where key in (select key from src where src.key  '1')
{noformat}

Gives the following message:
{noformat}
SemanticException [Error 10249]: Line 1:58 Unsupported SubQuery Expression 
''1'': SubQuery expression refers to Outer query expressions only.
{noformat}

However, the user is attempting to express an uncorrelated SubQuery.
The ambiguity arises because we attempt to resolve references against the Outer 
Query first. This is an implementation detail; see the Sub Query spec for 
details. For now it is better to disallow such SubQueries.





[jira] [Updated] (HIVE-5839) BytesRefArrayWritable compareTo violates contract

2013-11-22 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-5839:
--

Attachment: HIVE-5839.patch

Reloaded the same patch to rerun the test.

 BytesRefArrayWritable compareTo violates contract
 -

 Key: HIVE-5839
 URL: https://issues.apache.org/jira/browse/HIVE-5839
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.11.0, 0.12.0
Reporter: Ian Robertson
Assignee: Xuefu Zhang
 Attachments: HIVE-5839.patch, HIVE-5839.patch


 BytesRefArrayWritable's compareTo violates the compareTo contract from 
 java.lang.Comparable. Specifically:
 * The implementor must ensure sgn(x.compareTo( y )) == -sgn(y.compareTo( x )) 
 for all x and y.
 The compareTo implementation on BytesRefArrayWritable does a proper 
 comparison of the sizes of the two instances. However, if the sizes are the 
 same, it proceeds to check whether both arrays have the same content. If 
 not, it returns 1. This means that if x and y are two BytesRefArrayWritable 
 instances with the same size but different contents, then x.compareTo( y ) 
 == 1 and y.compareTo( x ) == 1.
 Additionally, the comparison of contents is order agnostic. This seems wrong, 
 since the order of entries should matter. It is also very inefficient, running 
 in O(n^2) time, where n is the number of entries.
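
A comparison that respects the sign-symmetry rule, is order sensitive, and runs in a single pass can be sketched as follows. This is a minimal illustration over plain byte[][] rows, not the attached patch; the class name and the unsigned byte ordering are assumptions.

```java
// Hypothetical sketch; compares rows of byte arrays, not BytesRefArrayWritable itself.
final class ByteRowsCompareSketch {

    // Compares by number of entries first, then entry by entry in order.
    static int compare(byte[][] x, byte[][] y) {
        int bySize = Integer.compare(x.length, y.length);
        if (bySize != 0) {
            return bySize;
        }
        for (int i = 0; i < x.length; i++) {
            int c = compareBytes(x[i], y[i]);
            if (c != 0) {
                return c;
            }
        }
        return 0;
    }

    // Lexicographic comparison of two byte arrays, treating bytes as unsigned.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int c = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        byte[][] x = {{1}, {2}};
        byte[][] y = {{2}, {1}};
        // The two results have opposite signs, as the contract requires.
        System.out.println(compare(x, y) + " " + compare(y, x));
    }
}
```

Because every step returns the negation of what the swapped call would return, sgn(compare(x, y)) == -sgn(compare(y, x)) holds for all inputs.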





[jira] [Commented] (HIVE-5866) Hive divide operator generates wrong results in certain cases

2013-11-22 Thread Hive QA (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-5866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830376#comment-13830376 ]

Hive QA commented on HIVE-5866:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12615390/HIVE-5866.2.patch

{color:green}SUCCESS:{color} +1 4681 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/401/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/401/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12615390



[jira] [Commented] (HIVE-5866) Hive divide operator generates wrong results in certain cases

2013-11-22 Thread Jason Dere (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-5866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830383#comment-13830383 ]

Jason Dere commented on HIVE-5866:
--

What exactly was causing the null result in this case? I'll try to take a look 
at the patch in a bit.



[jira] [Created] (HIVE-5875) task : collect list of hive configuration params whose default should change

2013-11-22 Thread Thejas M Nair (JIRA)
Thejas M Nair created HIVE-5875:
---

 Summary: task : collect list of hive configuration params whose 
default should change
 Key: HIVE-5875
 URL: https://issues.apache.org/jira/browse/HIVE-5875
 Project: Hive
  Issue Type: Task
Reporter: Thejas M Nair
Assignee: Thejas M Nair


The immediate motivation for this was the ticket HIVE-4485. Beeline prints 
NULLs as empty strings. This is not desirable behavior, but if we fix it, it 
breaks backward compatibility. 
We should not be burdening all users with mistakes of the past, especially 
users who are new to Hive. As Hadoop and Hive adoption increases, the 
proportion of 'new' users will continue to increase.

We need a way to let users choose between backward-compatible behavior and more 
sensible behavior. How this is implemented can be discussed in a separate 
jira. 

The purpose of this *Task* jira is just to collect a list of config flags whose 
current default is not the desirable one.






[jira] [Commented] (HIVE-5875) task : collect list of hive configuration params whose default should change

2013-11-22 Thread Edward Capriolo (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830393#comment-13830393 ]

Edward Capriolo commented on HIVE-5875:
---

hive.mapred.mode=strict
hive.cli.print.header=true
autocreate.schema=false




[jira] [Commented] (HIVE-5875) task : collect list of hive configuration params whose default should change

2013-11-22 Thread Thejas M Nair (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830405#comment-13830405 ]

Thejas M Nair commented on HIVE-5875:
-

hive.enforce.bucketing = true. (When somebody creates a table with bucketing, 
I don't see any reason why they won't want bucketing to be on by default.)




[jira] [Commented] (HIVE-5839) BytesRefArrayWritable compareTo violates contract

2013-11-22 Thread Hive QA (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830416#comment-13830416 ]

Hive QA commented on HIVE-5839:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12615394/HIVE-5839.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 4652 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.io.TestRCFile.testWriteAndPartialRead
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/402/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/402/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests failed with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12615394



[jira] [Commented] (HIVE-5817) column name to index mapping in VectorizationContext is broken

2013-11-22 Thread Ashutosh Chauhan (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830430#comment-13830430 ]

Ashutosh Chauhan commented on HIVE-5817:


[~ehans] I tried your query. With it I can repro this on a 1-node cluster, 
but when I use this query in a .q file and run mvn test, the test actually 
passes. Any idea why?

 column name to index mapping in VectorizationContext is broken
 --

 Key: HIVE-5817
 URL: https://issues.apache.org/jira/browse/HIVE-5817
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Reporter: Sergey Shelukhin
Assignee: Remus Rusanu
Priority: Critical
 Attachments: HIVE-5817-uniquecols.broken.patch, 
 HIVE-5817.00-broken.patch


 Columns coming from different operators may have the same internal names 
 (_colNN). There exists a query in the form {{select b.cb, a.ca from a JOIN 
 b ON ... JOIN x ON ...;}}  (distilled from a more complex query), which runs 
 ok w/o vectorization. With vectorization, it will run ok for most ca, but for 
 some ca it will fail (or can probably return incorrect results). That is 
 because when building column-to-VRG-index map in VectorizationContext, 
 internal column name for ca that the first map join operator adds to the 
 mapping may be the same as internal name for cb that the 2nd one tries to 
 add. 2nd VMJ doesn't add it (see code in ctor), and when it's time for it to 
 output stuff, it retrieves wrong index from the map by name, and then wrong 
 vector from VRG.
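
The collision described above can be sketched with an ordinary map. The names and index values here are hypothetical and the add-only-if-absent helper is a simplification of the behavior the report attributes to the second operator; this is not the VectorizationContext code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a name-to-index map shared by two operators.
final class ColumnMapCollisionSketch {

    // Adds the mapping only when the name is not already present.
    static void addIfAbsent(Map<String, Integer> columnMap, String internalName, int index) {
        if (!columnMap.containsKey(internalName)) {
            columnMap.put(internalName, index);
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> columnMap = new HashMap<>();
        addIfAbsent(columnMap, "_col0", 3); // first map join registers ca
        addIfAbsent(columnMap, "_col0", 7); // second map join tries to register cb, silently skipped
        // A lookup on behalf of cb now resolves to the index that belongs to ca.
        System.out.println(columnMap.get("_col0"));
    }
}
```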





[jira] [Commented] (HIVE-5833) Remove versions from child module dependencies

2013-11-22 Thread Hive QA (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830442#comment-13830442 ]

Hive QA commented on HIVE-5833:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12614944/HIVE-5833.2.patch

{color:green}SUCCESS:{color} +1 4680 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/403/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/403/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12614944

 Remove versions from child module dependencies
 --

 Key: HIVE-5833
 URL: https://issues.apache.org/jira/browse/HIVE-5833
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland
 Attachments: HIVE-5833.2.patch, HIVE-5833.patch


 HIVE-5741 moved all dependencies to the plugin management section of the 
 parent pom therefore we can remove 
 {noformat}<version>${dep.version}</version>{noformat} from all dependencies 
 in child modules.





Re: Review Request 15718: HIVE-5614: Subquery support: allow subquery expressions in having clause

2013-11-22 Thread Harish Butani

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15718/
---

(Updated Nov. 23, 2013, 12:14 a.m.)


Review request for hive and Ashutosh Chauhan.


Changes
---

Thanks for the feedback.
I have attempted to address all of it.


Bugs: HIVE-5614
https://issues.apache.org/jira/browse/HIVE-5614


Repository: hive-git


Description
---

support for subquery predicates in having clause. SubTask of HIVE-784


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/QB.java fa111cc 
  ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java 3e8215d 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 7979873 
  ql/src/test/queries/clientpositive/subquery_exists_having.q PRE-CREATION 
  ql/src/test/queries/clientpositive/subquery_in_having.q PRE-CREATION 
  ql/src/test/queries/clientpositive/subquery_notexists_having.q PRE-CREATION 
  ql/src/test/queries/clientpositive/subquery_notin_having.q PRE-CREATION 
  ql/src/test/results/clientpositive/subquery_exists_having.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/subquery_in_having.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/subquery_multiinsert.q.out 8dfb485 
  ql/src/test/results/clientpositive/subquery_notexists_having.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/subquery_notin_having.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/15718/diff/


Testing
---

added new tests: subquery_in_having.q, subquery_notin_having.q, 
subquery_exists_having.q, subquery_notexists_having.q


Thanks,

Harish Butani



[jira] [Updated] (HIVE-5614) Subquery support: allow subquery expressions in having clause

2013-11-22 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-5614:


Attachment: HIVE-5614.4.patch

 Subquery support: allow subquery expressions in having clause
 -

 Key: HIVE-5614
 URL: https://issues.apache.org/jira/browse/HIVE-5614
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Harish Butani
Assignee: Harish Butani
 Attachments: HIVE-5614.1.patch, HIVE-5614.2.patch, HIVE-5614.3.patch, 
 HIVE-5614.4.patch








Re: Review Request 15718: HIVE-5614: Subquery support: allow subquery expressions in having clause

2013-11-22 Thread Harish Butani


 On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
  ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java, line 131
  https://reviews.apache.org/r/15718/diff/1/?file=388900#file388900line131
 
  It will be good to add a comment about various fields in Conjunct class.

done


 On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
  ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java, line 138
  https://reviews.apache.org/r/15718/diff/1/?file=388900#file388900line138
 
  Can this constructor be package protected instead of public ?

done


 On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
  ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java, line 203
  https://reviews.apache.org/r/15718/diff/1/?file=388900#file388900line203
 
  It will be good to add a comment, how behavior of ConjunctAnalyzer 
  changes when forHavingClause = true instead of false.

done


 On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
  ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java, line 298
  https://reviews.apache.org/r/15718/diff/1/?file=388900#file388900line298
 
  Should this exception be propagated up the stack? At the 
  least, we should have a LOG.warn() message here.

This is not an error. It is only a check of whether the expression is in the 
OuterQuery RR.


 On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java, line 1916
  https://reviews.apache.org/r/15718/diff/1/?file=388901#file388901line1916
 
  It will be good to add a comment here along the lines of
   there could be a subq in having clause, if so we need to generate subq 
  plan followed by semi-join.

done


 On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java, line 1926
  https://reviews.apache.org/r/15718/diff/1/?file=388901#file388901line1926
 
  It will be good to add a comment how this new boolean changes behavior 
  of this method.

done


 On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
  ql/src/test/queries/clientpositive/subquery_in_having.q, line 25
  https://reviews.apache.org/r/15718/diff/1/?file=388903#file388903line25
 
  It will be good to add a test which has a subq in both where clause as 
  well as having clause

done


 On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
  ql/src/test/queries/clientpositive/subquery_in_having.q, line 63
  https://reviews.apache.org/r/15718/diff/1/?file=388903#file388903line63
 
  Same comment w.r.t map-join on. Also, if we support over clause in 
  subq, it will be good to have a test for that.

done


 On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
  ql/src/test/queries/clientpositive/subquery_notexists_having.q, line 46
  https://reviews.apache.org/r/15718/diff/1/?file=388904#file388904line46
 
  It will be good to add a negative test where the subq and the outer query 
  both use the same table alias. It seems in such cases we may generate 
  incorrect results, so we should disable those.

This is not just for having. Added a jira, HIVE-5874, to give a better error for 
this case.


 On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
  ql/src/test/results/clientpositive/subquery_in_having.q.out, line 57
  https://reviews.apache.org/r/15718/diff/1/?file=388907#file388907line57
 
  In this plan, we first compute the outq, then the subq, and then do a 
  left semi-join on the result sets of those two. As we discussed, an efficient 
  way to do this is to push filter conditions in the subq to the outer query to 
  cut down the output generated by the outq. Though, I am not sure whether it's 
  better to do it in the optimizer phase via a Transformer or right here. Either 
  way, I think that's an optimization which we can do as a follow-up.

agreed


 On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
  ql/src/test/results/clientpositive/subquery_notexists_having.q.out, line 149
  https://reviews.apache.org/r/15718/diff/1/?file=388909#file388909line149
 
  The first expression in this filter is redundant. That's not strictly 
  required. However, since there is active work going on for a constant 
  folding optimization, this may get optimized away via that optimization. 
  Either way, this can be done in a follow-up.

(1=1) is added as a placeholder. Yes, it should be removed.


- Harish


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15718/#review29298
---



[jira] [Updated] (HIVE-5614) Subquery support: allow subquery expressions in having clause

2013-11-22 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-5614:


Status: Patch Available  (was: Open)



[jira] [Commented] (HIVE-5614) Subquery support: allow subquery expressions in having clause

2013-11-22 Thread Harish Butani (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-5614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830446#comment-13830446 ]

Harish Butani commented on HIVE-5614:
-

Updated based on feedback from [~ashutoshc].



[jira] [Created] (HIVE-5876) Split elimination in ORC breaks for partitioned tables

2013-11-22 Thread Prasanth J (JIRA)
Prasanth J created HIVE-5876:


 Summary: Split elimination in ORC breaks for partitioned tables
 Key: HIVE-5876
 URL: https://issues.apache.org/jira/browse/HIVE-5876
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Prasanth J
Assignee: Prasanth J


HIVE-5632 eliminates ORC stripes that do not satisfy the SARG condition from 
split computation. A SARG expression can also refer to partition columns, but a 
partition column is not contained in the column names list in the ORC file. 
This was causing an ArrayIndexOutOfBoundsException in the split elimination 
logic when used with partitioned tables. The fix is to skip evaluation of 
partition column expressions in split elimination.
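
The failure mode and the described fix can be sketched as a guarded lookup. All names here are hypothetical and this is not the ORC reader code: the point is only that a predicate column missing from the file's column list yields index -1, which must be skipped instead of being used to index per-column statistics.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of mapping predicate columns to file column indexes.
final class SargColumnLookupSketch {

    // Returns, for each predicate column, its index in the file's column list,
    // or -1 when the column is absent (as a partition column would be).
    static int[] mapColumns(List<String> fileColumns, List<String> predicateColumns) {
        int[] indexes = new int[predicateColumns.size()];
        for (int i = 0; i < indexes.length; i++) {
            indexes[i] = fileColumns.indexOf(predicateColumns.get(i));
        }
        return indexes;
    }

    // A predicate leaf can be checked against file statistics only if its column exists in the file.
    static boolean canEvaluate(int columnIndex) {
        return columnIndex >= 0;
    }

    public static void main(String[] args) {
        List<String> fileColumns = Arrays.asList("id", "name");
        List<String> predicateColumns = Arrays.asList("name", "ds"); // "ds" stands for a partition column
        for (int idx : mapColumns(fileColumns, predicateColumns)) {
            System.out.println(idx + " evaluable=" + canEvaluate(idx));
        }
    }
}
```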





[jira] [Created] (HIVE-5877) Implement vectorized support for IN as boolean-valued expression

2013-11-22 Thread Eric Hanson (JIRA)
Eric Hanson created HIVE-5877:
-

 Summary: Implement vectorized support for IN as boolean-valued 
expression
 Key: HIVE-5877
 URL: https://issues.apache.org/jira/browse/HIVE-5877
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Reporter: Eric Hanson


Implement support for IN as a Boolean-valued expression, e.g.

select col1 IN (1, 2, 3) from T;

or 

select col1
from T
where NOT (col1 IN (1, 2, 3));

This will also automatically add support for NOT IN because NOT IN is 
automatically transformed into NOT ( ... IN ... ) by the parser.
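
A minimal sketch of the idea, under stated assumptions: IN is evaluated over a whole column to produce a boolean vector, and NOT is applied to that vector afterwards. The class and method names are hypothetical, plain arrays stand in for column vectors, and NULL handling is deliberately ignored.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; not the VectorExpression classes from Hive.
final class VectorInSketch {

    // Evaluates "col IN (inList)" for every row and returns one boolean per row.
    static boolean[] in(long[] col, long... inList) {
        Set<Long> values = new HashSet<>();
        for (long v : inList) {
            values.add(v);
        }
        boolean[] out = new boolean[col.length];
        for (int i = 0; i < col.length; i++) {
            out[i] = values.contains(col[i]);
        }
        return out;
    }

    // Negates a boolean vector, which is how NOT (col IN (...)) would be layered on top.
    static boolean[] not(boolean[] input) {
        boolean[] out = new boolean[input.length];
        for (int i = 0; i < input.length; i++) {
            out[i] = !input[i];
        }
        return out;
    }

    public static void main(String[] args) {
        long[] col1 = {1, 5, 3};
        System.out.println(Arrays.toString(in(col1, 1, 2, 3)));
        System.out.println(Arrays.toString(not(in(col1, 1, 2, 3))));
    }
}
```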

 






[jira] [Commented] (HIVE-5758) Implement vectorized support for NOT IN filter

2013-11-22 Thread Eric Hanson (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-5758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830452#comment-13830452 ]

Eric Hanson commented on HIVE-5758:
---

It turns out that the parser transforms

col NOT IN (list)

to

NOT (col IN (list))

So when support for IN as a Boolean expression is added, this should just 
work.

 Implement vectorized support for NOT IN filter
 --

 Key: HIVE-5758
 URL: https://issues.apache.org/jira/browse/HIVE-5758
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Eric Hanson

 Implement full, end-to-end support for NOT IN in vectorized mode, including 
 new VectorExpression class(es), VectorizationContext translation to a 
 VectorExpression, and unit tests for these, as well as end-to-end ad hoc 
 testing. An end-to-end .q test is recommended.





[jira] [Updated] (HIVE-5876) Split elimination in ORC breaks for partitioned tables

2013-11-22 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-5876:
-

Attachment: HIVE-5876.1.patch

 Split elimination in ORC breaks for partitioned tables
 --

 Key: HIVE-5876
 URL: https://issues.apache.org/jira/browse/HIVE-5876
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Prasanth J
Assignee: Prasanth J
  Labels: orcfile
 Attachments: HIVE-5876.1.patch


 HIVE-5632 eliminates ORC stripes from split computation that do not satisfy 
 the SARG condition. A SARG expression can also refer to partition columns, but 
 a partition column is not contained in the column names list in the ORC file. 
 This was causing an ArrayIndexOutOfBoundsException in the split elimination 
 logic when used with partitioned tables. The fix is to ignore evaluation of 
 partition column expressions in split elimination.
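
The idea behind the fix described above can be sketched roughly as follows. This is not the code in HIVE-5876.1.patch; the class and method names are hypothetical. The sketch assumes the predicate's column names are mapped to positions in the file's column list, and that a column missing from the file (such as a partition column) is marked with -1 and skipped rather than used as an array index.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: names are illustrative, not Hive's or ORC's API.
class SplitEliminationSketch {

    // Maps each column named by the predicate to its index in the file's
    // column list. Columns missing from the file (e.g. partition columns) get -1.
    static int[] mapToFileColumns(List<String> predicateColumns, List<String> fileColumns) {
        int[] indexes = new int[predicateColumns.size()];
        for (int i = 0; i < indexes.length; i++) {
            indexes[i] = fileColumns.indexOf(predicateColumns.get(i));
        }
        return indexes;
    }

    // Counts the predicate columns that can be checked against file statistics.
    static int countEvaluable(int[] indexes) {
        int n = 0;
        for (int idx : indexes) {
            if (idx >= 0) {
                n++; // -1 means "not in the file": skip it instead of indexing with it
            }
        }
        return n;
    }

    public static void main(String[] args) {
        List<String> fileColumns = Arrays.asList("id", "name");
        List<String> predicateColumns = Arrays.asList("id", "ds"); // "ds" plays the partition column
        int[] indexes = mapToFileColumns(predicateColumns, fileColumns);
        System.out.println(Arrays.toString(indexes)); // prints [0, -1]
        System.out.println(countEvaluable(indexes));  // prints 1
    }
}
```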





[jira] [Commented] (HIVE-5817) column name to index mapping in VectorizationContext is broken

2013-11-22 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830459#comment-13830459
 ] 

Ashutosh Chauhan commented on HIVE-5817:


Aah.. I missed one config which is required to repro in .q file {{ set 
hive.auto.convert.join=true; }} Thanks [~ehans] for test-case.

 column name to index mapping in VectorizationContext is broken
 --

 Key: HIVE-5817
 URL: https://issues.apache.org/jira/browse/HIVE-5817
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Reporter: Sergey Shelukhin
Assignee: Remus Rusanu
Priority: Critical
 Attachments: HIVE-5817-uniquecols.broken.patch, 
 HIVE-5817.00-broken.patch


 Columns coming from different operators may have the same internal names 
 (_colNN). There exists a query in the form {{select b.cb, a.ca from a JOIN 
 b ON ... JOIN x ON ...;}}  (distilled from a more complex query), which runs 
 ok w/o vectorization. With vectorization, it will run ok for most ca, but for 
 some ca it will fail (or can probably return incorrect results). That is 
 because when building column-to-VRG-index map in VectorizationContext, 
 internal column name for ca that the first map join operator adds to the 
 mapping may be the same as internal name for cb that the 2nd one tries to 
 add. 2nd VMJ doesn't add it (see code in ctor), and when it's time for it to 
 output stuff, it retrieves wrong index from the map by name, and then wrong 
 vector from VRG.
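
The failure mode described above can be reproduced in miniature with a plain map keyed by internal column name. This is only an illustration, not VectorizationContext itself: the class name, the indexes 3 and 7, and the "add only if absent" rule are assumptions taken from the prose.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a name-to-index map that silently keeps the first entry.
class ColumnMapCollisionSketch {
    private final Map<String, Integer> columnIndex = new HashMap<>();

    // Mirrors the behavior described above: an existing name is not overwritten.
    void addIfAbsent(String internalName, int index) {
        if (!columnIndex.containsKey(internalName)) {
            columnIndex.put(internalName, index);
        }
    }

    int lookup(String internalName) {
        return columnIndex.get(internalName);
    }

    public static void main(String[] args) {
        ColumnMapCollisionSketch ctx = new ColumnMapCollisionSketch();
        ctx.addIfAbsent("_col0", 3); // first map join registers a.ca under _col0
        ctx.addIfAbsent("_col0", 7); // second map join wants b.cb under _col0, but it is ignored
        // The second operator now reads index 3, although its column lives at 7.
        System.out.println(ctx.lookup("_col0")); // prints 3
    }
}
```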





[jira] [Commented] (HIVE-5817) column name to index mapping in VectorizationContext is broken

2013-11-22 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830465#comment-13830465
 ] 

Ashutosh Chauhan commented on HIVE-5817:


[~rusanu] Another approach I am mulling over is to prepend the table alias to 
the column name in that map. That way the keys in the map will be unique across 
different tables, so they won't collide. The catch is that all the callers 
then also need to prepend the table alias, but since all of them have an 
ExprNodeDesc, which has the table alias, this should work out fine. What do you 
think?
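
A minimal sketch of the alias-qualified key idea, assuming a "." separator and hypothetical class and method names (none of this is taken from a patch on this issue):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: the map key is "<tableAlias>.<internalName>" instead of the bare name.
class AliasQualifiedColumnMapSketch {
    private final Map<String, Integer> columnIndex = new HashMap<>();

    // Builds the qualified key. Every caller must use the same rule.
    static String key(String tableAlias, String internalName) {
        return tableAlias + "." + internalName;
    }

    void add(String tableAlias, String internalName, int index) {
        columnIndex.put(key(tableAlias, internalName), index);
    }

    int lookup(String tableAlias, String internalName) {
        return columnIndex.get(key(tableAlias, internalName));
    }

    public static void main(String[] args) {
        AliasQualifiedColumnMapSketch ctx = new AliasQualifiedColumnMapSketch();
        ctx.add("a", "_col0", 3); // a.ca
        ctx.add("b", "_col0", 7); // b.cb shares the internal name but not the key
        System.out.println(ctx.lookup("a", "_col0")); // prints 3
        System.out.println(ctx.lookup("b", "_col0")); // prints 7
    }
}
```

The cost is the one noted in the comment: lookups only work if every caller qualifies the name the same way.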

 column name to index mapping in VectorizationContext is broken
 --

 Key: HIVE-5817
 URL: https://issues.apache.org/jira/browse/HIVE-5817
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Reporter: Sergey Shelukhin
Assignee: Remus Rusanu
Priority: Critical
 Attachments: HIVE-5817-uniquecols.broken.patch, 
 HIVE-5817.00-broken.patch


 Columns coming from different operators may have the same internal names 
 (_colNN). There exists a query in the form {{select b.cb, a.ca from a JOIN 
 b ON ... JOIN x ON ...;}}  (distilled from a more complex query), which runs 
 ok w/o vectorization. With vectorization, it will run ok for most ca, but for 
 some ca it will fail (or can probably return incorrect results). That is 
 because when building column-to-VRG-index map in VectorizationContext, 
 internal column name for ca that the first map join operator adds to the 
 mapping may be the same as internal name for cb that the 2nd one tries to 
 add. 2nd VMJ doesn't add it (see code in ctor), and when it's time for it to 
 output stuff, it retrieves wrong index from the map by name, and then wrong 
 vector from VRG.





[jira] [Commented] (HIVE-5614) Subquery support: allow subquery expressions in having clause

2013-11-22 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830471#comment-13830471
 ] 

Ashutosh Chauhan commented on HIVE-5614:


+1

 Subquery support: allow subquery expressions in having clause
 -

 Key: HIVE-5614
 URL: https://issues.apache.org/jira/browse/HIVE-5614
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Harish Butani
Assignee: Harish Butani
 Attachments: HIVE-5614.1.patch, HIVE-5614.2.patch, HIVE-5614.3.patch, 
 HIVE-5614.4.patch








[jira] [Updated] (HIVE-5849) Improve the stats of operators based on heuristics in the absence of any column statistics

2013-11-22 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-5849:
-

Attachment: (was: HIVE-5849.7.patch)

 Improve the stats of operators based on heuristics in the absence of any 
 column statistics
 --

 Key: HIVE-5849
 URL: https://issues.apache.org/jira/browse/HIVE-5849
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor, Statistics
Reporter: Prasanth J
Assignee: Prasanth J
 Fix For: 0.13.0

 Attachments: HIVE-5849.1.patch.txt, HIVE-5849.2.patch.txt, 
 HIVE-5849.3.patch, HIVE-5849.3.patch.txt, HIVE-5849.4.javaonly.patch, 
 HIVE-5849.5.patch, HIVE-5849.6.patch


 In the absence of any column statistics, operators will simply use the 
 statistics from their parents. It is useful to apply some heuristics to update 
 basic statistics (number of rows and data size) in the absence of any column 
 statistics. This will be the worst-case scenario.





[jira] [Updated] (HIVE-5849) Improve the stats of operators based on heuristics in the absence of any column statistics

2013-11-22 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-5849:
-

Attachment: HIVE-5849.7.patch

Reuploading the patch for the precommit test to pick up.

 Improve the stats of operators based on heuristics in the absence of any 
 column statistics
 --

 Key: HIVE-5849
 URL: https://issues.apache.org/jira/browse/HIVE-5849
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor, Statistics
Reporter: Prasanth J
Assignee: Prasanth J
 Fix For: 0.13.0

 Attachments: HIVE-5849.1.patch.txt, HIVE-5849.2.patch.txt, 
 HIVE-5849.3.patch, HIVE-5849.3.patch.txt, HIVE-5849.4.javaonly.patch, 
 HIVE-5849.5.patch, HIVE-5849.6.patch, HIVE-5849.7.patch


 In the absence of any column statistics, operators will simply use the 
 statistics from their parents. It is useful to apply some heuristics to update 
 basic statistics (number of rows and data size) in the absence of any column 
 statistics. This will be the worst-case scenario.





Review Request 15816: HIVE-3181 getDatabaseMajor/Minor version does not return values

2013-11-22 Thread Szehon Ho

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15816/
---

Review request for hive.


Bugs: HIVE-3181
https://issues.apache.org/jira/browse/HIVE-3181


Repository: hive-git


Description
---

This will parse the database version to determine the major and minor versions.
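
As a rough illustration of what parsing the version string could look like (this is not the code under review; the class name, the helper, and the example version strings are assumptions), the leading digits of the first two dot-separated parts are taken as major and minor:

```java
// Hypothetical sketch: parse "major.minor[...]" out of a product version string.
class VersionParseSketch {

    // Returns {major, minor}. Throws IllegalArgumentException on unexpected input.
    static int[] parseMajorMinor(String version) {
        String[] parts = version.split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Unexpected version: " + version);
        }
        return new int[] { leadingInt(parts[0]), leadingInt(parts[1]) };
    }

    // Parses the run of digits at the start of s, ignoring suffixes such as "-SNAPSHOT".
    static int leadingInt(String s) {
        int end = 0;
        while (end < s.length() && Character.isDigit(s.charAt(end))) {
            end++;
        }
        if (end == 0) {
            throw new IllegalArgumentException("No leading digits in: " + s);
        }
        return Integer.parseInt(s.substring(0, end));
    }

    public static void main(String[] args) {
        int[] v = parseMajorMinor("0.13.0-SNAPSHOT");
        System.out.println(v[0] + " " + v[1]); // prints 0 13
    }
}
```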


Diffs
-

  itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcDriver2.java 
7b1c9da 
  jdbc/src/java/org/apache/hive/jdbc/HiveDatabaseMetaData.java c447d44 
  jdbc/src/java/org/apache/hive/jdbc/Utils.java 4d75d98 

Diff: https://reviews.apache.org/r/15816/diff/


Testing
---


Thanks,

Szehon Ho



[jira] [Commented] (HIVE-4518) Counter Strike: Operation Operator

2013-11-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830497#comment-13830497
 ] 

Hive QA commented on HIVE-4518:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12615372/HIVE-4518.11.patch

{color:green}SUCCESS:{color} +1 4679 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/404/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/404/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12615372

 Counter Strike: Operation Operator
 --

 Key: HIVE-4518
 URL: https://issues.apache.org/jira/browse/HIVE-4518
 Project: Hive
  Issue Type: Improvement
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-4518.1.patch, HIVE-4518.10.patch, 
 HIVE-4518.11.patch, HIVE-4518.2.patch, HIVE-4518.3.patch, HIVE-4518.4.patch, 
 HIVE-4518.5.patch, HIVE-4518.6.patch.txt, HIVE-4518.7.patch, 
 HIVE-4518.8.patch, HIVE-4518.9.patch


 Queries of the form:
 from foo
 insert overwrite table bar partition (p) select ...
 insert overwrite table bar partition (p) select ...
 insert overwrite table bar partition (p) select ...
 generate a huge number of counters. The reason is that task.progress is 
 turned on for dynamic partitioning queries.
 The counters not only make queries slower than necessary (up to 50%), but you 
 will also eventually run out of them. That's because we're wrapping them in 
 enum values to comply with hadoop 0.17.
 The real reason we turn task.progress on is that we need CREATED_FILES and 
 FATAL counters to ensure dynamic partitioning queries don't go haywire.
 The counters have counter-intuitive names like C1 through C1000 and don't 
 seem really useful by themselves.
 With hadoop 20+ you don't need to wrap the counters anymore, each operator 
 can simply create and increment counters. That should simplify the code a lot.





[jira] [Updated] (HIVE-3181) getDatabaseMajor/Minor version does not return values

2013-11-22 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-3181:


Status: Open  (was: Patch Available)

 getDatabaseMajor/Minor version does not return values
 -

 Key: HIVE-3181
 URL: https://issues.apache.org/jira/browse/HIVE-3181
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Affects Versions: 0.8.1
Reporter: N Campbell
Assignee: Szehon Ho
 Attachments: HIVE-3181.patch


 This is really a sub-issue of HIVE-3174 (which is a lot of properties) but 
 given that the driver will return databaseProductVersion it makes no sense to 
 not have implemented these as well.





Re: Review Request 15816: HIVE-3181 getDatabaseMajor/Minor version does not return values

2013-11-22 Thread Szehon Ho

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15816/
---

(Updated Nov. 23, 2013, 1:40 a.m.)


Review request for hive.


Changes
---

Made a small optimization to lazily cache the db version number.  This will 
decrease RPC calls to server, if more than one getXXVersion() method is called.
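
The lazy caching described above might look roughly like this. It is a sketch under assumptions, not the patch itself: the Supplier stands in for the RPC to the server, and the class and method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical sketch: fetch the version string once, on first use, then reuse it.
class LazyVersionCacheSketch {
    private final Supplier<String> fetchVersion; // stands in for the remote call
    private String cachedVersion;                // null until the first request

    LazyVersionCacheSketch(Supplier<String> fetchVersion) {
        this.fetchVersion = fetchVersion;
    }

    private synchronized String version() {
        if (cachedVersion == null) {
            cachedVersion = fetchVersion.get(); // only the first caller pays for the fetch
        }
        return cachedVersion;
    }

    int getMajorVersion() {
        return Integer.parseInt(version().split("\\.")[0]);
    }

    int getMinorVersion() {
        return Integer.parseInt(version().split("\\.")[1]);
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        LazyVersionCacheSketch meta = new LazyVersionCacheSketch(() -> {
            calls.incrementAndGet();
            return "0.13.0";
        });
        System.out.println(meta.getMajorVersion() + "." + meta.getMinorVersion()); // prints 0.13
        System.out.println(calls.get()); // prints 1
    }
}
```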


Bugs: HIVE-3181
https://issues.apache.org/jira/browse/HIVE-3181


Repository: hive-git


Description
---

This will parse the database version to determine the major and minor versions.


Diffs (updated)
-

  itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcDriver2.java 
7b1c9da 
  jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java ef39573 
  jdbc/src/java/org/apache/hive/jdbc/HiveDatabaseMetaData.java c447d44 
  jdbc/src/java/org/apache/hive/jdbc/Utils.java 4d75d98 

Diff: https://reviews.apache.org/r/15816/diff/


Testing
---


Thanks,

Szehon Ho



[jira] [Updated] (HIVE-3181) getDatabaseMajor/Minor version does not return values

2013-11-22 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-3181:


Attachment: HIVE-3181.2.patch

Adding a small optimization to this logic to reduce potential number of RPC 
calls.

 getDatabaseMajor/Minor version does not return values
 -

 Key: HIVE-3181
 URL: https://issues.apache.org/jira/browse/HIVE-3181
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Affects Versions: 0.8.1
Reporter: N Campbell
Assignee: Szehon Ho
 Attachments: HIVE-3181.2.patch, HIVE-3181.patch


 This is really a sub-issue of HIVE-3174 (which is a lot of properties) but 
 given that the driver will return databaseProductVersion it makes no sense to 
 not have implemented these as well.





[jira] [Updated] (HIVE-3181) getDatabaseMajor/Minor version does not return values

2013-11-22 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-3181:


Status: Patch Available  (was: Open)

 getDatabaseMajor/Minor version does not return values
 -

 Key: HIVE-3181
 URL: https://issues.apache.org/jira/browse/HIVE-3181
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Affects Versions: 0.8.1
Reporter: N Campbell
Assignee: Szehon Ho
 Attachments: HIVE-3181.2.patch, HIVE-3181.patch


 This is really a sub-issue of HIVE-3174 (which is a lot of properties) but 
 given that the driver will return databaseProductVersion it makes no sense to 
 not have implemented these as well.





Re: Review Request 15804: HIVE-5866: Hive divide operator generates wrong results in certain cases

2013-11-22 Thread Jason Dere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15804/#review29330
---



ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPDivide.java
https://reviews.apache.org/r/15804/#comment56517

So I think the issue here is that when (integer_prec + scale) > 
max_precision, we prioritize keeping the scale at the expense of the integer 
portion of the result type. Looks like the SQL Server precision/scale rules 
mention that it does not let the scale eat into the integer portion of the 
result type - it goes the other way and will reduce the scale to allow the 
total precision to fit within max_precision. This might be a better rule to 
follow than prioritizing the scale value, at least for the purposes of 
determining the return type.
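
To make the suggested rule concrete, here is a simplified sketch. It is not Hive's or SQL Server's actual algorithm: the class name is hypothetical, MAX_PRECISION = 38 is an assumption, and the rule is reduced to "keep the integer digits, shrink the scale until the total fits".

```java
// Hypothetical sketch: when integer digits plus scale overflow the maximum
// precision, give up scale digits first rather than integer digits.
class DecimalResultTypeSketch {
    static final int MAX_PRECISION = 38; // assumed maximum decimal precision

    // Returns {precision, scale} for a result needing integerDigits and scale.
    static int[] fitToMaxPrecision(int integerDigits, int scale) {
        if (integerDigits + scale <= MAX_PRECISION) {
            return new int[] { integerDigits + scale, scale };
        }
        int keptIntegerDigits = Math.min(integerDigits, MAX_PRECISION);
        int reducedScale = MAX_PRECISION - keptIntegerDigits; // whatever room is left
        return new int[] { keptIntegerDigits + reducedScale, reducedScale };
    }

    public static void main(String[] args) {
        int[] small = fitToMaxPrecision(5, 6);
        System.out.println(small[0] + "," + small[1]); // prints 11,6
        int[] large = fitToMaxPrecision(30, 20);
        System.out.println(large[0] + "," + large[1]); // prints 38,8
    }
}
```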


- Jason Dere


On Nov. 22, 2013, 9:42 p.m., Xuefu Zhang wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/15804/
 ---
 
 (Updated Nov. 22, 2013, 9:42 p.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-5866
 https://issues.apache.org/jira/browse/HIVE-5866
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Fixed the problem. Added a unit test. Corrected the output of a few q tests.
 
 
 Diffs
 -
 
   
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseNumeric.java 
 a1015e9 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPDivide.java 
 0b902e9 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPMinus.java 
 538c07e 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPMod.java 
 472e1dd 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPMultiply.java 
 2e8d364 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPPlus.java 
 35f639e 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFPosMod.java 
 6b18303 
   
 ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFOPDivide.java 
 581c1a8 
   ql/src/test/results/clientpositive/decimal_precision.q.out 2ee3578 
   ql/src/test/results/clientpositive/decimal_udf.q.out ed5bc65 
   ql/src/test/results/clientpositive/ql_rewrite_gbtoidx.q.out 83787ee 
   ql/src/test/results/clientpositive/vectorization_5.q.out 54aad90 
   ql/src/test/results/clientpositive/vectorization_short_regress.q.out 
 c9296e1 
 
 Diff: https://reviews.apache.org/r/15804/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Xuefu Zhang
 




[jira] [Commented] (HIVE-5866) Hive divide operator generates wrong results in certain cases

2013-11-22 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830511#comment-13830511
 ] 

Jason Dere commented on HIVE-5866:
--

Ok, I understand the issue now - the integer portion of the result type was 
(7,6) so only 1 integer digit, and trying to cast both operands to decimal(7,6) 
which would result in null for 24.
Changes look good, left a comment on RB.
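
The effect of a decimal(7,6) type can be illustrated with java.math.BigDecimal. This is only a sketch, not Hive's decimal code: the enforce helper is hypothetical, and it uses the operands 4 and 25 from the query in the issue. A value whose integer digits do not fit in precision minus scale becomes null.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical sketch: force a value into decimal(precision, scale), or return null.
class DecimalCastSketch {

    static BigDecimal enforce(BigDecimal value, int precision, int scale) {
        BigDecimal scaled = value.setScale(scale, RoundingMode.HALF_UP);
        int integerDigits = scaled.precision() - scaled.scale();
        // Only (precision - scale) digits are available left of the decimal point.
        return integerDigits > precision - scale ? null : scaled;
    }

    public static void main(String[] args) {
        System.out.println(enforce(new BigDecimal("4"), 7, 6));  // prints 4.000000
        System.out.println(enforce(new BigDecimal("25"), 7, 6)); // prints null
    }
}
```

With one operand already null after the cast, the division has nothing to work with, which matches the NULL result reported in the issue.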

 Hive divide operator generates wrong results in certain cases
 -

 Key: HIVE-5866
 URL: https://issues.apache.org/jira/browse/HIVE-5866
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-5866.1.patch, HIVE-5866.2.patch, HIVE-5866.patch


 The current GenericUDFOPDivide seems to have a bug. The following query 
 generates a NULL result.
 {code}
 hive> select 4BD / 25BD from test limit 1;
 ...
 Total MapReduce CPU Time Spent: 890 msec
 OK
 NULL
 Time taken: 7.901 seconds, Fetched: 1 row(s)
 {code}
 The correct result should be 0.16 in this query.





[jira] [Commented] (HIVE-5849) Improve the stats of operators based on heuristics in the absence of any column statistics

2013-11-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830520#comment-13830520
 ] 

Hive QA commented on HIVE-5849:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12615428/HIVE-5849.7.patch

{color:green}SUCCESS:{color} +1 4680 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/406/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/406/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12615428

 Improve the stats of operators based on heuristics in the absence of any 
 column statistics
 --

 Key: HIVE-5849
 URL: https://issues.apache.org/jira/browse/HIVE-5849
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor, Statistics
Reporter: Prasanth J
Assignee: Prasanth J
 Fix For: 0.13.0

 Attachments: HIVE-5849.1.patch.txt, HIVE-5849.2.patch.txt, 
 HIVE-5849.3.patch, HIVE-5849.3.patch.txt, HIVE-5849.4.javaonly.patch, 
 HIVE-5849.5.patch, HIVE-5849.6.patch, HIVE-5849.7.patch


 In the absence of any column statistics, operators will simply use the 
 statistics from their parents. It is useful to apply some heuristics to update 
 basic statistics (number of rows and data size) in the absence of any column 
 statistics. This will be the worst-case scenario.





[jira] [Commented] (HIVE-5870) Move TestJDBCDriver2.testNewConnectionConfiguration to TestJDBCWithMiniHS2

2013-11-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830532#comment-13830532
 ] 

Hive QA commented on HIVE-5870:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12615375/HIVE-5870.patch

{color:green}SUCCESS:{color} +1 4680 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/407/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/407/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12615375

 Move TestJDBCDriver2.testNewConnectionConfiguration to TestJDBCWithMiniHS2
 --

 Key: HIVE-5870
 URL: https://issues.apache.org/jira/browse/HIVE-5870
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 0.13.0
Reporter: Szehon Ho
Assignee: Szehon Ho
 Attachments: HIVE-5870.patch


 TestJDBCDriver2.testNewConnectionConfiguration() attempts to start a 
 HiveServer2 instance in the test.
 This can cause issues, as creating HiveServer2 needs the correct 
 environment/path. This test should be moved to TestJdbcWithMiniHS2, which 
 uses MiniHS2. MiniHS2 exists for this purpose (setting up the environment 
 properly before starting a HiveServer2 instance).





Hive-trunk-hadoop2 - Build # 565 - Still Failing

2013-11-22 Thread Apache Jenkins Server
Changes for Build #535
[thejas] HIVE-5547 : webhcat pig job submission should ship hive tar if 
-usehcatalog is specified (Eugene Koifman via Thejas Nair)

[thejas] HIVE-5715 : HS2 should not start a session for every command 
(Gunther Hagleitner via Thejas Nair)


Changes for Build #536

Changes for Build #537
[brock] HIVE-5740: Tar files should extract to the directory of the same name 
minus tar.gz (Brock Noland reviewed by Xuefu Zhang)

[brock] HIVE-5611: Add assembly (i.e.) tar creation to pom (Szehon Ho via Brock 
Noland)

[brock] HIVE-5707: Validate values for ConfVar (Navis via Brock Noland)

[brock] HIVE-5721: Incremental build is disabled by MCOMPILER-209 (Navis via 
Brock Noland)


Changes for Build #538
[brock] HIVE-5354 - Decimal precision/scale support in ORC file (Xuefu Zhang 
via Brock Noland)

[brock] HIVE-4523 - round() function with specified decimal places not 
consistent with mysql (Xuefu Zhang via Brock Noland)

[thejas] HIVE-5542 : Webhcat is failing to run ddl command on a secure cluster 
(Sushanth Sowmyan via Thejas Nair)


Changes for Build #539
[brock] HIVE-5730: Beeline throws non-terminal NPE upon starting, after 
mavenization (Szehon Ho reviewed by Navis)


Changes for Build #540
[omalley] HIVE-5425 Provide a configuration option to control the default stripe
size for ORC. (omalley reviewed by gunther)

[omalley] Revert HIVE-5583 since it broke the build.

[hashutosh] HIVE-5583 : Implement support for IN (list-of-constants) filter in 
vectorized mode (Eric Hanson via Ashutosh Chauhan)

[brock] HIVE-5355 - JDBC support for decimal precision/scale


Changes for Build #541
[hashutosh] HIVE-5583 : Implement support for IN (list-of-constants) filter in 
vectorized mode (Eric Hanson via Ashutosh Chauhan)

[brock] HIVE-5773 - Fix build due to conflict between HIVE-5711 and HIVE-5713

[brock] HIVE-5711 - Fix eclipse:eclipse maven goal (Carl Steinbach via Brock 
Noland)

[brock] HIVE-5752 - log4j properties appear to have been lost in maven upgrade 
(Sergey Shelukhin via Brock Noland)

[brock] HIVE-5713 - Verify versions of libraries post maven merge (Brock Noland 
reviewed by Gunther Hagleitner)

[brock] HIVE-5765 - Beeline throws NPE when -e option is used (Szehon Ho via 
Brock Noland)

[xuefu] HIVE-5726: The DecimalTypeInfo instance associated with a decimal 
constant is not in line with the precision/scale of the constant (reviewed by 
Brock)

[xuefu] HIVE-5655: Hive incorrecly handles divide-by-zero case (reviewed by 
Edward and Brock)

[xuefu] HIVE-5191: Add char data type (Jason via Xuefu)


Changes for Build #542
[brock] HIVE-5351 - Secure-Socket-Layer (SSL) support for HiveServer2 (Prasad 
Mujumdar via Brock Noland)


Changes for Build #543
[brock] HIVE-5780 - Add the missing declaration of HIVE_CLI_SERVICE_PROTOCOL_V4 
in TCLIService.thrift (Prasad Mujumdar via Brock Noland)


Changes for Build #544
[gunther] HIVE-5601: NPE in ORC's PPD when using select * from table with where 
predicate (Prasanth J via Owen O'Malley and Gunther Hagleitner)

[gunther] HIVE-5562: Provide stripe level column statistics in ORC (Patch by 
Prasanth J, reviewed by Owen O'Malley, committed by Gunther Hagleitner)

[hashutosh] HIVE-3777 : add a property in the partition to figure out if stats 
are accurate (Ashutosh Chauhan via Thejas Nair)


Changes for Build #545
[hashutosh] HIVE-5691 : Intermediate columns are incorrectly initialized for 
partitioned tables. (Jitendra Nath Pandey via Gunther Hagleitner)

[hashutosh] HIVE-5779 : Subquery in where clause with distinct fails with 
mapjoin turned on with serialization error. (Ashutosh Chauhan via Harish Butani)

[gunther] HIVE-5632 (partial): Adding test data to data/files to enable 
pre-commit tests to run. (Prasanth J via Gunther Hagleitner)


Changes for Build #546
[cws] HIVE-5786: Remove HadoopShims methods that were needed for pre-Hadoop 
0.20 (Jason Dere via cws)

[thejas] HIVE-5229 : Better thread management for HiveServer2 async threads 
(Vaibhav Gumashta via Thejas Nair)

[gunther] HIVE-5745: TestHiveLogging is failing (at least on mac) (Gunther 
Hagleitner, reviewed by Ashutosh Chauhan)


Changes for Build #547
[hashutosh] HIVE-5699 : Add unit test for vectorized BETWEEN for timestamp 
inputs (Eric Hanson via Ashutosh Chauhan)

[hashutosh] HIVE-5767 : in SemanticAnalyzer#doPhase1, handling for TOK_UNION 
falls thru into TOK_INSERT (Sergey Shelukhin via Ashutosh Chauhan)

[hashutosh] HIVE-5657 : TopN produces incorrect results with count(distinct) 
(Sergey Shelukhin via Ashutosh Chauhan)


Changes for Build #549
[hashutosh] HIVE-5753 : Remove collector from Operator base class (Mohammad 
Islam via Ashutosh Chauhan)

[hashutosh] HIVE-5737 : Provide StructObjectInspector for UDTFs rather than 
ObjectInspect[] (Navis via Ashutosh Chauhan)

[hashutosh] HIVE-5790 : maven test build  failure shows wrong error message 
(Mohammad Islam via Ashutosh Chauhan)

[hashutosh] HIVE-5722 : Skip generating vectorization code if possible (Navis 
via Brock 

Hive-trunk-h0.21 - Build # 2466 - Still Failing

2013-11-22 Thread Apache Jenkins Server
Changes for Build #2436
[thejas] HIVE-5547 : webhcat pig job submission should ship hive tar if 
-usehcatalog is specified (Eugene Koifman via Thejas Nair)

[thejas] HIVE-5715 : HS2 should not start a session for every command 
(Gunther Hagleitner via Thejas Nair)


Changes for Build #2437

Changes for Build #2438
[brock] HIVE-5740: Tar files should extract to the directory of the same name 
minus tar.gz (Brock Noland reviewed by Xuefu Zhang)

[brock] HIVE-5611: Add assembly (i.e.) tar creation to pom (Szehon Ho via Brock 
Noland)

[brock] HIVE-5707: Validate values for ConfVar (Navis via Brock Noland)

[brock] HIVE-5721: Incremental build is disabled by MCOMPILER-209 (Navis via 
Brock Noland)


Changes for Build #2439
[brock] HIVE-5354 - Decimal precision/scale support in ORC file (Xuefu Zhang 
via Brock Noland)

[brock] HIVE-4523 - round() function with specified decimal places not 
consistent with mysql (Xuefu Zhang via Brock Noland)

[thejas] HIVE-5542 : Webhcat is failing to run ddl command on a secure cluster 
(Sushanth Sowmyan via Thejas Nair)


Changes for Build #2440
[brock] HIVE-5730: Beeline throws non-terminal NPE upon starting, after 
mavenization (Szehon Ho reviewed by Navis)


Changes for Build #2441
[omalley] HIVE-5425 Provide a configuration option to control the default stripe
size for ORC. (omalley reviewed by gunther)

[omalley] Revert HIVE-5583 since it broke the build.

[hashutosh] HIVE-5583 : Implement support for IN (list-of-constants) filter in 
vectorized mode (Eric Hanson via Ashutosh Chauhan)

[brock] HIVE-5355 - JDBC support for decimal precision/scale


Changes for Build #2443
[brock] HIVE-5351 - Secure-Socket-Layer (SSL) support for HiveServer2 (Prasad 
Mujumdar via Brock Noland)

[hashutosh] HIVE-5583 : Implement support for IN (list-of-constants) filter in 
vectorized mode (Eric Hanson via Ashutosh Chauhan)

[brock] HIVE-5773 - Fix build due to conflict between HIVE-5711 and HIVE-5713

[brock] HIVE-5711 - Fix eclipse:eclipse maven goal (Carl Steinbach via Brock 
Noland)

[brock] HIVE-5752 - log4j properties appear to have been lost in maven upgrade 
(Sergey Shelukhin via Brock Noland)

[brock] HIVE-5713 - Verify versions of libraries post maven merge (Brock Noland 
reviewed by Gunther Hagleitner)

[brock] HIVE-5765 - Beeline throws NPE when -e option is used (Szehon Ho via 
Brock Noland)

[xuefu] HIVE-5726: The DecimalTypeInfo instance associated with a decimal 
constant is not in line with the precision/scale of the constant (reviewed by 
Brock)

[xuefu] HIVE-5655: Hive incorrecly handles divide-by-zero case (reviewed by 
Edward and Brock)

[xuefu] HIVE-5191: Add char data type (Jason via Xuefu)


Changes for Build #2444
[brock] HIVE-5780 - Add the missing declaration of HIVE_CLI_SERVICE_PROTOCOL_V4 
in TCLIService.thrift (Prasad Mujumdar via Brock Noland)


Changes for Build #2445
[gunther] HIVE-5601: NPE in ORC's PPD when using select * from table with where 
predicate (Prasanth J via Owen O'Malley and Gunther Hagleitner)

[gunther] HIVE-5562: Provide stripe level column statistics in ORC (Patch by 
Prasanth J, reviewed by Owen O'Malley, committed by Gunther Hagleitner)

[hashutosh] HIVE-3777 : add a property in the partition to figure out if stats 
are accurate (Ashutosh Chauhan via Thejas Nair)


Changes for Build #2446
[hashutosh] HIVE-5691 : Intermediate columns are incorrectly initialized for 
partitioned tables. (Jitendra Nath Pandey via Gunther Hagleitner)

[hashutosh] HIVE-5779 : Subquery in where clause with distinct fails with 
mapjoin turned on with serialization error. (Ashutosh Chauhan via Harish Butani)

[gunther] HIVE-5632 (partial): Adding test data to data/files to enable 
pre-commit tests to run. (Prasanth J via Gunther Hagleitner)


Changes for Build #2447
[cws] HIVE-5786: Remove HadoopShims methods that were needed for pre-Hadoop 
0.20 (Jason Dere via cws)

[thejas] HIVE-5229 : Better thread management for HiveServer2 async threads 
(Vaibhav Gumashta via Thejas Nair)

[gunther] HIVE-5745: TestHiveLogging is failing (at least on mac) (Gunther 
Hagleitner, reviewed by Ashutosh Chauhan)


Changes for Build #2448
[hashutosh] HIVE-5699 : Add unit test for vectorized BETWEEN for timestamp 
inputs (Eric Hanson via Ashutosh Chauhan)

[hashutosh] HIVE-5767 : in SemanticAnalyzer#doPhase1, handling for TOK_UNION 
falls thru into TOK_INSERT (Sergey Shelukhin via Ashutosh Chauhan)

[hashutosh] HIVE-5657 : TopN produces incorrect results with count(distinct) 
(Sergey Shelukhin via Ashutosh Chauhan)


Changes for Build #2450
[hashutosh] HIVE-5683 : JDBC support for char (Jason Dere via Xuefu Zhang)

[hashutosh] HIVE-5626 : enable metastore direct SQL for drop/similar queries 
(Sergey Shelukhin via Ashutosh Chauhan)

[hashutosh] HIVE-5700 : enforce single date format for partition column storage 
(Sergey Shelukhin via Ashutosh Chauhan)

[hashutosh] HIVE-5753 : Remove collector from Operator base class (Mohammad 
Islam via Ashutosh Chauhan)

[hashutosh] HIVE-5737 : 

[jira] [Updated] (HIVE-5876) Split elimination in ORC breaks for partitioned tables

2013-11-22 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-5876:
-

Status: Patch Available  (was: Open)

 Split elimination in ORC breaks for partitioned tables
 --

 Key: HIVE-5876
 URL: https://issues.apache.org/jira/browse/HIVE-5876
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Prasanth J
Assignee: Prasanth J
  Labels: orcfile
 Attachments: HIVE-5876.1.patch


 HIVE-5632 eliminates ORC stripes that do not satisfy the SARG condition from 
 split computation. A SARG expression can also refer to partition columns, but 
 partition columns are not contained in the column names list in the ORC file. 
 This was causing an ArrayIndexOutOfBoundsException in the split elimination 
 logic when used with partitioned tables. The fix is to skip evaluation of 
 partition column expressions in split elimination.
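
 The idea behind the fix can be sketched as follows. This is only an 
 illustration under stated assumptions, not the code in the attached patch: 
 the class and method names are hypothetical, the predicate is simplified to a 
 conjunction of equality tests on long columns, and the per-stripe statistics 
 are plain min/max arrays. The point it shows is that a predicate column 
 missing from the file's column list is mapped to -1 and then skipped instead 
 of being used as an array index.

```java
import java.util.List;

/** Hypothetical illustration only; these are not Hive's actual classes. */
final class SplitEliminationSketch {

    /**
     * Maps each predicate column to its index in the file's column list.
     * Columns that are not stored in the file (for example partition
     * columns) map to -1.
     */
    static int[] mapPredicateColumns(List<String> predicateColumns,
                                     List<String> fileColumns) {
        int[] result = new int[predicateColumns.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = fileColumns.indexOf(predicateColumns.get(i));
        }
        return result;
    }

    /**
     * Assumes the predicate is an AND of "column = wanted[i]" tests.
     * A stripe may be skipped only when a column stored in the file
     * rules it out; unmapped columns are ignored.
     */
    static boolean canSkipStripe(int[] columnIndexes, long[] stripeMin,
                                 long[] stripeMax, long[] wanted) {
        for (int i = 0; i < columnIndexes.length; i++) {
            int col = columnIndexes[i];
            if (col < 0) {
                continue; // partition column: no statistics in the file
            }
            if (wanted[i] < stripeMin[col] || wanted[i] > stripeMax[col]) {
                return true;
            }
        }
        return false;
    }
}
```

 Skipping an unmapped column is conservative: the stripe is kept whenever the 
 file's own statistics cannot rule it out.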



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HIVE-5839) BytesRefArrayWritable compareTo violates contract

2013-11-22 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-5839:
--

Attachment: HIVE-5839.1.patch

Patch #1 fixed the test failure and added a new test case.

 BytesRefArrayWritable compareTo violates contract
 -

 Key: HIVE-5839
 URL: https://issues.apache.org/jira/browse/HIVE-5839
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.11.0, 0.12.0
Reporter: Ian Robertson
Assignee: Xuefu Zhang
 Attachments: HIVE-5839.1.patch, HIVE-5839.patch, HIVE-5839.patch


 BytesRefArrayWritable's compareTo violates the compareTo contract from 
 java.lang.Comparable. Specifically:
 * The implementor must ensure sgn(x.compareTo( y )) == -sgn(y.compareTo( x )) 
 for all x and y.
 The compareTo implementation on BytesRefArrayWritable does a proper 
 comparison of the sizes of the two instances. However, if the sizes are the 
 same, it proceeds to check whether both arrays have the same contents. If 
 not, it returns 1. This means that if x and y are two BytesRefArrayWritable 
 instances with the same size, but different contents, then x.compareTo( y ) 
 == 1 and y.compareTo( x ) == 1.
 Additionally, the comparison of contents is order agnostic. This seems wrong, 
 since order of entries should matter. It is also very inefficient, running at 
 O(n^2), where n is the number of entries.
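
 A comparison that satisfies the sign-symmetry rule quoted above could look 
 roughly like the sketch below. It is not the attached patch: the class name is 
 hypothetical, and plain byte[] entries in a List stand in for the entries of a 
 BytesRefArrayWritable. Sizes are compared first, then entries are compared 
 position by position, so order matters and the work is linear in the total 
 number of bytes.

```java
import java.util.List;

/** Hypothetical stand-in; not the BytesRefArrayWritable implementation. */
final class ByteArraysCompareSketch {

    /** Lexicographic comparison of two byte arrays, bytes treated as unsigned. */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int c = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    /**
     * Compares sizes first, then entries in order. Swapping the arguments
     * always flips the sign of the result, as the contract requires.
     */
    static int compare(List<byte[]> x, List<byte[]> y) {
        int c = Integer.compare(x.size(), y.size());
        if (c != 0) {
            return c;
        }
        for (int i = 0; i < x.size(); i++) {
            c = compareBytes(x.get(i), y.get(i));
            if (c != 0) {
                return c;
            }
        }
        return 0;
    }
}
```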



--
This message was sent by Atlassian JIRA
(v6.1#6144)


Review Request 15820: HIVE-5839: BytesRefArrayWritable compareTo violates contract

2013-11-22 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15820/
---

Review request for hive.


Bugs: HIVE-5839
https://issues.apache.org/jira/browse/HIVE-5839


Repository: hive-git


Description
---

Modified according to the contract.


Diffs
-

  ql/src/test/org/apache/hadoop/hive/ql/io/TestRCFile.java d2b06ec 
  serde/src/java/org/apache/hadoop/hive/serde2/columnar/BytesRefArrayWritable.java 712064e 
  serde/src/test/org/apache/hadoop/hive/serde2/columnar/TestBytesRefArrayWritable.java PRE-CREATION 

Diff: https://reviews.apache.org/r/15820/diff/


Testing
---

 Added a new test case. Fixed an old test case.


Thanks,

Xuefu Zhang



Re: [ANNOUNCE] New Hive Committers - Jitendra Nath Pandey and Eric Hanson

2013-11-22 Thread Vaibhav Gumashta
Congrats guys!


On Fri, Nov 22, 2013 at 11:54 PM, Vikram Dixit vik...@hortonworks.com wrote:

 Congrats to both of you!


 On Fri, Nov 22, 2013 at 9:34 AM, Jason Dere jd...@hortonworks.com wrote:

  Congrats!
 
  On Nov 22, 2013, at 2:25 AM, Biswajit Nayak biswajit.na...@inmobi.com wrote:
 
   Congrats to both of you..
  
  
   On Fri, Nov 22, 2013 at 1:26 PM, Lefty Leverenz leftylever...@gmail.com wrote:
   Congratulations, Jitendra and Eric!  The more the merrier.
  
   -- Lefty
  
  
   On Thu, Nov 21, 2013 at 6:31 PM, Jarek Jarcec Cecho jar...@apache.org wrote:
   Congratulations, good job!
  
   Jarcec
  
   On Thu, Nov 21, 2013 at 03:29:07PM -0800, Carl Steinbach wrote:
The Apache Hive PMC has voted to make Jitendra Nath Pandey and Eric Hanson
committers on the Apache Hive project.
   
Please join me in congratulating Jitendra and Eric!
   
Thanks.
   
Carl
  
  
  
   _
   The information contained in this communication is intended solely for
   the use of the individual or entity to whom it is addressed and others
   authorized to receive it. It may contain confidential or legally privileged
   information. If you are not the intended recipient you are hereby notified
   that any disclosure, copying, distribution or taking any action in reliance
   on the contents of this information is strictly prohibited and may be
   unlawful. If you have received this communication in error, please notify
   us immediately by responding to this email and then delete it from your
   system. The firm is neither liable for the proper and complete transmission
   of the information contained in this communication nor for any delay in its
   receipt.
 
 
  --
  CONFIDENTIALITY NOTICE
  NOTICE: This message is intended for the use of the individual or entity to
  which it is addressed and may contain information that is confidential,
  privileged and exempt from disclosure under applicable law. If the reader
  of this message is not the intended recipient, you are hereby notified that
  any printing, copying, dissemination, distribution, disclosure or
  forwarding of this communication is strictly prohibited. If you have
  received this communication in error, please contact the sender immediately
  and delete it from your system. Thank You.
 



[jira] [Commented] (HIVE-5230) Better error reporting by async threads in HiveServer2

2013-11-22 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830570#comment-13830570
 ] 

Vaibhav Gumashta commented on HIVE-5230:


[~cwsteinbach] [~thejas] If you have any more feedback, I can look into it. 
Thanks!

 Better error reporting by async threads in HiveServer2
 --

 Key: HIVE-5230
 URL: https://issues.apache.org/jira/browse/HIVE-5230
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 0.12.0, 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-5230.1.patch, HIVE-5230.1.patch, HIVE-5230.2.patch, 
 HIVE-5230.3.patch, HIVE-5230.4.patch, HIVE-5230.6.patch, HIVE-5230.7.patch, 
 HIVE-5230.8.patch


 [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support 
 for async execution in HS2. Currently, when a background thread hits an error, 
 the client can only poll for the operation state, while the error and its 
 stack trace are written to the log. It would be useful to provide a richer 
 error response, as the Thrift API does with TStatus (which is constructed 
 while building a Thrift response object). 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-5217) Add long polling to asynchronous execution in HiveServer2

2013-11-22 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830573#comment-13830573
 ] 

Vaibhav Gumashta commented on HIVE-5217:


[~cwsteinbach] Look forward to your feedback on this one. Thanks! 

 Add long polling to asynchronous execution in HiveServer2
 -

 Key: HIVE-5217
 URL: https://issues.apache.org/jira/browse/HIVE-5217
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-5217.2.patch, HIVE-5217.3.patch, HIVE-5217.4.patch, 
 HIVE-5217.D12801.2.patch, HIVE-5217.D12801.3.patch, HIVE-5217.D12801.4.patch, 
 HIVE-5217.D12801.5.patch, HIVE-5217.D12801.6.patch


 [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support 
 for async execution in HS2. The client gets an operation handle which it can 
 poll to check on the operation status. However, the polling frequency is 
 left entirely to the client, which can be resource inefficient. Long polling 
 will solve this by blocking the client request to check the operation status 
 for a configurable amount of time (a new HS2 config) if the data is not 
 available, but responding immediately if the data is available.
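
 The blocking behaviour described above can be sketched with a latch. This is 
 a minimal illustration, not the HiveServer2 implementation: the class name, 
 the string states and the timeout parameter are all assumptions made for the 
 example. A status request waits up to the given timeout while the operation is 
 still running and returns at once after the operation has completed.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical operation handle; not HiveServer2 code. */
final class LongPollSketch {
    private final CountDownLatch done = new CountDownLatch(1);
    private volatile String state = "RUNNING";

    /** Called by the background thread when the operation completes. */
    void finish(String finalState) {
        state = finalState;
        done.countDown();
    }

    /**
     * Blocks for at most timeoutMillis while the operation is running and
     * returns immediately if it has already finished.
     */
    String getStatus(long timeoutMillis) {
        try {
            done.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
        }
        return state;
    }
}
```

 With this shape the server, not the client, bounds how often a status check 
 can return without new information.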



--
This message was sent by Atlassian JIRA
(v6.1#6144)


Re: Review Request 15804: HIVE-5866: Hive divide operator generates wrong results in certain cases

2013-11-22 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15804/#review29332
---



ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPDivide.java
https://reviews.apache.org/r/15804/#comment56518

I don't think I follow. Could you give an example of how the implementation 
here differs from SQL Server's? The fractional part of a decimal number is as 
important as the integer part in applications where a decimal type is required; 
otherwise, double might be a better fit. Thus, a decimal value of a given 
decimal type needs to comply with that type's precision/scale. I don't think 
we should store the number 456.78 in a type decimal(6,4), a point on which we 
already concluded the discussion.
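
To make the 456.78 example concrete: decimal(6,4) leaves 6 - 4 = 2 digits for 
the integer part, and 456.78 needs 3. The sketch below checks this with 
java.math.BigDecimal. It is an illustration only, not the code under review: 
the class and method names are hypothetical, and both the HALF_UP rounding and 
returning null for a value that does not fit are assumptions made for the 
example.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/** Hypothetical helper; not part of the patch under review. */
final class DecimalFitSketch {

    /**
     * Rounds value to the given scale and returns it, or returns null when
     * the integer part needs more than (precision - scale) digits.
     */
    static BigDecimal enforce(BigDecimal value, int precision, int scale) {
        BigDecimal rounded = value.setScale(scale, RoundingMode.HALF_UP);
        int integerDigits = rounded.precision() - rounded.scale();
        return integerDigits > precision - scale ? null : rounded;
    }
}
```

Here enforce(new BigDecimal("456.78"), 6, 4) yields null, while 
enforce(new BigDecimal("45.678"), 6, 4) yields 45.6780.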


- Xuefu Zhang


On Nov. 22, 2013, 9:42 p.m., Xuefu Zhang wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/15804/
 ---
 
 (Updated Nov. 22, 2013, 9:42 p.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-5866
 https://issues.apache.org/jira/browse/HIVE-5866
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Fixed the problem. Added a unit test. Corrected the output of a few q tests.
 
 
 Diffs
 -
 
   
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseNumeric.java a1015e9 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPDivide.java 0b902e9 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPMinus.java 538c07e 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPMod.java 472e1dd 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPMultiply.java 2e8d364 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPPlus.java 35f639e 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFPosMod.java 6b18303 
   ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFOPDivide.java 581c1a8 
   ql/src/test/results/clientpositive/decimal_precision.q.out 2ee3578 
   ql/src/test/results/clientpositive/decimal_udf.q.out ed5bc65 
   ql/src/test/results/clientpositive/ql_rewrite_gbtoidx.q.out 83787ee 
   ql/src/test/results/clientpositive/vectorization_5.q.out 54aad90 
   ql/src/test/results/clientpositive/vectorization_short_regress.q.out c9296e1 
 
 Diff: https://reviews.apache.org/r/15804/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Xuefu Zhang
 




[jira] [Commented] (HIVE-5614) Subquery support: allow subquery expressions in having clause

2013-11-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830583#comment-13830583
 ] 

Hive QA commented on HIVE-5614:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12615417/HIVE-5614.4.patch

{color:green}SUCCESS:{color} +1 4684 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/410/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/410/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12615417

 Subquery support: allow subquery expressions in having clause
 -

 Key: HIVE-5614
 URL: https://issues.apache.org/jira/browse/HIVE-5614
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Harish Butani
Assignee: Harish Butani
 Attachments: HIVE-5614.1.patch, HIVE-5614.2.patch, HIVE-5614.3.patch, 
 HIVE-5614.4.patch






--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-5839) BytesRefArrayWritable compareTo violates contract

2013-11-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830598#comment-13830598
 ] 

Hive QA commented on HIVE-5839:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12615447/HIVE-5839.1.patch

{color:green}SUCCESS:{color} +1 4681 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/411/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/411/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12615447

 BytesRefArrayWritable compareTo violates contract
 -

 Key: HIVE-5839
 URL: https://issues.apache.org/jira/browse/HIVE-5839
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.11.0, 0.12.0
Reporter: Ian Robertson
Assignee: Xuefu Zhang
 Attachments: HIVE-5839.1.patch, HIVE-5839.patch, HIVE-5839.patch


 BytesRefArrayWritable's compareTo violates the compareTo contract from 
 java.lang.Comparable. Specifically:
 * The implementor must ensure sgn(x.compareTo( y )) == -sgn(y.compareTo( x )) 
 for all x and y.
 The compareTo implementation on BytesRefArrayWritable does a proper 
 comparison of the sizes of the two instances. However, if the sizes are the 
 same, it proceeds to check whether both arrays have the same contents. If 
 not, it returns 1. This means that if x and y are two BytesRefArrayWritable 
 instances with the same size, but different contents, then x.compareTo( y ) 
 == 1 and y.compareTo( x ) == 1.
 Additionally, the comparison of contents is order agnostic. This seems wrong, 
 since order of entries should matter. It is also very inefficient, running at 
 O(n^2), where n is the number of entries.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-3181) getDatabaseMajor/Minor version does not return values

2013-11-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830610#comment-13830610
 ] 

Hive QA commented on HIVE-3181:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12615431/HIVE-3181.2.patch

{color:green}SUCCESS:{color} +1 4680 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/413/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/413/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12615431

 getDatabaseMajor/Minor version does not return values
 -

 Key: HIVE-3181
 URL: https://issues.apache.org/jira/browse/HIVE-3181
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Affects Versions: 0.8.1
Reporter: N Campbell
Assignee: Szehon Ho
 Attachments: HIVE-3181.2.patch, HIVE-3181.patch


 This is really a sub-issue of HIVE-3174 (which covers a lot of properties), but 
 given that the driver will return databaseProductVersion, it makes no sense 
 not to implement these as well.



--
This message was sent by Atlassian JIRA
(v6.1#6144)