[jira] [Commented] (HIVE-3430) group by followed by join with the same key should be optimized

2013-03-01 Thread Lianhui Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13590321#comment-13590321
 ] 

Lianhui Wang commented on HIVE-3430:


We should also consider the following query:
SELECT a.key, a.cnt, b.key, a.cnt
FROM
(SELECT x.key as key, count(x.value) AS cnt FROM src x group by x.key) a
JOIN src b
ON (a.key = b.key);
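
To illustrate why this query shape is a candidate for the optimization, the sketch below (Python, purely illustrative, not Hive's planner or operator code) shows that once the rows are grouped by key, a single pass per key can produce both the count and the join output. The function name group_by_then_join and the in-memory row representation are assumptions made for the example.

```python
from collections import defaultdict

def group_by_then_join(src_rows):
    """Emulate the query above with a single grouping ("shuffle") on key.

    src_rows: iterable of (key, value) tuples standing in for table src.
    Returns rows shaped like (a.key, a.cnt, b.key, a.cnt).
    """
    # One shuffle: bucket every src row by its key.
    groups = defaultdict(list)
    for key, value in src_rows:
        if key is None:
            # NULL keys never satisfy a.key = b.key, so they cannot join.
            continue
        groups[key].append(value)

    out = []
    for key in sorted(groups):
        values = groups[key]
        # count(x.value) ignores NULLs, modelled here as None.
        cnt = sum(1 for v in values if v is not None)
        # Side b is the same table, so every src row with this key
        # matches the single aggregated row for the key.
        for _ in values:
            out.append((key, cnt, key, cnt))
    return out
```

The point of the sketch is only that the aggregation and the join need the same grouping, so they do not require two separate shuffles.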


 group by followed by join with the same key should be optimized
 ---

 Key: HIVE-3430
 URL: https://issues.apache.org/jira/browse/HIVE-3430
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Namit Jain



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4095) Add exchange partition in Hive

2013-03-01 Thread Namit Jain (JIRA)
Namit Jain created HIVE-4095:


 Summary: Add exchange partition in Hive
 Key: HIVE-4095
 URL: https://issues.apache.org/jira/browse/HIVE-4095
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Namit Jain


It would be very useful to support exchange partition in Hive, something similar
to http://www.orafaq.com/node/2570 in Oracle.





[jira] [Commented] (HIVE-4095) Add exchange partition in Hive

2013-03-01 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13590406#comment-13590406
 ] 

Namit Jain commented on HIVE-4095:
--

More details can be found at: 
https://cwiki.apache.org/confluence/display/Hive/Exchange+Partition

 Add exchange partition in Hive
 --

 Key: HIVE-4095
 URL: https://issues.apache.org/jira/browse/HIVE-4095
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Namit Jain




[jira] [Commented] (HIVE-4095) Add exchange partition in Hive

2013-03-01 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13590410#comment-13590410
 ] 

Namit Jain commented on HIVE-4095:
--

It might be easier to have a syntax closer to 
http://www.techrepublic.com/blog/datacenter/partition-switching-in-sql-server-2005/143

 Add exchange partition in Hive
 --

 Key: HIVE-4095
 URL: https://issues.apache.org/jira/browse/HIVE-4095
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Namit Jain




[jira] [Issue Comment Deleted] (HIVE-4095) Add exchange partition in Hive

2013-03-01 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-4095:
-

Comment: was deleted

(was: It might be easier to have a syntax closer to 
http://www.techrepublic.com/blog/datacenter/partition-switching-in-sql-server-2005/143)

 Add exchange partition in Hive
 --

 Key: HIVE-4095
 URL: https://issues.apache.org/jira/browse/HIVE-4095
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Namit Jain




[jira] [Created] (HIVE-4096) problem in hive.map.groupby.sorted with distincts

2013-03-01 Thread Namit Jain (JIRA)
Namit Jain created HIVE-4096:


 Summary: problem in hive.map.groupby.sorted with distincts
 Key: HIVE-4096
 URL: https://issues.apache.org/jira/browse/HIVE-4096
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain


set hive.enforce.bucketing = true;
set hive.enforce.sorting = true;
set hive.exec.reducers.max = 10;
set hive.map.groupby.sorted=true;

CREATE TABLE T1(key STRING, val STRING) PARTITIONED BY (ds string)
CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH '../data/files/T1.txt' INTO TABLE T1 PARTITION (ds='1');

-- perform an insert to make sure there are 2 files
INSERT OVERWRITE TABLE T1 PARTITION (ds='1') select key, val from T1 where ds = '1';

CREATE TABLE outputTbl1(cnt INT);

-- The plan should be converted to a map-side group by, since the
-- sorting columns and grouping columns match, and all the bucketing columns
-- are part of sorting columns
EXPLAIN
select count(distinct key) from T1;

select count(distinct key) from T1;

explain
INSERT OVERWRITE TABLE outputTbl1
select count(distinct key) from T1;

INSERT OVERWRITE TABLE outputTbl1
select count(distinct key) from T1;

SELECT * FROM outputTbl1;

DROP TABLE T1;


The above query gives wrong results



[jira] [Updated] (HIVE-4073) Make partition by optional in over clause

2013-03-01 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-4073:
---

Attachment: HIVE-4073-2.patch

Attached patch seems to work well. All the ptf tests pass and the query 
discussed above fails.

 Make partition by optional in over clause
 -

 Key: HIVE-4073
 URL: https://issues.apache.org/jira/browse/HIVE-4073
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Ashutosh Chauhan
Assignee: Brock Noland
 Attachments: HIVE-4073-0.patch, HIVE-4073-1.patch, HIVE-4073-2.patch


 select s, sum( i ) over() from tt; should work. 
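
For reference, standard SQL treats an empty OVER() as a single partition covering all rows, so sum(i) over() should attach the grand total to every row. A minimal Python sketch of that semantics follows; window_sum and its parameters are hypothetical names and are not part of Hive.

```python
from collections import defaultdict

def window_sum(rows, value_of, partition_of=None):
    """Pair each row with the sum of value_of over the row's partition.

    partition_of=None models an empty OVER(): all rows form one partition.
    """
    def part(row):
        return partition_of(row) if partition_of is not None else None

    totals = defaultdict(int)
    for row in rows:
        totals[part(row)] += value_of(row)
    return [(row, totals[part(row)]) for row in rows]
```

With rows shaped like (s, i), window_sum(tt, lambda r: r[1]) corresponds to the query above, and passing partition_of=lambda r: r[0] corresponds to adding PARTITION BY s.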



[jira] [Updated] (HIVE-4073) Make partition by optional in over clause

2013-03-01 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-4073:
---

Status: Patch Available  (was: Open)

 Make partition by optional in over clause
 -

 Key: HIVE-4073
 URL: https://issues.apache.org/jira/browse/HIVE-4073
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Ashutosh Chauhan
Assignee: Brock Noland
 Attachments: HIVE-4073-0.patch, HIVE-4073-1.patch, HIVE-4073-2.patch





Hive-trunk-h0.21 - Build # 1994 - Fixed

2013-03-01 Thread Apache Jenkins Server
Changes for Build #1992

Changes for Build #1993

Changes for Build #1994



All tests passed

The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1994)

Status: Fixed

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1994/ to 
view the results.

[jira] [Updated] (HIVE-3874) Create a new Optimized Row Columnar file format for Hive

2013-03-01 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-3874:


Status: Patch Available  (was: Open)

Pamela,
  Yeah, that probably makes sense. I'll file the follow-up JIRAs.

 Create a new Optimized Row Columnar file format for Hive
 

 Key: HIVE-3874
 URL: https://issues.apache.org/jira/browse/HIVE-3874
 Project: Hive
  Issue Type: Improvement
  Components: Serializers/Deserializers
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: hive.3874.2.patch, HIVE-3874.D8529.1.patch, 
 HIVE-3874.D8529.2.patch, HIVE-3874.D8529.3.patch, HIVE-3874.D8529.4.patch, 
 HIVE-3874.D8871.1.patch, OrcFileIntro.pptx, orc.tgz


 There are several limitations of the current RC File format that I'd like to 
 address by creating a new format:
 * each column value is stored as a binary blob, which means:
 ** the entire column value must be read, decompressed, and deserialized
 ** the file format can't use smarter type-specific compression
 ** push down filters can't be evaluated
 * the start of each row group needs to be found by scanning
 * user metadata can only be added to the file when the file is created
 * the file doesn't store the number of rows per file or row group
 * there is no mechanism for seeking to a particular row number, which is 
 required for external indexes.
 * there is no mechanism for storing light weight indexes within the file to 
 enable push-down filters to skip entire row groups.
 * the type of the rows isn't stored in the file
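
The points about row counts, seeking to a row number, and light weight indexes can be pictured with a small sketch. It assumes, purely for illustration, that the index keeps a row count plus a minimum and maximum value per row group; the description above does not say what the index stores, and RowGroupStats, row_groups_to_read and group_for_row are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class RowGroupStats:
    first_row: int   # row number of the first row in the group
    num_rows: int    # number of rows stored in the group
    min_value: int   # smallest value of the indexed column in the group
    max_value: int   # largest value of the indexed column in the group

def row_groups_to_read(index, lo, hi):
    """Return the row groups that may contain values in [lo, hi]."""
    return [g for g in index if g.max_value >= lo and g.min_value <= hi]

def group_for_row(index, row_number):
    """Find the row group holding row_number using the stored row counts."""
    for g in index:
        if g.first_row <= row_number < g.first_row + g.num_rows:
            return g
    return None
```

A push-down filter such as "value between 60 and 95" would then only touch the groups returned by row_groups_to_read, and an external index holding row numbers could use group_for_row to seek.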



[jira] [Created] (HIVE-4097) ORC file doesn't properly interpret empty hive.io.file.readcolumn.ids

2013-03-01 Thread Owen O'Malley (JIRA)
Owen O'Malley created HIVE-4097:
---

 Summary: ORC file doesn't properly interpret empty 
hive.io.file.readcolumn.ids
 Key: HIVE-4097
 URL: https://issues.apache.org/jira/browse/HIVE-4097
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Owen O'Malley
Assignee: Owen O'Malley


Hive assumes that an empty string in hive.io.file.readcolumn.ids means all 
columns. The ORC reader currently assumes it means no columns.
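
The two interpretations can be compared in a short sketch. The property is assumed here to hold a comma-separated list of column indexes; included_columns and the empty_means_all flag are hypothetical and do not mirror the actual Hive or ORC code.

```python
def included_columns(read_column_ids, num_columns, empty_means_all=True):
    """Return one boolean per column saying whether it should be read."""
    ids = [int(tok) for tok in read_column_ids.split(",") if tok.strip()]
    if not ids:
        # empty_means_all=True models Hive's convention (empty string
        # selects every column); False models the ORC reader behaviour
        # described above (empty string selects no columns).
        return [empty_means_all] * num_columns
    wanted = set(ids)
    return [i in wanted for i in range(num_columns)]
```

Calling included_columns("", 3) yields all three columns under Hive's convention, while passing empty_means_all=False reproduces the mismatch reported here.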



[jira] [Created] (HIVE-4098) OrcInputFormat assumes Hive always calls createValue

2013-03-01 Thread Owen O'Malley (JIRA)
Owen O'Malley created HIVE-4098:
---

 Summary: OrcInputFormat assumes Hive always calls createValue
 Key: HIVE-4098
 URL: https://issues.apache.org/jira/browse/HIVE-4098
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Owen O'Malley
Assignee: Owen O'Malley


Hive's HiveContextAwareRecordReader doesn't create a new value for each 
InputFormat and instead reuses the same row between input formats. That causes 
the first record of the second (and third, etc.) partition to be dropped and 
replaced with the last row of the previous partition.



[jira] [Created] (HIVE-4099) Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask

2013-03-01 Thread Zafar Gilani (JIRA)
Zafar Gilani created HIVE-4099:
--

 Summary: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.MapRedTask
 Key: HIVE-4099
 URL: https://issues.apache.org/jira/browse/HIVE-4099
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.7.1
 Environment: GNU/Linux x86_64, kernel 2.6.32-131.0.15.e16.x86_64, 16 
cores, 48 GB main memory, 16 mappers, 8 reducers, mapred.child.java.opts set to 
2g.
Reporter: Zafar Gilani


Join query fails with Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.MapRedTask.

hive.log:
ERROR exec.MapredLocalTask (SessionState.java:printError(365))
ERROR ql.Driver (SessionState.java:printError(365))

Select and insert queries work fine. Even the simplest join fails.

Data-set size: The two tables being joined have 27k records each, each record 
having three fields.

Already tried and failed:
- Add contrib jar to the hive classpath
- Set Hadoop mapred.child.java.opts to 2 to 8g of memory
- Set Hive mapred.child.java.opts to 2 to 8g of memory
- Set hive.auto.convert.join to true (regular join to mapjoin)
- Set hive.optimize.skewjoin to true (handle skewness in data)
- Set hive.mapjoin.maxsize to 100 (small table rows, both tables have 27k 
rows)



[jira] [Commented] (HIVE-4015) Add ORC file to the grammar as a file format

2013-03-01 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13590722#comment-13590722
 ] 

Owen O'Malley commented on HIVE-4015:
-

Gunther, this looks good. I'd suggest removing the code that lets you override 
the serde, since with ORC you really don't want to do that.

 Add ORC file to the grammar as a file format
 

 Key: HIVE-4015
 URL: https://issues.apache.org/jira/browse/HIVE-4015
 Project: Hive
  Issue Type: Improvement
Reporter: Owen O'Malley
Assignee: Gunther Hagleitner
 Attachments: HIVE-4015.1.patch, HIVE-4015.2.patch, HIVE-4015.3.patch


 It would be much more convenient for users if we enable them to use ORC as a 
 file format in the HQL grammar. 



[jira] [Updated] (HIVE-4097) ORC file doesn't properly interpret empty hive.io.file.readcolumn.ids

2013-03-01 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4097:
--

Attachment: HIVE-4097.D9015.1.patch

omalley requested code review of HIVE-4097 [jira] ORC file doesn't properly 
interpret empty hive.io.file.readcolumn.ids.

Reviewers: JIRA

HIVE-4097

Hive assumes that an empty string in hive.io.file.readcolumn.ids means all 
columns. The ORC reader currently assumes it means no columns.

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D9015

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInputOutputFormat.java

MANAGE HERALD RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/21861/

To: JIRA, omalley


 ORC file doesn't properly interpret empty hive.io.file.readcolumn.ids
 -

 Key: HIVE-4097
 URL: https://issues.apache.org/jira/browse/HIVE-4097
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: HIVE-4097.D9015.1.patch





[jira] [Commented] (HIVE-3874) Create a new Optimized Row Columnar file format for Hive

2013-03-01 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13590757#comment-13590757
 ] 

Kevin Wilfong commented on HIVE-3874:
-

Thanks Pam and Owen.

+1 again

 Create a new Optimized Row Columnar file format for Hive
 

 Key: HIVE-3874
 URL: https://issues.apache.org/jira/browse/HIVE-3874
 Project: Hive
  Issue Type: Improvement
  Components: Serializers/Deserializers
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: hive.3874.2.patch, HIVE-3874.D8529.1.patch, 
 HIVE-3874.D8529.2.patch, HIVE-3874.D8529.3.patch, HIVE-3874.D8529.4.patch, 
 HIVE-3874.D8871.1.patch, OrcFileIntro.pptx, orc.tgz





[jira] [Commented] (HIVE-4045) Modify PreDropPartitionEvent to pass Table parameter

2013-03-01 Thread Li Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13590769#comment-13590769
 ] 

Li Yang commented on HIVE-4045:
---

Namit,

Can you please review this change?

Thanks,
Li

 Modify PreDropPartitionEvent to pass Table parameter
 

 Key: HIVE-4045
 URL: https://issues.apache.org/jira/browse/HIVE-4045
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Li Yang
Assignee: Li Yang
Priority: Minor

 MetaStorePreEventListener which implements onEvent(PreEventContext context) 
 sometimes needs to access Table properties when PreDropPartitionEvent is 
 listened to.



[jira] [Work started] (HIVE-4045) Modify PreDropPartitionEvent to pass Table parameter

2013-03-01 Thread Li Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-4045 started by Li Yang.

 Modify PreDropPartitionEvent to pass Table parameter
 

 Key: HIVE-4045
 URL: https://issues.apache.org/jira/browse/HIVE-4045
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Li Yang
Assignee: Li Yang
Priority: Minor




[jira] [Updated] (HIVE-4045) Modify PreDropPartitionEvent to pass Table parameter

2013-03-01 Thread Li Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Yang updated HIVE-4045:
--

Status: Patch Available  (was: In Progress)

 Modify PreDropPartitionEvent to pass Table parameter
 

 Key: HIVE-4045
 URL: https://issues.apache.org/jira/browse/HIVE-4045
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Li Yang
Assignee: Li Yang
Priority: Minor




[jira] [Created] (HIVE-4100) Improve regex_replace UDF to allow non-ascii characters

2013-03-01 Thread Mark Grover (JIRA)
Mark Grover created HIVE-4100:
-

 Summary: Improve regex_replace UDF to allow non-ascii characters
 Key: HIVE-4100
 URL: https://issues.apache.org/jira/browse/HIVE-4100
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Affects Versions: 0.10.0
Reporter: Mark Grover
Assignee: Mark Grover
 Fix For: 0.11.0


There have been a few email threads on the user mailing list regarding the 
regex_replace UDF not supporting non-ASCII characters. We should validate that 
and improve the UDF to allow it. The Translate UDF will be a good reference since 
it does that by using code points instead of characters.
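
To make the distinction concrete: in Java a char is a UTF-16 code unit, so a character outside the Basic Multilingual Plane occupies two of them, while a code point always denotes one whole character. The Python sketch below only illustrates that difference; it is not the Translate UDF's code, and the function names are made up for the example.

```python
def utf16_code_units(text):
    """Split text into UTF-16 code units, i.e. what Java calls chars."""
    raw = text.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]

def translate_by_code_point(text, src, dst):
    """Map each code point in src to the code point at the same position in dst."""
    mapping = {ord(s): ord(d) for s, d in zip(src, dst)}
    return "".join(chr(mapping.get(ord(ch), ord(ch))) for ch in text)
```

Code that walks a string one UTF-16 unit at a time sees the two halves of a surrogate pair separately, whereas the code-point mapping above treats such a character as a single unit.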



[jira] [Created] (HIVE-4101) Partition By field must be in select field list

2013-03-01 Thread Brock Noland (JIRA)
Brock Noland created HIVE-4101:
--

 Summary: Partition By field must be in select field list
 Key: HIVE-4101
 URL: https://issues.apache.org/jira/browse/HIVE-4101
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Brock Noland


The following query:
{noformat}
SELECT year, quarter, sales,avg(sales) OVER (PARTITION BY department, year)
FROM quarterly_sales
WHERE department = 'Appliances';
{noformat}

fails as below. If department is moved to the select field list it passes.

{noformat}
Diagnostic Messages for this Task:java.lang.RuntimeException: Error in 
configuring object
 at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
 at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
 at 
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
 at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:485)
 at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420)
 at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
 at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.reflect.InvocationTargetException
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
 ... 9 more
Caused by: java.lang.RuntimeException: Reduce operator initialization failed
 at 
org.apache.hadoop.hive.ql.exec.ExecReducer.configure(ExecReducer.java:160)
 ... 14 more
Caused by: java.lang.RuntimeException: cannot find field _col0 from [0:_col1, 
1:_col2, 2:_col3]
 at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:366)
 at 
org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:143)
 at 
org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:57)
 at 
org.apache.hadoop.hive.ql.exec.PTFOperator.setupKeysWrapper(PTFOperator.java:193)
 at 
org.apache.hadoop.hive.ql.exec.PTFOperator.initializeOp(PTFOperator.java:100)
 at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:377)
 at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
 at 
org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:409)
 at 
org.apache.hadoop.hive.ql.exec.ExtractOperator.initializeOp(ExtractOperator.java:40)
 at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:377)
 at 
org.apache.hadoop.hive.ql.exec.ExecReducer.configure(ExecReducer.java:152)
 ... 14 more
{noformat}



[jira] [Commented] (HIVE-4101) Partition By field must be in select field list

2013-03-01 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13590789#comment-13590789
 ] 

Ashutosh Chauhan commented on HIVE-4101:


This is the same as HIVE-4085. It is caused by HIVE-4035, before which such a query 
used to succeed.

 Partition By field must be in select field list
 ---

 Key: HIVE-4101
 URL: https://issues.apache.org/jira/browse/HIVE-4101
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Brock Noland




[jira] [Resolved] (HIVE-4101) Partition By field must be in select field list

2013-03-01 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland resolved HIVE-4101.


Resolution: Duplicate

Thanks! Resolving as a duplicate.

 Partition By field must be in select field list
 ---

 Key: HIVE-4101
 URL: https://issues.apache.org/jira/browse/HIVE-4101
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Brock Noland


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HIVE-3985) Update new UDAFs introduced for Windowing to work with new Decimal Type

2013-03-01 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland reassigned HIVE-3985:
--

Assignee: Brock Noland

 Update new UDAFs introduced for Windowing to work with new Decimal Type
 ---

 Key: HIVE-3985
 URL: https://issues.apache.org/jira/browse/HIVE-3985
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Harish Butani
Assignee: Brock Noland





[jira] [Commented] (HIVE-4100) Improve regex_replace UDF to allow non-ascii characters

2013-03-01 Thread thattommyhall (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13590803#comment-13590803
 ] 

thattommyhall commented on HIVE-4100:
-

The particular case I had was
regexp_replace(some_column,[^\\u-\\u],\ufffd)
does not work, whereas 
regexp_replace(some_column,[^\\u-\\u],�)
does.

So we need a way to specify unicode chars in the replace string.
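One possible way to do that, sketched here as plain Java rather than as the UDF's actual code: translate literal backslash-u escapes in the replacement string into the characters they name before handing the string to the regex engine. The class name UnicodeReplacementSketch and the helpers unescapeUnicode and replaceWithEscapes are made up for illustration and are not part of Hive.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnicodeReplacementSketch {
    // Matches a literal backslash, the letter u, and exactly four hex digits.
    private static final Pattern UNICODE_ESCAPE =
            Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    /** Hypothetical helper: decodes each escape into the single character it names. */
    static String unescapeUnicode(String s) {
        Matcher m = UNICODE_ESCAPE.matcher(s);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char c = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement keeps '$' and '\' from being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Hypothetical helper: regex replace whose replacement may contain escapes. */
    static String replaceWithEscapes(String input, String regex, String replacement) {
        // Assumption: the replacement is treated as literal text after decoding,
        // so group references such as $1 are deliberately not supported here.
        return input.replaceAll(regex, Matcher.quoteReplacement(unescapeUnicode(replacement)));
    }

    public static void main(String[] args) {
        // The replacement argument is six literal characters: backslash, u, f, f, f, d.
        String cleaned = replaceWithEscapes("a\u00e9b", "[^a-z]", "\\ufffd");
        System.out.println(cleaned.equals("a\ufffdb"));
    }
}
```

Whether the real UDF should decode escapes itself or leave that to the query parser is a design question this sketch does not settle.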

 Improve regex_replace UDF to allow non-ascii characters
 ---

 Key: HIVE-4100
 URL: https://issues.apache.org/jira/browse/HIVE-4100
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Affects Versions: 0.10.0
Reporter: Mark Grover
Assignee: Mark Grover
 Fix For: 0.11.0


 There have been a few email threads on the user mailing list regarding the 
 regex_replace UDF not supporting non-ASCII characters. We should validate 
 that and improve the UDF to allow them. The Translate UDF will be a good 
 reference, since it handles this by using code points instead of characters.



[jira] [Updated] (HIVE-4093) Remove sprintf from PTFTranslator and use String.format()

2013-03-01 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-4093:
---

Status: Patch Available  (was: Open)

 Remove sprintf from PTFTranslator and use String.format()
 -

 Key: HIVE-4093
 URL: https://issues.apache.org/jira/browse/HIVE-4093
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Brock Noland
Assignee: Brock Noland
Priority: Minor
 Attachments: HIVE-4093-0.patch, HIVE-4093-1.patch






[jira] [Updated] (HIVE-4045) Modify PreDropPartitionEvent to pass Table parameter

2013-03-01 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4045:


Status: Open  (was: Patch Available)

 Modify PreDropPartitionEvent to pass Table parameter
 

 Key: HIVE-4045
 URL: https://issues.apache.org/jira/browse/HIVE-4045
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Li Yang
Assignee: Li Yang
Priority: Minor

 MetaStorePreEventListener which implements onEvent(PreEventContext context) 
 sometimes needs to access Table properties when PreDropPartitionEvent is 
 listened to.



[jira] [Commented] (HIVE-4045) Modify PreDropPartitionEvent to pass Table parameter

2013-03-01 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13590867#comment-13590867
 ] 

Kevin Wilfong commented on HIVE-4045:
-

Comments on Phabricator

 Modify PreDropPartitionEvent to pass Table parameter
 

 Key: HIVE-4045
 URL: https://issues.apache.org/jira/browse/HIVE-4045
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Li Yang
Assignee: Li Yang
Priority: Minor

 MetaStorePreEventListener which implements onEvent(PreEventContext context) 
 sometimes needs to access Table properties when PreDropPartitionEvent is 
 listened to.



[jira] [Created] (HIVE-4102) Can't drop table with Postgresql metastore

2013-03-01 Thread sekine coulibaly (JIRA)
sekine coulibaly created HIVE-4102:
--

 Summary: Can't drop table with Postgresql metastore
 Key: HIVE-4102
 URL: https://issues.apache.org/jira/browse/HIVE-4102
 Project: Hive
  Issue Type: Bug
  Components: Database/Schema
Affects Versions: 0.10.0
 Environment: Centos 6.3
CDH 4.2.0
Reporter: sekine coulibaly


Set up a fresh Hive install and create a table pointing to an HDFS file.
Then, when trying to drop that table, the CLI hangs for a while and then 
displays:

hive> drop table log;
FAILED: Error in metadata: org.apache.thrift.transport.TTransportException: 
java.net.SocketTimeoutException: Read timed out
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.DDLTask

Trying a second time:

hive> drop table log;
FAILED: SemanticException [Error 10001]: Table not found log

Getting the list of tables:

hive> show tables;
FAILED: Error in metadata: MetaException(message:Got exception: 
org.apache.thrift.transport.TTransportException 
java.net.SocketTimeoutException: Read timed out)
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.DDLTask
hive>

For this last query, the PostgreSQL logs show the following:
LOG:  connection received: host=127.0.0.1 port=49717
LOG:  connection authorized: user=hiveuser database=metastore
LOG:  execute unnamed: SHOW TRANSACTION ISOLATION LEVEL
LOG:  execute S_1: BEGIN
LOG:  execute unnamed: SELECT 
'org.apache.hadoop.hive.metastore.model.MDatabase' AS 
NUCLEUS_TYPE,THIS.DESC,THIS.DB_LOCATION_URI,THIS.NAME,THIS.DB_ID
 FROM DBS THIS WHERE THIS.NAME = $1
DETAIL:  parameters: $1 = 'default'
LOG:  execute unnamed: SELECT A0.PARAM_KEY,A0.PARAM_VALUE FROM 
DATABASE_PARAMS A0 WHERE A0.DB_ID = $1 AND A0.PARAM_KEY IS NOT NULL
DETAIL:  parameters: $1 = '1'
LOG:  execute S_2: COMMIT
LOG:  execute unnamed: SHOW TRANSACTION ISOLATION LEVEL
LOG:  execute S_1: BEGIN
WARNING:  nonstandard use of \\ in a string literal at character 234
HINT:  Use the escape string syntax for backslashes, e.g., E'\\'.

(standard_conforming_strings = off).

Would this help?
http://mapredit.blogspot.fr/2012/12/hive-drop-table-hangs-postgres-metastore.html






Hive-trunk-h0.21 - Build # 1995 - Failure

2013-03-01 Thread Apache Jenkins Server
Changes for Build #1995



No tests ran.

The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1995)

Status: Failure

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1995/ to 
view the results.

[jira] [Updated] (HIVE-4097) ORC file doesn't properly interpret empty hive.io.file.readcolumn.ids

2013-03-01 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-4097:


Status: Patch Available  (was: Open)

This patch fixes the problem and adds a test case to ensure that the empty 
string is correctly handled.

 ORC file doesn't properly interpret empty hive.io.file.readcolumn.ids
 -

 Key: HIVE-4097
 URL: https://issues.apache.org/jira/browse/HIVE-4097
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: HIVE-4097.D9015.1.patch


 Hive assumes that an empty string in hive.io.file.readcolumn.ids means all 
 columns. The ORC reader currently assumes it means no columns.
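The convention described above can be sketched as follows. This is not the ORC reader's code or the attached patch; ReadColumnIdsSketch and includedColumns are hypothetical names, and the only behaviour taken from the issue is that an empty hive.io.file.readcolumn.ids value should select every column.

```java
import java.util.Arrays;

class ReadColumnIdsSketch {
    /**
     * Hypothetical helper: returns one flag per column, true when the column
     * should be read. readColumnIds is a comma-separated list of column indexes.
     */
    static boolean[] includedColumns(String readColumnIds, int columnCount) {
        boolean[] include = new boolean[columnCount];
        if (readColumnIds == null || readColumnIds.trim().isEmpty()) {
            // Empty setting: follow Hive's convention and read all columns.
            Arrays.fill(include, true);
            return include;
        }
        for (String id : readColumnIds.split(",")) {
            int index = Integer.parseInt(id.trim());
            // Assumption: out-of-range ids are ignored rather than rejected.
            if (index >= 0 && index < columnCount) {
                include[index] = true;
            }
        }
        return include;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(includedColumns("", 3)));
        System.out.println(Arrays.toString(includedColumns("0,2", 3)));
    }
}
```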



[jira] [Created] (HIVE-4103) Remove System.gc() call from the map-join local-task loop

2013-03-01 Thread Gopal V (JIRA)
Gopal V created HIVE-4103:
-

 Summary: Remove System.gc() call from the map-join local-task loop
 Key: HIVE-4103
 URL: https://issues.apache.org/jira/browse/HIVE-4103
 Project: Hive
  Issue Type: Bug
Reporter: Gopal V


Hive's HashMapWrapper calls System.gc() twice within the 
HashMapWrapper::isAbort() which produces a significant slow-down during the 
loop.

{code}
2013-03-01 04:54:28 The gc calls took 677 ms
2013-03-01 04:54:28 Processing rows:20  Hashtable size: 19  
Memory usage:   62955432rate:   0.033
2013-03-01 04:54:31 The gc calls took 956 ms
2013-03-01 04:54:31 Processing rows:30  Hashtable size: 29  
Memory usage:   90826656rate:   0.048
2013-03-01 04:54:33 The gc calls took 967 ms
2013-03-01 04:54:33 Processing rows:384160  Hashtable size: 384160  
Memory usage:   114412712   rate:   0.06
{code}
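As a rough illustration of the alternative, a memory check can read the heap counters without forcing a collection. This is only a sketch under assumptions: MemoryCheckSketch, heapUsageRate and shouldAbort are invented names, and the actual isAbort() logic in HashMapWrapper is not reproduced here.

```java
class MemoryCheckSketch {
    /** Fraction of the maximum heap currently allocated to objects, live or not. */
    static double heapUsageRate() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return (double) used / rt.maxMemory();
    }

    /** Hypothetical abort test: true once usage exceeds the given threshold. */
    static boolean shouldAbort(double maxUsageRate) {
        // No System.gc() here, so the call returns immediately.
        return heapUsageRate() > maxUsageRate;
    }

    public static void main(String[] args) {
        System.out.println("rate: " + heapUsageRate());
        System.out.println("abort at 0.90? " + shouldAbort(0.90));
    }
}
```

The trade-off is that, without a preceding collection, the reading includes garbage that has not been reclaimed yet, so the rate may overestimate live data.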



[jira] [Updated] (HIVE-4098) OrcInputFormat assumes Hive always calls createValue

2013-03-01 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4098:
--

Attachment: HIVE-4098.D9021.1.patch

omalley requested code review of HIVE-4098 [jira] OrcInputFormat assumes Hive 
always calls createValue.

Reviewers: JIRA

hive-4098 remove assumption that only an inputformat's createValue() is used in 
the next() calls.

Hive's HiveContextAwareRecordReader doesn't create a new value for each 
InputFormat and instead reuses the same row between input formats. That causes 
the first record of the second (and third, etc.) partition to be dropped and 
replaced with the last row of the previous partition.

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D9021

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java


To: JIRA, omalley


 OrcInputFormat assumes Hive always calls createValue
 

 Key: HIVE-4098
 URL: https://issues.apache.org/jira/browse/HIVE-4098
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: HIVE-4098.D9021.1.patch


 Hive's HiveContextAwareRecordReader doesn't create a new value for each 
 InputFormat and instead reuses the same row between input formats. That 
 causes the first record of the second (and third, etc.) partition to be 
 dropped and replaced with the last row of the previous partition.



[jira] [Updated] (HIVE-4103) Remove System.gc() call from the map-join local-task loop

2013-03-01 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-4103:
--

Priority: Minor  (was: Major)

 Remove System.gc() call from the map-join local-task loop
 -

 Key: HIVE-4103
 URL: https://issues.apache.org/jira/browse/HIVE-4103
 Project: Hive
  Issue Type: Bug
Reporter: Gopal V
Priority: Minor

 Hive's HashMapWrapper calls System.gc() twice within the 
 HashMapWrapper::isAbort() which produces a significant slow-down during the 
 loop.
 {code}
 2013-03-01 04:54:28 The gc calls took 677 ms
 2013-03-01 04:54:28 Processing rows:20  Hashtable size: 
 19  Memory usage:   62955432rate:   0.033
 2013-03-01 04:54:31 The gc calls took 956 ms
 2013-03-01 04:54:31 Processing rows:30  Hashtable size: 
 29  Memory usage:   90826656rate:   0.048
 2013-03-01 04:54:33 The gc calls took 967 ms
 2013-03-01 04:54:33 Processing rows:384160  Hashtable size: 
 384160  Memory usage:   114412712   rate:   0.06
 {code}



[jira] [Updated] (HIVE-4098) OrcInputFormat assumes Hive always calls createValue

2013-03-01 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-4098:


Status: Patch Available  (was: Open)

The patch removes the assumption of a dedicated row for each RecordReader.
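To make the idea concrete, here is a toy model, not the ORC code or the patch: a reader whose next() fills whichever row object the caller hands it, so it still works when one row is shared across readers. Row, PartitionReader and readAll are hypothetical stand-ins for the Hive types involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class SharedRowSketch {
    /** Stand-in for a row object; not Hive's actual row type. */
    static final class Row {
        String value;
    }

    /** Hypothetical reader over the records of one partition. */
    static final class PartitionReader {
        private final Iterator<String> source;

        PartitionReader(List<String> records) {
            this.source = records.iterator();
        }

        Row createValue() {
            return new Row();
        }

        /** Fills the row passed in, which may have been created by another reader. */
        boolean next(Row row) {
            if (!source.hasNext()) {
                return false;
            }
            row.value = source.next();
            return true;
        }
    }

    /** Reads all partitions through a single shared row, as the issue describes. */
    static List<String> readAll(List<List<String>> partitions) {
        List<String> seen = new ArrayList<String>();
        Row shared = null;
        for (List<String> partition : partitions) {
            PartitionReader reader = new PartitionReader(partition);
            if (shared == null) {
                shared = reader.createValue(); // only the first reader creates the row
            }
            while (reader.next(shared)) {
                seen.add(shared.value);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        List<List<String>> partitions = Arrays.asList(
                Arrays.asList("a1", "a2"), Arrays.asList("b1", "b2"));
        System.out.println(readAll(partitions));
    }
}
```

Because next() never relies on a row it created itself, the first record of the second partition is read like any other.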

 OrcInputFormat assumes Hive always calls createValue
 

 Key: HIVE-4098
 URL: https://issues.apache.org/jira/browse/HIVE-4098
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: HIVE-4098.D9021.1.patch


 Hive's HiveContextAwareRecordReader doesn't create a new value for each 
 InputFormat and instead reuses the same row between input formats. That 
 causes the first record of the second (and third, etc.) partition to be 
 dropped and replaced with the last row of the previous partition.



[jira] [Updated] (HIVE-4071) Map-join outer join produces incorrect results.

2013-03-01 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-4071:
-

Attachment: HIVE-4071_2.patch

Updated to remove the dead code. Still needs work to address tests comment.

 Map-join outer join produces incorrect results.
 ---

 Key: HIVE-4071
 URL: https://issues.apache.org/jira/browse/HIVE-4071
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-4071_2.patch, HIVE-4071.patch


 For example, if one sets the size of noConditionalTask.size to 10 with 
 corresponding auto join configurations set to true in auto_join28.q instead 
 of the current smalltable.filesize configuration, we will observe different 
 results if a select query is run. (The test only has explain statements at 
 present).



[jira] [Commented] (HIVE-2655) Ability to define functions in HQL

2013-03-01 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13590981#comment-13590981
 ] 

Carl Steinbach commented on HIVE-2655:
--

@Brock: Can you open a new phabricator review request for your updated version 
of the patch? Thanks.

 Ability to define functions in HQL
 --

 Key: HIVE-2655
 URL: https://issues.apache.org/jira/browse/HIVE-2655
 Project: Hive
  Issue Type: New Feature
  Components: SQL
Reporter: Jonathan Perlow
Assignee: Brock Noland
 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2655.D915.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2655.D915.2.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2655.D915.3.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2655.D915.4.patch, HIVE-2655-9.patch


 Ability to create functions in HQL as a substitute for creating them in Java.
 Jonathan Chang requested I create this issue.



[jira] [Updated] (HIVE-4103) Remove System.gc() call from the map-join local-task loop

2013-03-01 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-4103:
--

Attachment: HIVE-4103.patch

Remove the thread-stopping System.gc() calls from isAbort()

 Remove System.gc() call from the map-join local-task loop
 -

 Key: HIVE-4103
 URL: https://issues.apache.org/jira/browse/HIVE-4103
 Project: Hive
  Issue Type: Bug
Reporter: Gopal V
Priority: Minor
 Attachments: HIVE-4103.patch


 Hive's HashMapWrapper calls System.gc() twice within the 
 HashMapWrapper::isAbort() which produces a significant slow-down during the 
 loop.
 {code}
 2013-03-01 04:54:28 The gc calls took 677 ms
 2013-03-01 04:54:28 Processing rows:20  Hashtable size: 
 19  Memory usage:   62955432rate:   0.033
 2013-03-01 04:54:31 The gc calls took 956 ms
 2013-03-01 04:54:31 Processing rows:30  Hashtable size: 
 29  Memory usage:   90826656rate:   0.048
 2013-03-01 04:54:33 The gc calls took 967 ms
 2013-03-01 04:54:33 Processing rows:384160  Hashtable size: 
 384160  Memory usage:   114412712   rate:   0.06
 {code}



[jira] [Commented] (HIVE-2655) Ability to define functions in HQL

2013-03-01 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13590986#comment-13590986
 ] 

Brock Noland commented on HIVE-2655:


[~cwsteinbach] I did, here: https://reviews.facebook.net/D8673, and set it up as a 
web link. Not sure why it's not updating the JIRA automatically.

 Ability to define functions in HQL
 --

 Key: HIVE-2655
 URL: https://issues.apache.org/jira/browse/HIVE-2655
 Project: Hive
  Issue Type: New Feature
  Components: SQL
Reporter: Jonathan Perlow
Assignee: Brock Noland
 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2655.D915.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2655.D915.2.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2655.D915.3.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2655.D915.4.patch, HIVE-2655-9.patch


 Ability to create functions in HQL as a substitute for creating them in Java.
 Jonathan Chang requested I create this issue.



[jira] [Commented] (HIVE-4103) Remove System.gc() call from the map-join local-task loop

2013-03-01 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13590988#comment-13590988
 ] 

Gopal V commented on HIVE-4103:
---

On one run, the difference was:

{code}
2013-03-01 04:57:21 Upload 1 File to: 
file:/tmp/root/hive_2013-03-01_16-56-53_785_1192800933446838868/-local-10002/HashTable-Stage-1/MapJoin-demographics-01--.hashtable
 File size: 18426794
2013-03-01 04:57:21 End of local task; Time Taken: 22.426 sec.
{code}

versus, after the fix:

{code}
2013-03-01 04:56:26 Upload 1 File to: 
file:/tmp/root/hive_2013-03-01_16-56-01_539_5116929752955084952/-local-10002/HashTable-Stage-1/MapJoin-demographics-01--.hashtable
 File size: 18426794
2013-03-01 04:56:26 End of local task; Time Taken: 19.874 sec.
{code}

 Remove System.gc() call from the map-join local-task loop
 -

 Key: HIVE-4103
 URL: https://issues.apache.org/jira/browse/HIVE-4103
 Project: Hive
  Issue Type: Bug
Reporter: Gopal V
Priority: Minor
 Attachments: HIVE-4103.patch


 Hive's HashMapWrapper calls System.gc() twice within the 
 HashMapWrapper::isAbort() which produces a significant slow-down during the 
 loop.
 {code}
 2013-03-01 04:54:28 The gc calls took 677 ms
 2013-03-01 04:54:28 Processing rows:20  Hashtable size: 
 19  Memory usage:   62955432rate:   0.033
 2013-03-01 04:54:31 The gc calls took 956 ms
 2013-03-01 04:54:31 Processing rows:30  Hashtable size: 
 29  Memory usage:   90826656rate:   0.048
 2013-03-01 04:54:33 The gc calls took 967 ms
 2013-03-01 04:54:33 Processing rows:384160  Hashtable size: 
 384160  Memory usage:   114412712   rate:   0.06
 {code}



[jira] [Updated] (HIVE-4103) Remove System.gc() call from the map-join local-task loop

2013-03-01 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-4103:
--

Release Note: Remove System.gc() calls from HashMapWrapper::isAbort() to 
avoid slow-downs during local task of the map-join
  Status: Patch Available  (was: Open)

 Remove System.gc() call from the map-join local-task loop
 -

 Key: HIVE-4103
 URL: https://issues.apache.org/jira/browse/HIVE-4103
 Project: Hive
  Issue Type: Bug
Reporter: Gopal V
Priority: Minor
 Attachments: HIVE-4103.patch


 Hive's HashMapWrapper calls System.gc() twice within the 
 HashMapWrapper::isAbort() which produces a significant slow-down during the 
 loop.
 {code}
 2013-03-01 04:54:28 The gc calls took 677 ms
 2013-03-01 04:54:28 Processing rows:20  Hashtable size: 
 19  Memory usage:   62955432rate:   0.033
 2013-03-01 04:54:31 The gc calls took 956 ms
 2013-03-01 04:54:31 Processing rows:30  Hashtable size: 
 29  Memory usage:   90826656rate:   0.048
 2013-03-01 04:54:33 The gc calls took 967 ms
 2013-03-01 04:54:33 Processing rows:384160  Hashtable size: 
 384160  Memory usage:   114412712   rate:   0.06
 {code}



[jira] [Assigned] (HIVE-4103) Remove System.gc() call from the map-join local-task loop

2013-03-01 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V reassigned HIVE-4103:
-

Assignee: Gopal V

 Remove System.gc() call from the map-join local-task loop
 -

 Key: HIVE-4103
 URL: https://issues.apache.org/jira/browse/HIVE-4103
 Project: Hive
  Issue Type: Bug
Reporter: Gopal V
Assignee: Gopal V
Priority: Minor
 Attachments: HIVE-4103.patch


 Hive's HashMapWrapper calls System.gc() twice within the 
 HashMapWrapper::isAbort() which produces a significant slow-down during the 
 loop.
 {code}
 2013-03-01 04:54:28 The gc calls took 677 ms
 2013-03-01 04:54:28 Processing rows:20  Hashtable size: 
 19  Memory usage:   62955432rate:   0.033
 2013-03-01 04:54:31 The gc calls took 956 ms
 2013-03-01 04:54:31 Processing rows:30  Hashtable size: 
 29  Memory usage:   90826656rate:   0.048
 2013-03-01 04:54:33 The gc calls took 967 ms
 2013-03-01 04:54:33 Processing rows:384160  Hashtable size: 
 384160  Memory usage:   114412712   rate:   0.06
 {code}



[jira] [Created] (HIVE-4104) Hive localtask does not buffer disk-writes or reads

2013-03-01 Thread Gopal V (JIRA)
Gopal V created HIVE-4104:
-

 Summary: Hive localtask does not buffer disk-writes or reads
 Key: HIVE-4104
 URL: https://issues.apache.org/jira/browse/HIVE-4104
 Project: Hive
  Issue Type: Bug
Reporter: Gopal V
Assignee: Gopal V
Priority: Minor


Hive's HashMapWrapper does not use any buffering in its file I/O, but operates 
sequentially for writes and reads.

The strace logs show this clearly:

{code}
9495  write(222, x, 1)= 1
9495  write(222, sq\0~\0\5, 6)= 6
9495  write(222, w\25, 2) = 2
9495  write(222, \0\0\0\1\0\0\0\1\0\0\0\2\0\0\0\5\3\1M\1S, 21) = 21
9495  write(222, x, 1)= 1
9495  write(222, sq\0~\0\2, 6)= 6
9495  write(222, w\t, 2)  = 2
9495  write(222, \0\0\0\5\1\215\r\325v, 9) = 9
{code}
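For illustration only, the usual way to avoid such tiny write() calls in Java is to put a buffered stream between the object stream and the file. The sketch below is not the attached patch: BufferedHashTableIoSketch, its method names, and the 8192-byte buffer size are assumptions, and a plain HashMap stands in for the real hash table.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;

class BufferedHashTableIoSketch {
    private static final int BUFFER_SIZE = 8192; // illustrative size, not from the patch

    /** Serializes the table through a buffer so bytes reach the file in large chunks. */
    static void dump(File file, HashMap<String, Integer> table) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(
                new BufferedOutputStream(new FileOutputStream(file), BUFFER_SIZE))) {
            out.writeObject(table);
        }
    }

    /** Reads the table back through a buffered input stream. */
    static HashMap<String, Integer> load(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(
                new BufferedInputStream(new FileInputStream(file), BUFFER_SIZE))) {
            @SuppressWarnings("unchecked")
            HashMap<String, Integer> table = (HashMap<String, Integer>) in.readObject();
            return table;
        }
    }

    /** Convenience wrapper for trying the sketch: dump to a temp file, then load it. */
    static HashMap<String, Integer> roundTrip(HashMap<String, Integer> table) {
        try {
            File file = File.createTempFile("hashtable-sketch", ".bin");
            try {
                dump(file, table);
                return load(file);
            } finally {
                file.delete();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }
}
```

With the buffer in place, the small writes seen in the strace output would be collected in memory and flushed to the file descriptor in larger blocks.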




[jira] [Commented] (HIVE-4104) Hive localtask does not buffer disk-writes or reads

2013-03-01 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13591008#comment-13591008
 ] 

Gopal V commented on HIVE-4104:
---

Before

{code}
2013-03-01 05:15:13 Dump the hashtable into file: 
file:/tmp/root/hive_2013-03-01_17-14-59_468_442960319525994949/-local-10002/HashTable-Stage-1/MapJoin-customer_demographics-01--.hashtable
2013-03-01 05:15:27 Upload 1 File to: 
file:/tmp/root/hive_2013-03-01_17-14-59_468_442960319525994949/-local-10002/HashTable-Stage-1/MapJoin-customer_demographics-01--.hashtable
 File size: 18426794
2013-03-01 05:15:27 End of local task; Time Taken: 22.314 sec.
{code}

After

{code}
2013-03-01 05:15:53 Dump the hashtable into file: 
file:/tmp/root/hive_2013-03-01_17-15-39_668_1531738824783900468/-local-10002/HashTable-Stage-1/MapJoin-demographics-01--.hashtable
2013-03-01 05:15:54 Upload 1 File to: 
file:/tmp/root/hive_2013-03-01_17-15-39_668_1531738824783900468/-local-10002/HashTable-Stage-1/MapJoin-demographics-01--.hashtable
 File size: 18426794
2013-03-01 05:15:54 End of local task; Time Taken: 9.601 sec.
{code}

Savings are found on the map-side read as well.

Before

{code}
Job 0: Map: 4   Cumulative CPU: 64.79 sec   HDFS Read: 300156 HDFS Write: 1682 
SUCCESS
Total MapReduce CPU Time Spent: 1 minutes 4 seconds 790 msec

Time taken: 56.385 seconds, Fetched: 100 row(s)
{code}

After

{code}
Job 0: Map: 4   Cumulative CPU: 26.95 sec   HDFS Read: 300156 HDFS Write: 1682 
SUCCESS
Total MapReduce CPU Time Spent: 26 seconds 950 msec

Time taken: 38.173 seconds, Fetched: 100 row(s)
{code}

 Hive localtask does not buffer disk-writes or reads
 ---

 Key: HIVE-4104
 URL: https://issues.apache.org/jira/browse/HIVE-4104
 Project: Hive
  Issue Type: Bug
Reporter: Gopal V
Assignee: Gopal V
Priority: Minor

 Hive's HashMapWrapper does not use any buffering in its file I/O, but 
 operates sequentially for writes and reads.
 The strace logs show this clearly:
 {code}
 9495  write(222, x, 1)= 1
 9495  write(222, sq\0~\0\5, 6)= 6
 9495  write(222, w\25, 2) = 2
 9495  write(222, \0\0\0\1\0\0\0\1\0\0\0\2\0\0\0\5\3\1M\1S, 21) = 21
 9495  write(222, x, 1)= 1
 9495  write(222, sq\0~\0\2, 6)= 6
 9495  write(222, w\t, 2)  = 2
 9495  write(222, \0\0\0\5\1\215\r\325v, 9) = 9
 {code}



[jira] [Commented] (HIVE-4104) Hive localtask does not buffer disk-writes or reads

2013-03-01 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13591010#comment-13591010
 ] 

Brock Noland commented on HIVE-4104:


Nice find!

 Hive localtask does not buffer disk-writes or reads
 ---

 Key: HIVE-4104
 URL: https://issues.apache.org/jira/browse/HIVE-4104
 Project: Hive
  Issue Type: Bug
Reporter: Gopal V
Assignee: Gopal V
Priority: Minor
 Attachments: HIVE-4104.patch


 Hive's HashMapWrapper does not use any buffering in its file I/O, but 
 operates sequentially for writes and reads.
 The strace logs show this clearly:
 {code}
 9495  write(222, x, 1)= 1
 9495  write(222, sq\0~\0\5, 6)= 6
 9495  write(222, w\25, 2) = 2
 9495  write(222, \0\0\0\1\0\0\0\1\0\0\0\2\0\0\0\5\3\1M\1S, 21) = 21
 9495  write(222, x, 1)= 1
 9495  write(222, sq\0~\0\2, 6)= 6
 9495  write(222, w\t, 2)  = 2
 9495  write(222, \0\0\0\5\1\215\r\325v, 9) = 9
 {code}



[jira] [Updated] (HIVE-4104) Hive localtask does not buffer disk-writes or reads

2013-03-01 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-4104:
--

Attachment: HIVE-4104.patch

Buffer I/O for HashMapWrapper

 Hive localtask does not buffer disk-writes or reads
 ---

 Key: HIVE-4104
 URL: https://issues.apache.org/jira/browse/HIVE-4104
 Project: Hive
  Issue Type: Bug
Reporter: Gopal V
Assignee: Gopal V
Priority: Minor
 Attachments: HIVE-4104.patch


 Hive's HashMapWrapper does not use any buffering in its file I/O, but 
 operates sequentially for writes and reads.
 The strace logs show this clearly:
 {code}
 9495  write(222, x, 1)= 1
 9495  write(222, sq\0~\0\5, 6)= 6
 9495  write(222, w\25, 2) = 2
 9495  write(222, \0\0\0\1\0\0\0\1\0\0\0\2\0\0\0\5\3\1M\1S, 21) = 21
 9495  write(222, x, 1)= 1
 9495  write(222, sq\0~\0\2, 6)= 6
 9495  write(222, w\t, 2)  = 2
 9495  write(222, \0\0\0\5\1\215\r\325v, 9) = 9
 {code}



[jira] [Updated] (HIVE-4104) Hive localtask does not buffer disk-writes or reads

2013-03-01 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-4104:
--

Release Note: Buffer I/O on HashMapWrapper to speed up write/read ops
  Status: Patch Available  (was: Open)

 Hive localtask does not buffer disk-writes or reads
 ---

 Key: HIVE-4104
 URL: https://issues.apache.org/jira/browse/HIVE-4104
 Project: Hive
  Issue Type: Bug
Reporter: Gopal V
Assignee: Gopal V
Priority: Minor
 Attachments: HIVE-4104.patch


 Hive's HashMapWrapper does not use any buffering in its file I/O, but 
 operates sequentially for writes and reads.
 The strace logs show this clearly:
 {code}
 9495  write(222, x, 1)= 1
 9495  write(222, sq\0~\0\5, 6)= 6
 9495  write(222, w\25, 2) = 2
 9495  write(222, \0\0\0\1\0\0\0\1\0\0\0\2\0\0\0\5\3\1M\1S, 21) = 21
 9495  write(222, x, 1)= 1
 9495  write(222, sq\0~\0\2, 6)= 6
 9495  write(222, w\t, 2)  = 2
 9495  write(222, \0\0\0\5\1\215\r\325v, 9) = 9
 {code}



[jira] [Updated] (HIVE-3963) Allow Hive to connect to RDBMS

2013-03-01 Thread Maxime LANCIAUX (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maxime LANCIAUX updated HIVE-3963:
--

Description: 
I am thinking about something like:

SELECT jdbcload('driver','url','user','password','sql') FROM dual;

There is already a JIRA https://issues.apache.org/jira/browse/HIVE-1555 for 
JDBCStorageHandler

  was:
I am thinking about something like :

CREATE JDBCEXTERNAL TABLE (
 col1 int,
 col2 string
)
TBLPROPERTIES ...

and/or

SELECT jdbcload('driver','url','user','password','sql') FROM dual;


 Allow Hive to connect to RDBMS
 --

 Key: HIVE-3963
 URL: https://issues.apache.org/jira/browse/HIVE-3963
 Project: Hive
  Issue Type: New Feature
  Components: Import/Export, JDBC, SQL, StorageHandler
Affects Versions: 0.10.0, 0.9.1
Reporter: Maxime LANCIAUX

 I am thinking about something like:
 SELECT jdbcload('driver','url','user','password','sql') FROM dual;
 There is already a JIRA https://issues.apache.org/jira/browse/HIVE-1555 for 
 JDBCStorageHandler



[jira] [Commented] (HIVE-4014) Hive+RCFile is not doing column pruning and reading much more data than necessary

2013-03-01 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13591058#comment-13591058
 ] 

Vinod Kumar Vavilapalli commented on HIVE-4014:
---

Okay, I cannot reproduce this on trunk, though I was consistently hitting this 
on hive-0.10. I'll try hive-0.10 again to be sure some other patch fixed this.

[~tamastarjanyi], what version are you using?

 Hive+RCFile is not doing column pruning and reading much more data than 
 necessary
 -

 Key: HIVE-4014
 URL: https://issues.apache.org/jira/browse/HIVE-4014
 Project: Hive
  Issue Type: Bug
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli

 Even with simple projection queries, the HDFS bytes-read counter shows no 
 reduction in the amount of data read.



[jira] [Created] (HIVE-4105) Hive MapJoinOperator unnecessarily deserializes values for all join-keys

2013-03-01 Thread Vinod Kumar Vavilapalli (JIRA)
Vinod Kumar Vavilapalli created HIVE-4105:
-

 Summary: Hive MapJoinOperator unnecessarily deserializes values 
for all join-keys
 Key: HIVE-4105
 URL: https://issues.apache.org/jira/browse/HIVE-4105
 Project: Hive
  Issue Type: Bug
Reporter: Vinod Kumar Vavilapalli


We can avoid this for inner joins. Hive explicitly de-serializes values up 
front, even for rows that will not emit any output. In those cases, key 
de-serialization alone is enough.
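
A minimal sketch of the idea, assuming a toy row format: probe the small
table with the key only, and decode the value only for rows that actually
match. This is not MapJoinOperator and not a proposed patch;
LazyValueJoinSketch, SerializedRow and the helper methods are hypothetical
names introduced for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration; not MapJoinOperator and not the HIVE-4105 patch. */
public class LazyValueJoinSketch {

    /** A big-table row held in serialized form (an assumed toy format). */
    public static final class SerializedRow {
        final byte[] keyBytes;
        final byte[] valueBytes;

        public SerializedRow(String key, String value) {
            this.keyBytes = key.getBytes(StandardCharsets.UTF_8);
            this.valueBytes = value.getBytes(StandardCharsets.UTF_8);
        }
    }

    /** Number of value de-serializations performed by the last innerJoin call. */
    public static int valueDeserializations = 0;

    static String deserializeKey(SerializedRow row) {
        return new String(row.keyBytes, StandardCharsets.UTF_8);
    }

    static String deserializeValue(SerializedRow row) {
        valueDeserializations++;
        return new String(row.valueBytes, StandardCharsets.UTF_8);
    }

    /** Inner join against the key set of the small table; emits "key=value" strings. */
    public static List<String> innerJoin(List<SerializedRow> bigTable, Set<String> smallTableKeys) {
        valueDeserializations = 0;
        List<String> out = new ArrayList<>();
        for (SerializedRow row : bigTable) {
            String key = deserializeKey(row);
            if (!smallTableKeys.contains(key)) {
                // Inner join: no match means no output, so the value is never decoded.
                continue;
            }
            out.add(key + "=" + deserializeValue(row));
        }
        return out;
    }
}
```

The saving is proportional to the number of big-table rows that find no match,
so a selective join benefits most and a join where nearly every row matches
benefits little.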




[jira] [Updated] (HIVE-4105) Hive MapJoinOperator unnecessarily deserializes values for all join-keys

2013-03-01 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HIVE-4105:
--

Attachment: HIVE-4105-20130301.txt

Here's a patch to avoid value de-serialization where it is not needed for 
inner joins.

In my microbenchmark, where I was map-joining a big table with a small table, 
this brought the task execution time down from 15 seconds to 10 seconds with 
about 3 million records in the big table; the second table is very small and 
the output is small too. Note that you won't see this much of an improvement 
for non-selective inner joins.

If folks are interested, I'll try productionizing the benchmark.

 Hive MapJoinOperator unnecessarily deserializes values for all join-keys
 

 Key: HIVE-4105
 URL: https://issues.apache.org/jira/browse/HIVE-4105
 Project: Hive
  Issue Type: Bug
Reporter: Vinod Kumar Vavilapalli
 Attachments: HIVE-4105-20130301.txt


 We can avoid this for inner joins. Hive explicitly de-serializes values up 
 front, even for rows that will not emit any output. In those cases, key 
 de-serialization alone is enough.



[jira] [Updated] (HIVE-4105) Hive MapJoinOperator unnecessarily deserializes values for all join-keys

2013-03-01 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HIVE-4105:
--

Attachment: HIVE-4105-20130301.1.txt

Patch upmerged to the latest trunk.

 Hive MapJoinOperator unnecessarily deserializes values for all join-keys
 

 Key: HIVE-4105
 URL: https://issues.apache.org/jira/browse/HIVE-4105
 Project: Hive
  Issue Type: Bug
Reporter: Vinod Kumar Vavilapalli
 Attachments: HIVE-4105-20130301.1.txt, HIVE-4105-20130301.txt


 We can avoid this for inner joins. Hive explicitly de-serializes values up 
 front, even for rows that will not emit any output. In those cases, key 
 de-serialization alone is enough.



[jira] [Updated] (HIVE-3987) Update PTF invocation and windowing grammar

2013-03-01 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3987:
---

Attachment: HIVE-3987.patch

Patch which takes care of second and third bullet points of this jira.

 Update PTF invocation and windowing grammar
 ---

 Key: HIVE-3987
 URL: https://issues.apache.org/jira/browse/HIVE-3987
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Harish Butani
 Attachments: HIVE-3987.patch


 Changes to the grammar to make it more standards-based:
 - support Partition & Order style along with Hive-specific Distribute/Cluster 
 and Sort in the windowing specification.
 - PTF args should come after Input details, as in Aster.
 - TBD: do we need to support named parameters?



[jira] [Commented] (HIVE-3987) Update PTF invocation and windowing grammar

2013-03-01 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13591188#comment-13591188
 ] 

Ashutosh Chauhan commented on HIVE-3987:


https://reviews.facebook.net/D9027

 Update PTF invocation and windowing grammar
 ---

 Key: HIVE-3987
 URL: https://issues.apache.org/jira/browse/HIVE-3987
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Harish Butani
 Attachments: HIVE-3987.patch


 Changes to the grammar to make it more standards-based:
 - support Partition & Order style along with Hive-specific Distribute/Cluster 
 and Sort in the windowing specification.
 - PTF args should come after Input details, as in Aster.
 - TBD: do we need to support named parameters?



[jira] [Updated] (HIVE-3987) Update PTF invocation and windowing grammar

2013-03-01 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3987:
---

Assignee: Ashutosh Chauhan

 Update PTF invocation and windowing grammar
 ---

 Key: HIVE-3987
 URL: https://issues.apache.org/jira/browse/HIVE-3987
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Harish Butani
Assignee: Ashutosh Chauhan
 Attachments: HIVE-3987.patch


 Changes to the grammar to make it more standards-based:
 - support Partition & Order style along with Hive-specific Distribute/Cluster 
 and Sort in the windowing specification.
 - PTF args should come after Input details, as in Aster.
 - TBD: do we need to support named parameters?



[jira] [Updated] (HIVE-3987) Update PTF invocation and windowing grammar

2013-03-01 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3987:
---

Status: Patch Available  (was: Open)

 Update PTF invocation and windowing grammar
 ---

 Key: HIVE-3987
 URL: https://issues.apache.org/jira/browse/HIVE-3987
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Harish Butani
Assignee: Ashutosh Chauhan
 Attachments: HIVE-3987.patch


 Changes to the grammar to make it more standards-based:
 - support Partition & Order style along with Hive-specific Distribute/Cluster 
 and Sort in the windowing specification.
 - PTF args should come after Input details, as in Aster.
 - TBD: do we need to support named parameters?



[jira] [Created] (HIVE-4106) SMB joins fail in multi-way joins

2013-03-01 Thread Vikram Dixit K (JIRA)
Vikram Dixit K created HIVE-4106:


 Summary: SMB joins fail in multi-way joins
 Key: HIVE-4106
 URL: https://issues.apache.org/jira/browse/HIVE-4106
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K


I see an array-out-of-bounds exception with multi-way SMB joins. This is 
related to changes that went in as part of HIVE-3403. The issue has been 
discussed in HIVE-3891.



Jenkins build is back to normal : Hive-0.10.0-SNAPSHOT-h0.20.1 #80

2013-03-01 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/80/



[jira] [Updated] (HIVE-4106) SMB joins fail in multi-way joins

2013-03-01 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-4106:
-

Attachment: HIVE-4106.patch

 SMB joins fail in multi-way joins
 -

 Key: HIVE-4106
 URL: https://issues.apache.org/jira/browse/HIVE-4106
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-4106.patch


 I see an array-out-of-bounds exception with multi-way SMB joins. This is 
 related to changes that went in as part of HIVE-3403. The issue has been 
 discussed in HIVE-3891.



[jira] [Updated] (HIVE-4106) SMB joins fail in multi-way joins

2013-03-01 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-4106:
-

Status: Patch Available  (was: Open)

 SMB joins fail in multi-way joins
 -

 Key: HIVE-4106
 URL: https://issues.apache.org/jira/browse/HIVE-4106
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-4106.patch


 I see an array-out-of-bounds exception with multi-way SMB joins. This is 
 related to changes that went in as part of HIVE-3403. The issue has been 
 discussed in HIVE-3891.



[jira] [Updated] (HIVE-3490) Implement * or a.* for arguments to UDFs

2013-03-01 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3490:


Status: Patch Available  (was: Open)

 Implement * or a.* for arguments to UDFs
 

 Key: HIVE-3490
 URL: https://issues.apache.org/jira/browse/HIVE-3490
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, UDF
Reporter: Adam Kramer
Assignee: Navis
 Attachments: HIVE-3490.D8889.1.patch, HIVE-3490.D8889.2.patch


 For a random UDF, we should be able to use * or a.* to refer to all of the 
 columns in their natural order. This is not currently implemented.
 I'm reporting this as a bug because it is a manner in which Hive is 
 inconsistent with the SQL spec, and because Hive claims to implement *.
 hive> select all_non_null(a.*) from table a where a.ds='2012-09-01';
 FAILED: ParseException line 1:25 mismatched input '*' expecting Identifier 
 near '.' in expression specification



[jira] [Updated] (HIVE-3996) Correctly enforce the memory limit on the multi-table map-join

2013-03-01 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-3996:
-

Attachment: HIVE-3996_4.patch

 Correctly enforce the memory limit on the multi-table map-join
 --

 Key: HIVE-3996
 URL: https://issues.apache.org/jira/browse/HIVE-3996
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-3996_2.patch, HIVE-3996_3.patch, HIVE-3996_4.patch, 
 HIVE-3996.patch


 Currently with HIVE-3784, the joins are converted to map-joins based on 
 checks of the table size against the config variable: 
 hive.auto.convert.join.noconditionaltask.size. 
 However, the current implementation will also merge multiple mapjoin 
 operators into a single task regardless of whether the sum of the table sizes 
 will exceed the configured value.
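
The missing check can be sketched as follows. This is a hypothetical
illustration, not Hive's planner code and not the attached patches: the class
MapJoinMergeSketch and the method canMergeIntoSingleTask are assumed names,
and table sizes are taken to be known in bytes. The point is that the merge
decision looks at the sum of the small-table sizes, not at each table
individually.

```java
import java.util.List;

/** Hypothetical illustration; not Hive's planner code and not the HIVE-3996 patch. */
public class MapJoinMergeSketch {

    /**
     * Returns true only if all small tables together fit within the threshold
     * (the role played by hive.auto.convert.join.noconditionaltask.size).
     */
    public static boolean canMergeIntoSingleTask(List<Long> smallTableSizes, long thresholdBytes) {
        long sum = 0L;
        for (long size : smallTableSizes) {
            sum += size;
            if (sum > thresholdBytes) {
                // Each table may be under the limit on its own; the total is what matters.
                return false;
            }
        }
        return true;
    }
}
```

For example, with a threshold of 10 bytes, tables of 6 and 5 bytes each pass a
per-table check but fail this combined check, so they would not be merged into
one task.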



[jira] [Updated] (HIVE-3490) Implement * or a.* for arguments to UDFs

2013-03-01 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-3490:
--

Attachment: HIVE-3490.D8889.2.patch

navis updated the revision HIVE-3490 [jira] Implement * or a.* for arguments 
to UDFs.

  Addressed comments

Reviewers: JIRA

REVISION DETAIL
  https://reviews.facebook.net/D8889

CHANGE SINCE LAST DIFF
  https://reviews.facebook.net/D8889?vs=28635&id=28989#toc

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeColumnListDesc.java
  ql/src/test/queries/clientpositive/allcolref_in_udf.q
  ql/src/test/results/clientpositive/allcolref_in_udf.q.out

To: JIRA, navis
Cc: njain


 Implement * or a.* for arguments to UDFs
 

 Key: HIVE-3490
 URL: https://issues.apache.org/jira/browse/HIVE-3490
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, UDF
Reporter: Adam Kramer
Assignee: Navis
 Attachments: HIVE-3490.D8889.1.patch, HIVE-3490.D8889.2.patch


 For a random UDF, we should be able to use * or a.* to refer to all of the 
 columns in their natural order. This is not currently implemented.
 I'm reporting this as a bug because it is a manner in which Hive is 
 inconsistent with the SQL spec, and because Hive claims to implement *.
 hive> select all_non_null(a.*) from table a where a.ds='2012-09-01';
 FAILED: ParseException line 1:25 mismatched input '*' expecting Identifier 
 near '.' in expression specification



[jira] [Commented] (HIVE-3490) Implement * or a.* for arguments to UDFs

2013-03-01 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13591227#comment-13591227
 ] 

Phabricator commented on HIVE-3490:
---

navis has commented on the revision HIVE-3490 [jira] Implement * or a.* for 
arguments to UDFs.

INLINE COMMENTS
  ql/src/test/queries/clientpositive/allcolref_in_udf.q:4 ok.
  ql/src/test/queries/clientpositive/allcolref_in_udf.q:9 ok.
  ql/src/test/queries/clientpositive/allcolref_in_udf.q:8 It's decided by the 
row schema of the previous operator. For joins, the order runs from the 
left-most alias to the right. Added comments.

REVISION DETAIL
  https://reviews.facebook.net/D8889

To: JIRA, navis
Cc: njain


 Implement * or a.* for arguments to UDFs
 

 Key: HIVE-3490
 URL: https://issues.apache.org/jira/browse/HIVE-3490
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, UDF
Reporter: Adam Kramer
Assignee: Navis
 Attachments: HIVE-3490.D8889.1.patch, HIVE-3490.D8889.2.patch


 For a random UDF, we should be able to use * or a.* to refer to all of the 
 columns in their natural order. This is not currently implemented.
 I'm reporting this as a bug because it is a manner in which Hive is 
 inconsistent with the SQL spec, and because Hive claims to implement *.
 hive> select all_non_null(a.*) from table a where a.ds='2012-09-01';
 FAILED: ParseException line 1:25 mismatched input '*' expecting Identifier 
 near '.' in expression specification



[jira] [Updated] (HIVE-3952) merge map-job followed by map-reduce job

2013-03-01 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HIVE-3952:
--

Attachment: HIVE-3952-20130301.txt

Ran my new test again, passes. This patch can be applied on top of HIVE-4106.

 merge map-job followed by map-reduce job
 

 Key: HIVE-3952
 URL: https://issues.apache.org/jira/browse/HIVE-3952
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Vinod Kumar Vavilapalli
 Attachments: HIVE-3952-20130226.txt, HIVE-3952-20130227.1.txt, 
 HIVE-3952-20130301.txt


 Consider a query like:
 select count(*) FROM
 ( select idOne, idTwo, value FROM
   bigTable
   JOIN
   smallTableOne on (bigTable.idOne = smallTableOne.idOne)
 ) firstjoin
 JOIN
 smallTableTwo on (firstjoin.idTwo = smallTableTwo.idTwo);
 where smallTableOne and smallTableTwo are smaller than 
 hive.auto.convert.join.noconditionaltask.size and
 hive.auto.convert.join.noconditionaltask is set to true.
 The joins are collapsed into mapjoins, which leads to a map-only job
 (for the map-joins) followed by a map-reduce job (for the group by).
 Ideally, the map-only job should be merged with the following map-reduce job.



[jira] [Updated] (HIVE-4071) Map-join outer join produces incorrect results.

2013-03-01 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-4071:
-

Attachment: (was: HIVE-4071_3.patch)

 Map-join outer join produces incorrect results.
 ---

 Key: HIVE-4071
 URL: https://issues.apache.org/jira/browse/HIVE-4071
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-4071_2.patch, HIVE-4071_3.patch, HIVE-4071.patch


 For example, if one sets the size of noConditionalTask.size to 10 with 
 corresponding auto join configurations set to true in auto_join28.q instead 
 of the current smalltable.filesize configuration, we will observe different 
 results if a select query is run. (The test only has explain statements at 
 present).



[jira] [Updated] (HIVE-4071) Map-join outer join produces incorrect results.

2013-03-01 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-4071:
-

Attachment: HIVE-4071_3.patch

 Map-join outer join produces incorrect results.
 ---

 Key: HIVE-4071
 URL: https://issues.apache.org/jira/browse/HIVE-4071
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-4071_2.patch, HIVE-4071_3.patch, HIVE-4071.patch


 For example, if one sets the size of noConditionalTask.size to 10 with 
 corresponding auto join configurations set to true in auto_join28.q instead 
 of the current smalltable.filesize configuration, we will observe different 
 results if a select query is run. (The test only has explain statements at 
 present).



[jira] [Commented] (HIVE-4093) Remove sprintf from PTFTranslator and use String.format()

2013-03-01 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13591243#comment-13591243
 ] 

Ashutosh Chauhan commented on HIVE-4093:


Removing sprintf is useful, but I am not sure about the test case you added.

 Remove sprintf from PTFTranslator and use String.format()
 -

 Key: HIVE-4093
 URL: https://issues.apache.org/jira/browse/HIVE-4093
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Brock Noland
Assignee: Brock Noland
Priority: Minor
 Attachments: HIVE-4093-0.patch, HIVE-4093-1.patch






[jira] [Commented] (HIVE-4093) Remove sprintf from PTFTranslator and use String.format()

2013-03-01 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13591249#comment-13591249
 ] 

Brock Noland commented on HIVE-4093:


Hi,

I added that test case to exercise this check:

https://github.com/apache/hive/blob/ptf-windowing/ql/src/java/org/apache/hadoop/hive/ql/parse/PTFTranslator.java#L560

Is that a limitation we should remove?

 Remove sprintf from PTFTranslator and use String.format()
 -

 Key: HIVE-4093
 URL: https://issues.apache.org/jira/browse/HIVE-4093
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Brock Noland
Assignee: Brock Noland
Priority: Minor
 Attachments: HIVE-4093-0.patch, HIVE-4093-1.patch






[jira] [Commented] (HIVE-4093) Remove sprintf from PTFTranslator and use String.format()

2013-03-01 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13591254#comment-13591254
 ] 

Ashutosh Chauhan commented on HIVE-4093:


Yes, we need to address this limitation. For this JIRA, I would suggest we 
address only the sprintf issue. Don't add this extra negative test case; file 
a new JIRA for the limitation.

 Remove sprintf from PTFTranslator and use String.format()
 -

 Key: HIVE-4093
 URL: https://issues.apache.org/jira/browse/HIVE-4093
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Brock Noland
Assignee: Brock Noland
Priority: Minor
 Attachments: HIVE-4093-0.patch, HIVE-4093-1.patch






[jira] [Commented] (HIVE-4093) Remove sprintf from PTFTranslator and use String.format()

2013-03-01 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13591260#comment-13591260
 ] 

Brock Noland commented on HIVE-4093:


OK sounds good, will do!

 Remove sprintf from PTFTranslator and use String.format()
 -

 Key: HIVE-4093
 URL: https://issues.apache.org/jira/browse/HIVE-4093
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Brock Noland
Assignee: Brock Noland
Priority: Minor
 Attachments: HIVE-4093-0.patch, HIVE-4093-1.patch






[jira] [Updated] (HIVE-4082) Break up ptf tests in PTF, Windowing and Lead/Lag tests

2013-03-01 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4082:
--

Attachment: HIVE-4082.D9033.1.patch

pkalmegh requested code review of HIVE-4082 [jira] Break up ptf tests in PTF, 
Windowing and Lead/Lag tests.

Reviewers: JIRA

HIVE-4082: Refactor tests

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D9033

AFFECTED FILES
  data/files/flights_tiny.txt
  data/files/part.rc
  data/files/part.seq
  ql/src/test/queries/clientpositive/leadlag.q
  ql/src/test/queries/clientpositive/ptf.q
  ql/src/test/queries/clientpositive/ptf_general_queries.q
  ql/src/test/queries/clientpositive/ptf_npath.q
  ql/src/test/queries/clientpositive/ptf_window_boundaries.q
  ql/src/test/queries/clientpositive/windowing.q
  ql/src/test/results/clientpositive/leadlag.q.out
  ql/src/test/results/clientpositive/ptf.q.out
  ql/src/test/results/clientpositive/ptf_general_queries.q.out
  ql/src/test/results/clientpositive/ptf_npath.q.out
  ql/src/test/results/clientpositive/ptf_window_boundaries.q.out
  ql/src/test/results/clientpositive/windowing.q.out

MANAGE HERALD RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/21915/

To: JIRA, pkalmegh


 Break up ptf tests in PTF, Windowing and Lead/Lag tests
 ---

 Key: HIVE-4082
 URL: https://issues.apache.org/jira/browse/HIVE-4082
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Harish Butani
Assignee: Prajakta Kalmegh
 Attachments: HIVE-4082.D9033.1.patch






[jira] [Created] (HIVE-4107) Update Hive 0.10.0 RELEASE_NOTES.txt

2013-03-01 Thread Lefty Leverenz (JIRA)
Lefty Leverenz created HIVE-4107:


 Summary: Update Hive 0.10.0 RELEASE_NOTES.txt
 Key: HIVE-4107
 URL: https://issues.apache.org/jira/browse/HIVE-4107
 Project: Hive
  Issue Type: Bug
  Components: Documentation
Affects Versions: 0.10.0
Reporter: Lefty Leverenz


Hive release 0.10.0 includes a RELEASE_NOTES.txt file left over from release 
0.8.1 (branch-0.8-r2).

It needs to be updated to match the JIRA change log here:  
[https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12320745&styleName=Text&projectId=12310843].

Thanks to Eric Chu for drawing attention to this problem on 
u...@hive.apache.org.



[jira] [Updated] (HIVE-3908) create view statement's outputs contains the view and a temporary dir.

2013-03-01 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar updated HIVE-3908:
--

Status: Patch Available  (was: Open)

Patch attached

 create view statement's outputs contains the view and a temporary dir.
 --

 Key: HIVE-3908
 URL: https://issues.apache.org/jira/browse/HIVE-3908
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Prasad Mujumdar
 Attachments: HIVE-3908-1.patch


 It should only contain the view



[jira] [Assigned] (HIVE-3908) create view statement's outputs contains the view and a temporary dir.

2013-03-01 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar reassigned HIVE-3908:
-

Assignee: Prasad Mujumdar

 create view statement's outputs contains the view and a temporary dir.
 --

 Key: HIVE-3908
 URL: https://issues.apache.org/jira/browse/HIVE-3908
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Prasad Mujumdar
 Attachments: HIVE-3908-1.patch


 It should only contain the view



[jira] [Updated] (HIVE-3908) create view statement's outputs contains the view and a temporary dir.

2013-03-01 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar updated HIVE-3908:
--

Attachment: HIVE-3908-1.patch

 create view statement's outputs contains the view and a temporary dir.
 --

 Key: HIVE-3908
 URL: https://issues.apache.org/jira/browse/HIVE-3908
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Prasad Mujumdar
 Attachments: HIVE-3908-1.patch


 It should only contain the view



[jira] [Commented] (HIVE-3908) create view statement's outputs contains the view and a temporary dir.

2013-03-01 Thread Prasad Mujumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13591346#comment-13591346
 ] 

Prasad Mujumdar commented on HIVE-3908:
---

Review request on https://reviews.facebook.net/D9039

 create view statement's outputs contains the view and a temporary dir.
 --

 Key: HIVE-3908
 URL: https://issues.apache.org/jira/browse/HIVE-3908
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Prasad Mujumdar
 Attachments: HIVE-3908-1.patch


 It should only contain the view



[jira] [Updated] (HIVE-3908) create view statement's outputs contains the view and a temporary dir.

2013-03-01 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar updated HIVE-3908:
--

  Component/s: Query Processor
Affects Version/s: 0.10.0

 create view statement's outputs contains the view and a temporary dir.
 --

 Key: HIVE-3908
 URL: https://issues.apache.org/jira/browse/HIVE-3908
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Namit Jain
Assignee: Prasad Mujumdar
 Attachments: HIVE-3908-1.patch


 It should only contain the view
