Re: Last time request for cwiki update privileges

2013-08-21 Thread Nitin Pawar
Sanjay,

There are lots of emails on the Hive forum, and individual mails may have been
lost.

Can you try reaching one of the Hive PMC members (or Lefty from Hortonworks)?

Thanks,
Nitin


On Wed, Aug 21, 2013 at 4:30 AM, Sanjay Subramanian 
sanjay.subraman...@wizecommerce.com wrote:

 Thanks Ashutosh

 From: Ashutosh Chauhan hashut...@apache.org
 Reply-To: u...@hive.apache.org
 Date: Tuesday, August 20, 2013 3:13 PM
 To: u...@hive.apache.org
 Cc: dev@hive.apache.org
 Subject: Re: Last time request for cwiki update privileges

 Hi Sanjay,

 Really sorry about the delay. You have been added now. Feel free to make
 changes to make Hive even better!

 Thanks,
 Ashutosh


 On Tue, Aug 20, 2013 at 2:39 PM, Sanjay Subramanian 
 sanjay.subraman...@wizecommerce.com wrote:
 Hey guys

 I can think of only two reasons why my request has not yet been accepted:

 1. The admins don't want to give me access.

 2. The admins have not seen my mail yet.

 This is the fourth and the LAST time I am requesting permission to edit
 wiki docs… Nobody likes being ignored, and that includes me.

 Meanwhile, to show my gratitude to the Hive community, I shall continue
 to answer questions. There will be no change in that behavior.

 Regards

 sanjay




 From: Sanjay Subramanian sanjay.subraman...@wizecommerce.com
 Date: Wednesday, August 14, 2013 3:52 PM
 To: u...@hive.apache.org
 Cc: dev@hive.apache.org
 Subject: Re: Review Request (wikidoc): LZO Compression in Hive

 Once again, I am down on my knees, humbly calling upon the Hive Jedi
 Masters to please provide this Padawan with cwiki update privileges.

 May the Force be with you

 Thanks

 sanjay

 From: Sanjay Subramanian sanjay.subraman...@wizecommerce.com
 Reply-To: u...@hive.apache.org
 Date: Wednesday, July 31, 2013 9:38 AM
 To: u...@hive.apache.org
 Cc: dev@hive.apache.org
 Subject: Re: Review Request (wikidoc): LZO Compression in Hive

 Hi guys

 Any chance I could get cwiki update privileges today ?

 Thanks

 sanjay

 From: Sanjay Subramanian sanjay.subraman...@wizecommerce.com
 Date: Tuesday, July 30, 2013 4:26 PM
 To: u...@hive.apache.org
 Cc: dev@hive.apache.org
 Subject: Review Request (wikidoc): LZO Compression in Hive

 Hi

 I met with Lefty this afternoon, and she was kind enough to spend time adding
 my documentation to the site, since I still don't have editing privileges :-)

 Please review the new wikidoc about LZO compression in the Hive language
 manual.  If anything is unclear or needs more information, you can email
 suggestions to this list or edit the wiki yourself (if you have editing
 privileges).  Here are the links:

   1.  Language Manual
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual (new
 bullet under File Formats)
   2.  LZO Compression
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LZO
   3.  CREATE TABLE
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTable
 (near the end of the section, pasted in here:)
 Use STORED AS TEXTFILE if the data needs to be stored as plain text files.
 Use STORED AS SEQUENCEFILE if the data needs to be compressed. Please read
 more about CompressedStorage
 https://cwiki.apache.org/confluence/display/Hive/CompressedStorage if
 you are planning to keep data compressed in your Hive tables. Use
 INPUTFORMAT and OUTPUTFORMAT to specify the name of a corresponding
 InputFormat and OutputFormat class as a string literal, e.g.,
 'org.apache.hadoop.hive.contrib.fileformat.base64.Base64TextInputFormat'.
 For LZO compression, the values to use are 'INPUTFORMAT
 com.hadoop.mapred.DeprecatedLzoTextInputFormat OUTPUTFORMAT
 org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' (see LZO
 Compression
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LZO).
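 The quoted CREATE TABLE guidance can be illustrated with a short DDL sketch;
 the table and column names below are hypothetical, and the statement assumes
 the LZO codec setup described in the LZO Compression wikidoc:

```sql
-- Hypothetical table stored as LZO-compressed text
CREATE TABLE lzo_example (id INT, msg STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS
  INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';
```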

 My cwiki id is
 https://cwiki.apache.org/confluence/display/~sanjaysubraman...@yahoo.com
 It would be great if I could get edit privileges.

 Thanks
 sanjay


Re: [Discuss] project chop up

2013-08-21 Thread amareshwari sriramdasu
Sounds great! Looking forward to it!


On Tue, Aug 20, 2013 at 7:58 PM, Edward Capriolo edlinuxg...@gmail.com wrote:

 Just an update. This is going very well:

 [INFO] Nothing to compile - all classes are up to date
 [INFO] ------------------------------------------------------------------------
 [INFO] Reactor Summary:
 [INFO]
 [INFO] Apache Hive ....................................... SUCCESS [0.002s]
 [INFO] hive-shims-x ...................................... SUCCESS [1.210s]
 [INFO] hive-shims-20 ..................................... SUCCESS [0.125s]
 [INFO] hive-common ....................................... SUCCESS [0.082s]
 [INFO] hive-serde ........................................ SUCCESS [2.521s]
 [INFO] hive-metastore .................................... SUCCESS [10.818s]
 [INFO] hive-exec ......................................... SUCCESS [4.521s]
 [INFO] hive-avro ......................................... SUCCESS [1.582s]
 [INFO] hive-zookeeper .................................... SUCCESS [0.519s]
 [INFO] ------------------------------------------------------------------------
 [INFO] BUILD SUCCESS
 [INFO] ------------------------------------------------------------------------
 [INFO] Total time: 21.613s
 [INFO] Finished at: Tue Aug 20 10:23:34 EDT 2013
 [INFO] Final Memory: 39M/408M


 Though I took some shortcuts and disabled some tests, we can build Hive
 very fast, including incremental builds. We are also using Maven plugins to
 compile the ANTLR, Thrift, Protobuf, and DataNucleus sources, and we are
 building those every time.


 On Fri, Aug 16, 2013 at 11:16 PM, Xuefu Zhang xzh...@cloudera.com wrote:

  Thanks, Edward.
 
  I'm a big +1 on mavenizing Hive. Hive long ago reached a point where it's
  hard to manage its build using Ant. I'd like to help with this too.
 
  Thanks,
  Xuefu
 
 
  On Fri, Aug 16, 2013 at 7:31 PM, Edward Capriolo edlinuxg...@gmail.com
  wrote:
 
   For those interested in pitching in.
   https://github.com/edwardcapriolo/hive
  
  
  
   On Fri, Aug 16, 2013 at 11:58 AM, Edward Capriolo 
 edlinuxg...@gmail.com
   wrote:
  
Summary from hive-irc channel. Minor edits for spell check/grammar.
   
The last 10 lines are a summary of the key points.
   
 [10:59:17] ecapriolo noland: et al. Do you want to talk about hive in maven?
[11:10:04] noland ecapriolo: yeah that sounds good to me!
[11:10:22] noland I saw you created the jira but haven't had time
 to
   look
[11:10:32] ecapriolo So I found a few things
 [11:10:49] ecapriolo In common there are one or two tests that actually fork a process :)
[11:10:56] ecapriolo and use build.test.resources
 [11:11:12] ecapriolo Some serde code uses some methods from ql in testing
[11:11:27] ecapriolo and shims really needs a separate hadoop test
  shim
[11:11:32] ecapriolo But that is all simple stuff
[11:11:47] ecapriolo The biggest problem is I do not know how to
  solve
shims with maven
[11:11:50] ecapriolo do you have any ideas
[11:11:52] ecapriolo ?
[11:13:00] noland That one is going to be a challenge. It might be
  that
in that section we have to drop down to ant
[11:14:44] noland Is it a requirement that we build both the .20
 and
   .23
shims for a package as we do today?
[11:16:46] ecapriolo I was thinking we can do it like a JDBC driver
 [11:16:59] ecapriolo We separate out the interface of shims
[11:17:22] ecapriolo And then at runtime we drop in a driver
   implementing
[11:17:36] ecapriolo That or we could use maven's profile system
[11:18:09] ecapriolo It seems that everything else can actually
 link
against hadoop-0.20.2 as a provided dependency
 [11:18:37] noland Yeah either would work. The driver method would probably require us to use ant to build both the drivers?
[11:18:44] noland I am a fan of mvn profiles
 [11:19:05] ecapriolo I was thinking we kinda separate the shim out into its own project, not a module
 [11:19:10] ecapriolo to achieve that jdbc thing
[11:19:27] ecapriolo But I do not have a solution yet, I was
 looking
  to
farm that out to someone smart...like you :)
[11:19:33] noland :)
[11:19:47] ecapriolo All I know is that we need a test shim because
HadoopShim requires hadoop-test jars
[11:20:10] ecapriolo then the Mini stuff is only used in qtest
 anyway
[11:20:48] ecapriolo Is this something you want to help with? I was
thinking of spinning up a github
[11:20:50] noland I think that the separate projects would work and
perhaps nicely.
[11:21:01] noland Yeah I'd be interested in helping!
[11:21:17] noland But I am going on vacation starting next week for
about 10 days
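The Maven profile approach discussed in this transcript might be sketched roughly like this in a shims pom.xml; the profile ids and version values here are invented for illustration and are not taken from an actual Hive build:

```xml
<!-- Hypothetical sketch: pick the Hadoop version the shims compile against,
     e.g.  mvn package -P hadoop-0.23 -->
<profiles>
  <profile>
    <id>hadoop-0.20</id>
    <activation><activeByDefault>true</activeByDefault></activation>
    <properties><hadoop.version>0.20.2</hadoop.version></properties>
  </profile>
  <profile>
    <id>hadoop-0.23</id>
    <properties><hadoop.version>0.23.3</hadoop.version></properties>
  </profile>
</profiles>
```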

[jira] [Updated] (HIVE-664) optimize UDF split

2013-08-21 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-664:


Attachment: HIVE-664.1.patch.txt

 optimize UDF split
 --

 Key: HIVE-664
 URL: https://issues.apache.org/jira/browse/HIVE-664
 Project: Hive
  Issue Type: Bug
  Components: UDF
Reporter: Namit Jain
Assignee: Teddy Choi
  Labels: optimization
 Attachments: HIVE-664.1.patch.txt


 Min Zhou added a comment - 21/Jul/09 07:34 AM
 It's very useful for us. Some comments:
 1. Can you implement it directly with Text? Avoiding string decoding and encoding would be faster. Of course that trick may lead to another problem, as String.split uses a regular expression for splitting.
 2. getDisplayString() always returns a string in lowercase.
 Namit Jain added a comment - 21/Jul/09 09:22 AM
 Committed. Thanks Emil
 Emil Ibrishimov added a comment - 21/Jul/09 10:48 AM
 There are some easy (compromise) ways to optimize split:
 1. Check if the regex argument actually contains some regex-specific characters and, if it doesn't, do a straightforward split without converting to strings.
 2. Assume some default value for the second argument (for example, split(str) equivalent to split(str, ' ')) and optimize for this value.
 3. Have two separate split functions - one that does regex and one that splits around plain text.
 I think that 1 is a good choice and can be done rather quickly.
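 Emil's option 1 (skip the regex engine when the delimiter contains no regex
 metacharacters) could be sketched along these lines; this is a standalone
 illustration under assumed semantics, not the actual HIVE-664 patch:

```java
import java.util.ArrayList;
import java.util.List;

public class SimpleSplit {
    // Characters that carry special meaning in java.util.regex patterns.
    private static final String REGEX_META = ".$|()[]{}^?*+\\";

    // True if every character of the delimiter is literal, so the regex
    // engine (and string decoding) can be avoided entirely.
    static boolean isPlainDelimiter(String delim) {
        for (int i = 0; i < delim.length(); i++) {
            if (REGEX_META.indexOf(delim.charAt(i)) >= 0) {
                return false;
            }
        }
        return true;
    }

    // Straightforward split on a literal delimiter, mimicking the
    // trailing-empty-string removal of String.split(regex).
    static List<String> plainSplit(String s, String delim) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        int idx;
        while ((idx = s.indexOf(delim, start)) >= 0) {
            parts.add(s.substring(start, idx));
            start = idx + delim.length();
        }
        parts.add(s.substring(start));
        // String.split drops trailing empty strings; do the same.
        while (!parts.isEmpty() && parts.get(parts.size() - 1).isEmpty()) {
            parts.remove(parts.size() - 1);
        }
        return parts;
    }
}
```

 In a Hive UDF this check would be done once, on the constant delimiter
 argument, rather than per row.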

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-664) optimize UDF split

2013-08-21 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-664:


Attachment: HIVE-664.2.patch.txt

I implemented 1 and 3. Additionally, it caches a compiled Pattern object for
reuse.
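The Pattern caching mentioned here can be sketched as below; this is a hypothetical standalone illustration of the technique, not code from the attached patch:

```java
import java.util.regex.Pattern;

public class CachedSplitter {
    // Cache the most recently compiled pattern; a split UDF is usually
    // called once per row with the same constant delimiter, so compiling
    // the regex on every call would be wasted work.
    private String lastRegex;
    private Pattern lastPattern;

    String[] split(String text, String regex) {
        if (!regex.equals(lastRegex)) {
            lastPattern = Pattern.compile(regex);
            lastRegex = regex;
        }
        return lastPattern.split(text);
    }
}
```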

 optimize UDF split
 --

 Key: HIVE-664
 URL: https://issues.apache.org/jira/browse/HIVE-664
 Project: Hive
  Issue Type: Bug
  Components: UDF
Reporter: Namit Jain
Assignee: Teddy Choi
  Labels: optimization
 Attachments: HIVE-664.1.patch.txt, HIVE-664.2.patch.txt





[jira] [Commented] (HIVE-664) optimize UDF split

2013-08-21 Thread Teddy Choi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13745844#comment-13745844
 ] 

Teddy Choi commented on HIVE-664:
-

Review request on https://reviews.apache.org/r/13702/

 optimize UDF split
 --

 Key: HIVE-664
 URL: https://issues.apache.org/jira/browse/HIVE-664
 Project: Hive
  Issue Type: Bug
  Components: UDF
Reporter: Namit Jain
Assignee: Teddy Choi
  Labels: optimization
 Attachments: HIVE-664.1.patch.txt, HIVE-664.2.patch.txt





[jira] [Updated] (HIVE-664) optimize UDF split

2013-08-21 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-664:


Status: Patch Available  (was: In Progress)

 optimize UDF split
 --

 Key: HIVE-664
 URL: https://issues.apache.org/jira/browse/HIVE-664
 Project: Hive
  Issue Type: Bug
  Components: UDF
Reporter: Namit Jain
Assignee: Teddy Choi
  Labels: optimization
 Attachments: HIVE-664.1.patch.txt, HIVE-664.2.patch.txt





[jira] [Updated] (HIVE-5129) Multiple table insert fails on count(distinct)

2013-08-21 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-5129:
-

Status: Patch Available  (was: Open)

 Multiple table insert fails on count(distinct)
 --

 Key: HIVE-5129
 URL: https://issues.apache.org/jira/browse/HIVE-5129
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: aggrTestMultiInsertData1.txt, 
 aggrTestMultiInsertData.txt, aggrTestMultiInsert.q, HIVE-5129.1.patch.txt


 Hive fails with a class cast exception on queries of the form:
 {noformat}
 from studenttab10k
 insert overwrite table multi_insert_2_1
 select name, avg(age) as avgage
 group by name
 insert overwrite table multi_insert_2_2
 select name, age, sum(gpa) as sumgpa
 group by name, age
 insert overwrite table multi_insert_2_3
 select name, count(distinct age) as distage
 group by name;
 {noformat}



[jira] [Commented] (HIVE-5129) Multiple table insert fails on count(distinct)

2013-08-21 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13745873#comment-13745873
 ] 

Navis commented on HIVE-5129:
-

mGBY-1RS optimization is really confusing with distinct functions. IMHO, it 
should not be allowed to mix distinct and non-distinct cases into one group. 
For example, 
{noformat}
from src tablesample (10 ROWS)
insert overwrite table src_a select key, count(distinct key) + count(distinct 
value) group by key;
{noformat}
makes 20 rows for src_a, and,
{noformat}
from src tablesample (10 ROWS)
insert overwrite table src_b select key, count(value) group by key, value;
{noformat}
makes 10 rows for src_b, but,
{noformat}
from src tablesample (10 ROWS)
insert overwrite table src_a select key, count(distinct key) + count(distinct 
value) group by key
insert overwrite table src_b select key, count(value) group by key, value;
{noformat}
makes 20 rows for src_a and src_b, and lastly,
{noformat}
from src tablesample (10 ROWS)
insert overwrite table src_b select key, count(value) group by key, value
insert overwrite table src_a select key, count(distinct key) + count(distinct 
value) group by key;
{noformat}
is not working as described in this issue. After applying your patch, it 
succeeded with 10 rows for both.


 Multiple table insert fails on count(distinct)
 --

 Key: HIVE-5129
 URL: https://issues.apache.org/jira/browse/HIVE-5129
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: aggrTestMultiInsertData1.txt, 
 aggrTestMultiInsertData.txt, aggrTestMultiInsert.q, HIVE-5129.1.patch.txt


 Hive fails with a class cast exception on queries of the form:
 {noformat}
 from studenttab10k
 insert overwrite table multi_insert_2_1
 select name, avg(age) as avgage
 group by name
 insert overwrite table multi_insert_2_2
 select name, age, sum(gpa) as sumgpa
 group by name, age
 insert overwrite table multi_insert_2_3
 select name, count(distinct age) as distage
 group by name;
 {noformat}



[jira] [Commented] (HIVE-4645) Stat information like numFiles and totalSize is not correct when sub-directory is exists

2013-08-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13746001#comment-13746001
 ] 

Hudson commented on HIVE-4645:
--

ABORTED: Integrated in Hive-trunk-hadoop2 #373 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/373/])
HIVE-4645: Stat information like numFiles and totalSize is not correct when 
sub-directory is exists (Navis via Brock Noland) (brock: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1515865)
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java
* /hive/trunk/ql/src/test/queries/clientpositive/list_bucket_dml_7.q
* /hive/trunk/ql/src/test/queries/clientpositive/list_bucket_dml_8.q
* 
/hive/trunk/ql/src/test/results/clientpositive/infer_bucket_sort_list_bucket.q.out
* /hive/trunk/ql/src/test/results/clientpositive/list_bucket_dml_7.q.out
* /hive/trunk/ql/src/test/results/clientpositive/list_bucket_dml_8.q.out
* /hive/trunk/ql/src/test/results/clientpositive/stats_noscan_2.q.out


 Stat information like numFiles and totalSize is not correct when 
 sub-directory is exists
 

 Key: HIVE-4645
 URL: https://issues.apache.org/jira/browse/HIVE-4645
 Project: Hive
  Issue Type: Test
  Components: Statistics
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Fix For: 0.12.0

 Attachments: HIVE-4645.D11037.1.patch, HIVE-4645.D11037.2.patch, 
 HIVE-4645.D11037.3.patch, HIVE-4645.D11037.4.patch


 The test infer_bucket_sort_list_bucket.q returns 4096 as totalSize, but that
 is the size of the parent directory, not the sum of the file sizes.



[jira] [Commented] (HIVE-5121) Remove obsolete code on SemanticAnalyzer#genJoinTree

2013-08-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13746003#comment-13746003
 ] 

Hudson commented on HIVE-5121:
--

ABORTED: Integrated in Hive-trunk-hadoop2 #373 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/373/])
HIVE-5121 : Remove obsolete code on SemanticAnalyzer#genJoinTree (Azrael Park 
via Ashutosh Chauhan) (hashutosh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1515838)
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java


 Remove obsolete code on SemanticAnalyzer#genJoinTree
 

 Key: HIVE-5121
 URL: https://issues.apache.org/jira/browse/HIVE-5121
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.11.0
 Environment: ubuntu 12.04
Reporter: Azrael
Assignee: Azrael
Priority: Trivial
 Fix For: 0.12.0

 Attachments: HIVE-5121.D12405.1.patch


 Remove obsolete code on SemanticAnalyzer#genJoinTree.
 {noformat}
children[1] = alias;
joinTree.setBaseSrc(children);
 -  aliasToOpInfo.get(alias);
joinTree.setId(qb.getId());
joinTree.getAliasToOpInfo().put(
 {noformat}



[jira] [Commented] (HIVE-4299) exported metadata by HIVE-3068 cannot be imported because of wrong file name

2013-08-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13746000#comment-13746000
 ] 

Hudson commented on HIVE-4299:
--

ABORTED: Integrated in Hive-trunk-hadoop2 #373 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/373/])
HIVE-4299 : exported metadata by HIVE-3068 cannot be imported because of wrong 
file name (Sho Shimauchi & Edward Capriolo via Ashutosh Chauhan) (hashutosh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1515839)
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/MetaDataExportListener.java


 exported metadata by HIVE-3068 cannot be imported because of wrong file name
 

 Key: HIVE-4299
 URL: https://issues.apache.org/jira/browse/HIVE-4299
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Sho Shimauchi
Assignee: Edward Capriolo
 Fix For: 0.12.0

 Attachments: HIVE-4299.1.patch.txt, HIVE-4299.4.patch.txt, 
 HIVE-4299.5.patch.txt, HIVE-4299.patch


 h2. Symptom
 When a table is dropped with DROP TABLE, metadata for the table is exported
 so that the dropped table can be imported again.
 However, the exported metadata file is named 'table name.metadata'.
 Since ImportSemanticAnalyzer accepts only '_metadata' as the metadata
 filename, users have to rename the metadata file to import the table.
 h2. How to reproduce
 Set the following setting to hive-site.xml:
 {code}
  <property>
    <name>hive.metastore.pre.event.listeners</name>
    <value>org.apache.hadoop.hive.ql.parse.MetaDataExportListener</value>
  </property>
 {code}
 Then run the following queries:
 {code}
  CREATE TABLE test_table (id INT, name STRING);
  DROP TABLE test_table;
  IMPORT TABLE test_table_imported FROM '/path/to/metadata/file';
 FAILED: SemanticException [Error 10027]: Invalid path
 {code}



[jira] [Commented] (HIVE-5117) orc_dictionary_threshold is not deterministic

2013-08-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13745999#comment-13745999
 ] 

Hudson commented on HIVE-5117:
--

ABORTED: Integrated in Hive-trunk-hadoop2 #373 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/373/])
HIVE-5117: orc_dictionary_threshold is not deterministic (Navis via Ashutosh 
Chauhan) (brock: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1515911)
* /hive/trunk/ql/src/test/queries/clientpositive/orc_dictionary_threshold.q
* /hive/trunk/ql/src/test/results/clientpositive/orc_dictionary_threshold.q.out


 orc_dictionary_threshold is not deterministic
 -

 Key: HIVE-5117
 URL: https://issues.apache.org/jira/browse/HIVE-5117
 Project: Hive
  Issue Type: Test
  Components: Tests
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Fix For: 0.12.0

 Attachments: HIVE-5117.D12363.1.patch


 orc_dictionary_threshold.q produces a different result on hadoop2



[jira] [Commented] (HIVE-5120) document what hive.server2.thrift.sasl.qop values mean in hive-default.xml.template

2013-08-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13745998#comment-13745998
 ] 

Hudson commented on HIVE-5120:
--

ABORTED: Integrated in Hive-trunk-hadoop2 #373 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/373/])
HIVE-5120 : document what hive.server2.thrift.sasl.qop values mean in 
hive-default.xml.template (Thejas Nair via Ashutosh Chauhan) (hashutosh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1515837)
* /hive/trunk/conf/hive-default.xml.template


 document what hive.server2.thrift.sasl.qop values mean in 
 hive-default.xml.template
 ---

 Key: HIVE-5120
 URL: https://issues.apache.org/jira/browse/HIVE-5120
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.12.0

 Attachments: HIVE-5120.1.patch, HIVE-5120.2.patch


 The current description of the configuration does not say what the values of 
 the hive.server2.thrift.sasl.qop property mean, and also does not say that it 
 works only with Kerberos authentication turned on.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4779) Enhance coverage of package org.apache.hadoop.hive.ql.udf

2013-08-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13746002#comment-13746002
 ] 

Hudson commented on HIVE-4779:
--

ABORTED: Integrated in Hive-trunk-hadoop2 #373 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/373/])
HIVE-4779 : Enhance coverage of package org.apache.hadoop.hive.ql.udf (Ivan 
Veselovsky via Ashutosh Chauhan) (hashutosh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1515946)
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFBaseCompare.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToDate.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFUnixTimeStamp.java
* /hive/trunk/ql/src/test/queries/clientpositive/create_udaf.q
* /hive/trunk/ql/src/test/queries/clientpositive/udf4.q
* /hive/trunk/ql/src/test/queries/clientpositive/udf_pmod.q
* /hive/trunk/ql/src/test/queries/clientpositive/udf_to_boolean.q
* /hive/trunk/ql/src/test/queries/clientpositive/udf_to_byte.q
* /hive/trunk/ql/src/test/queries/clientpositive/udf_to_double.q
* /hive/trunk/ql/src/test/queries/clientpositive/udf_to_float.q
* /hive/trunk/ql/src/test/queries/clientpositive/udf_to_long.q
* /hive/trunk/ql/src/test/queries/clientpositive/udf_to_short.q
* /hive/trunk/ql/src/test/queries/clientpositive/udf_to_string.q
* /hive/trunk/ql/src/test/results/clientpositive/create_udaf.q.out
* /hive/trunk/ql/src/test/results/clientpositive/udf4.q.out
* /hive/trunk/ql/src/test/results/clientpositive/udf_pmod.q.out
* /hive/trunk/ql/src/test/results/clientpositive/udf_to_boolean.q.out
* /hive/trunk/ql/src/test/results/clientpositive/udf_to_byte.q.out
* /hive/trunk/ql/src/test/results/clientpositive/udf_to_double.q.out
* /hive/trunk/ql/src/test/results/clientpositive/udf_to_float.q.out
* /hive/trunk/ql/src/test/results/clientpositive/udf_to_long.q.out
* /hive/trunk/ql/src/test/results/clientpositive/udf_to_short.q.out
* /hive/trunk/ql/src/test/results/clientpositive/udf_to_string.q.out


 Enhance coverage of package org.apache.hadoop.hive.ql.udf
 -

 Key: HIVE-4779
 URL: https://issues.apache.org/jira/browse/HIVE-4779
 Project: Hive
  Issue Type: Test
Affects Versions: 0.12.0
Reporter: Ivan A. Veselovsky
Assignee: Ivan A. Veselovsky
 Fix For: 0.12.0

 Attachments: HIVE-4779.patch, HIVE-4779-trunk--N1.patch


 Enhance coverage of package org.apache.hadoop.hive.ql.udf up to 80%.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4963) Support in memory PTF partitions

2013-08-21 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13746034#comment-13746034
 ] 

Edward Capriolo commented on HIVE-4963:
---

I have a couple small comments.

The variable sz: I do not think we need it. Can't we determine the size from the 
collection? Also, in a couple of places we are using ArrayList on the left-hand side.

 Support in memory PTF partitions
 

 Key: HIVE-4963
 URL: https://issues.apache.org/jira/browse/HIVE-4963
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Harish Butani
 Attachments: HIVE-4963.D11955.1.patch, HIVE-4963.D12279.1.patch, 
 HIVE-4963.D12279.2.patch, PTFRowContainer.patch


 PTF partitions apply the defensive mode of assuming that partitions will not 
 fit in memory. Because of this there is a significant deserialization 
 overhead when accessing elements. 
 Allow the user to specify that there is enough memory to hold partitions 
 through a 'hive.ptf.partition.fits.in.mem' option.  
 Savings depends on partition size and in case of windowing the number of 
 UDAFs and the window ranges. For eg for the following (admittedly extreme) 
 case the PTFOperator exec times went from 39 secs to 8 secs.
  
 {noformat}
 select t, s, i, b, f, d,
 min(t) over(partition by 1 rows between unbounded preceding and current row), 
 min(s) over(partition by 1 rows between unbounded preceding and current row), 
 min(i) over(partition by 1 rows between unbounded preceding and current row), 
 min(b) over(partition by 1 rows between unbounded preceding and current row) 
 from over10k
 {noformat}
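If the option lands as described, enabling it would look like any other session-level setting; a hedged sketch reusing the benchmark query above (the option name comes from this description, and its default value is an assumption):

```sql
-- Assumption: the option defaults to off; enable it only when partitions
-- are known to fit in memory (name taken from the JIRA description above).
SET hive.ptf.partition.fits.in.mem=true;

-- Windowed aggregation in the style of the benchmark query.
SELECT t, s,
       min(t) OVER (PARTITION BY 1
                    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM over10k;
```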

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: No java compiler available exception for HWI

2013-08-21 Thread Edward Capriolo
We really should pre-compile the JSPs. There is a JIRA on this somewhere.

On Tuesday, August 20, 2013, Bing Li sarah.lib...@gmail.com wrote:
 Hi, Eric et al
 Did you resolve this failure?
 I'm using Hive-0.11.0, and get the same error when access to HWI via
browser.

 I already set the following properties in hive-site.xml
 - hive.hwi.listen.host
 - hive.hwi.listen.port
 - hive.hwi.war.file

 And copied two jasper jars into hive/lib:
 - jasper-compiler-5.5.23.jar
 - jasper-runtime-5.5.23.jar

 Thanks,
 - Bing
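The three hive-site.xml entries mentioned above look roughly like this (the host, port, and war-file values are placeholders, not values from this thread):

```xml
<!-- Hypothetical values; adjust host, port, and war path for your install -->
<property>
  <name>hive.hwi.listen.host</name>
  <value>0.0.0.0</value>
</property>
<property>
  <name>hive.hwi.listen.port</name>
  <value>9999</value>
</property>
<property>
  <name>hive.hwi.war.file</name>
  <value>lib/hive-hwi-0.11.0.war</value>
</property>
```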

 2013/8/20 Bing Li sarah.lib...@gmail.com

 Hi, Eric et al
 Did you resolve this failure?
 I'm using Hive-0.11.0, and get the same error when access to HWI via
browser.

 I already set the following properties in hive-site.xml
 - hive.hwi.listen.host
 - hive.hwi.listen.port
 - hive.hwi.war.file

 And copied two jasper jars into hive/lib:
 - jasper-compiler-5.5.23.jar
 - jasper-runtime-5.5.23.jar

 Thanks,
 - Bing


 2013/3/30 Eric Chu e...@rocketfuel.com

 Hi,
 I'm running Hive 0.10 and I want to support HWI (besides CLI and HUE).
When I started HWI I didn't get any error. However, when I went to Hive
Server Address:/hwi on my browser I saw the error below complaining
about No Java compiler available. My JAVA_HOME is set
to /usr/lib/jvm/java-1.6.0-sun-1.6.0.16.
 Besides https://cwiki.apache.org/Hive/hivewebinterface.html, there's not
much documentation on HWI. I'm wondering if anyone else has seen this or
has any idea about what's wrong?
 Thanks.
 Eric

 Problem accessing /hwi/. Reason:

 No Java compiler available

 Caused by:

 java.lang.IllegalStateException: No Java compiler available
 at
org.apache.jasper.JspCompilationContext.createCompiler(JspCompilationContext.java:225)
 at
org.apache.jasper.JspCompilationContext.compile(JspCompilationContext.java:560)
 at
org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:299)
 at
org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:315)
 at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:265)
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
 at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
 at
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:401)
 at
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
 at
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
 at
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
 at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
 at org.mortbay.jetty.servlet.Dispatcher.forward(Dispatcher.java:327)
 at org.mortbay.jetty.servlet.Dispatcher.forward(Dispatcher.java:126)
 at org.mortbay.jetty.servlet.DefaultServlet.doGet(DefaultServlet.java:503)
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
 at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
 at
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:401)
 at
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
 at
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
 at
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
 at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
 at
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
 at
org.mortbay.jetty.handler.RequestLogHandler.handle(RequestLogHandler.java:49)
 at org.mortbay.jetty.handler.


[jira] [Updated] (HIVE-5131) JDBC client's hive variables are not passed to HS2

2013-08-21 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-5131:
--

Fix Version/s: 0.12.0
Affects Version/s: 0.11.0
   Status: Patch Available  (was: Open)

 JDBC client's hive variables are not passed to HS2
 --

 Key: HIVE-5131
 URL: https://issues.apache.org/jira/browse/HIVE-5131
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.11.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Fix For: 0.12.0

 Attachments: HIVE-5131.patch, HIVE-5131.patch


 Related to HIVE-2914. However, HIVE-2914 seems to address the Hive CLI only; 
 JDBC clients suffer from the same problem. This was identified in HIVE-4568. I 
 decided it might be better to track it as a separate issue.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5131) JDBC client's hive variables are not passed to HS2

2013-08-21 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-5131:
--

Component/s: JDBC

 JDBC client's hive variables are not passed to HS2
 --

 Key: HIVE-5131
 URL: https://issues.apache.org/jira/browse/HIVE-5131
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 0.11.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Fix For: 0.12.0

 Attachments: HIVE-5131.patch, HIVE-5131.patch


 Related to HIVE-2914. However, HIVE-2914 seems to address the Hive CLI only; 
 JDBC clients suffer from the same problem. This was identified in HIVE-4568. I 
 decided it might be better to track it as a separate issue.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5131) JDBC client's hive variables are not passed to HS2

2013-08-21 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-5131:
--

Attachment: HIVE-5131.patch

Patch updated with test case

 JDBC client's hive variables are not passed to HS2
 --

 Key: HIVE-5131
 URL: https://issues.apache.org/jira/browse/HIVE-5131
 Project: Hive
  Issue Type: Bug
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-5131.patch, HIVE-5131.patch


 Related to HIVE-2914. However, HIVE-2914 seems to address the Hive CLI only; 
 JDBC clients suffer from the same problem. This was identified in HIVE-4568. I 
 decided it might be better to track it as a separate issue.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: [jira] [Created] (HIVE-5132) Can't access to hwi

2013-08-21 Thread Edward Capriolo
You might be able to fix this by using a JDK/SDK not only a jre.
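A quick way to verify the suggestion above, i.e. that the process is running on a full JDK rather than a bare JRE, is to check for javac on the PATH (a generic sketch, not specific to any node in this thread):

```shell
# Jasper compiles HWI's JSPs at runtime, which requires the JDK's compiler.
# A JRE ships java but no javac; this distinguishes the two cases.
if command -v javac >/dev/null 2>&1; then
  echo "JDK detected: $(command -v javac)"
else
  echo "No javac on PATH: JAVA_HOME likely points at a JRE, not a JDK"
fi
```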


On Wed, Aug 21, 2013 at 1:38 AM, Bing Li (JIRA) j...@apache.org wrote:

 Bing Li created HIVE-5132:
 -

  Summary: Can't access to hwi
  Key: HIVE-5132
  URL: https://issues.apache.org/jira/browse/HIVE-5132
  Project: Hive
   Issue Type: Bug
 Affects Versions: 0.11.0, 0.10.0
  Environment: JDK1.6, hadoop 2.0.4-alpha
 Reporter: Bing Li
 Priority: Critical


 I want to use hwi to submit hive queries, but after starting hwi
 successfully, I can't open its web page.

 I noticed that someone also met the same issue in hive-0.10.

 Reproduce steps:
 --
 1. start hwi
 bin/hive --config $HIVE_CONF_DIR --service hwi

 2. access to http://hive_hwi_node:/hwi via browser

 got the following error message:

 HTTP ERROR 500
 Problem accessing /hwi/. Reason:

 No Java compiler available

 Caused by:
 java.lang.IllegalStateException: No Java compiler available
 at
 org.apache.jasper.JspCompilationContext.createCompiler(JspCompilationContext.java:225)
 at
 org.apache.jasper.JspCompilationContext.compile(JspCompilationContext.java:560)
 at
 org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:299)
 at
 org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:315)
 at
 org.apache.jasper.servlet.JspServlet.service(JspServlet.java:265)
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
 at
 org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
 at
 org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:401)
 at
 org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
 at
 org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
 at
 org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
 at
 org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
 at
 org.mortbay.jetty.servlet.Dispatcher.forward(Dispatcher.java:327)
 at
 org.mortbay.jetty.servlet.Dispatcher.forward(Dispatcher.java:126)
 at
 org.mortbay.jetty.servlet.DefaultServlet.doGet(DefaultServlet.java:503)
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
 at
 org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
 at
 org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:401)
 at
 org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
 at
 org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
 at
 org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
 at
 org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
 at
 org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
 at
 org.mortbay.jetty.handler.RequestLogHandler.handle(RequestLogHandler.java:49)
 at
 org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
 at org.mortbay.jetty.Server.handle(Server.java:326)
 at
 org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
 at
 org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
 at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
 at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
 at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
 at
 org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
 at
 org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)



 --
 This message is automatically generated by JIRA.
 If you think it was sent incorrectly, please contact your JIRA
 administrators
 For more information on JIRA, see: http://www.atlassian.com/software/jira



[jira] [Updated] (HIVE-4214) OVER accepts general expression instead of just function

2013-08-21 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4214:
---

   Resolution: Fixed
Fix Version/s: 0.12.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Harish for the review!

 OVER accepts general expression instead of just function
 

 Key: HIVE-4214
 URL: https://issues.apache.org/jira/browse/HIVE-4214
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Affects Versions: 0.11.0
Reporter: Alan Gates
Assignee: Ashutosh Chauhan
 Fix For: 0.12.0

 Attachments: HIVE-4214.1.patch, HIVE-4214.3.patch, HIVE-4214.patch


 The query:
 select s, i, avg(d) / 10.0 over (partition by s order by i) from over100k;
 runs (and produces meaningless output).
 Over should not allow the arithmetic expression; only a UDAF or PTF function 
 should be valid there. The correct way to write this query is 
 select s, i, avg(d) over (partition by s order by i) / 10.0 from over100k;

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: No java compiler available exception for HWI

2013-08-21 Thread Bing Li
Hi, Edward
I filed it as HIVE-5132, did you mean this one?


2013/8/21 Edward Capriolo edlinuxg...@gmail.com

 We really should pre-compile the JSPs. There is a JIRA on this somewhere.

 On Tuesday, August 20, 2013, Bing Li sarah.lib...@gmail.com wrote:
  Hi, Eric et al
  Did you resolve this failure?
  I'm using Hive-0.11.0, and get the same error when access to HWI via
 browser.
 
  I already set the following properties in hive-site.xml
  - hive.hwi.listen.host
  - hive.hwi.listen.port
  - hive.hwi.war.file
 
  And copied two jasper jars into hive/lib:
  - jasper-compiler-5.5.23.jar
  - jasper-runtime-5.5.23.jar
 
  Thanks,
  - Bing
 
  2013/8/20 Bing Li sarah.lib...@gmail.com
 
  Hi, Eric et al
  Did you resolve this failure?
  I'm using Hive-0.11.0, and get the same error when access to HWI via
 browser.
 
  I already set the following properties in hive-site.xml
  - hive.hwi.listen.host
  - hive.hwi.listen.port
  - hive.hwi.war.file
 
  And copied two jasper jars into hive/lib:
  - jasper-compiler-5.5.23.jar
  - jasper-runtime-5.5.23.jar
 
  Thanks,
  - Bing
 
 
  2013/3/30 Eric Chu e...@rocketfuel.com
 
  Hi,
  I'm running Hive 0.10 and I want to support HWI (besides CLI and HUE).
 When I started HWI I didn't get any error. However, when I went to Hive
 Server Address:/hwi on my browser I saw the error below complaining
 about No Java compiler available. My JAVA_HOME is set
 to /usr/lib/jvm/java-1.6.0-sun-1.6.0.16.
  Besides https://cwiki.apache.org/Hive/hivewebinterface.html, there's not
 much documentation on HWI. I'm wondering if anyone else has seen this or
 has any idea about what's wrong?
  Thanks.
  Eric
 
  Problem accessing /hwi/. Reason:
 
  No Java compiler available
 
  Caused by:
 
  java.lang.IllegalStateException: No Java compiler available
  at

 org.apache.jasper.JspCompilationContext.createCompiler(JspCompilationContext.java:225)
  at

 org.apache.jasper.JspCompilationContext.compile(JspCompilationContext.java:560)
  at

 org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:299)
  at
 org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:315)
  at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:265)
  at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
  at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
  at
 org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:401)
  at
 org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
  at
 org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
  at
 org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
  at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
  at org.mortbay.jetty.servlet.Dispatcher.forward(Dispatcher.java:327)
  at org.mortbay.jetty.servlet.Dispatcher.forward(Dispatcher.java:126)
  at
 org.mortbay.jetty.servlet.DefaultServlet.doGet(DefaultServlet.java:503)
  at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
  at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
  at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
  at
 org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:401)
  at
 org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
  at
 org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
  at
 org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
  at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
  at
 org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
  at

 org.mortbay.jetty.handler.RequestLogHandler.handle(RequestLogHandler.java:49)
  at org.mortbay.jetty.handler.



Re: [jira] [Created] (HIVE-5132) Can't access to hwi

2013-08-21 Thread Bing Li
Hi, Edward
The node running the hwi service uses a JDK.
Did you mean the node running the web browser?


[jira] [Updated] (HIVE-1511) Hive plan serialization is slow

2013-08-21 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-1511:
---

Attachment: HIVE-1511.8.patch

opParseCtxMap cannot be transient since we need this for the clone operation. 
v8 removes this and adds some debug code, which I will comment on shortly.
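Kryo's default field serializer, like Java's built-in serialization, skips transient fields, which is why a transient opParseCtxMap would not survive the serialize/deserialize clone. A generic java.io illustration of the effect (not Hive code; the class and field names here are made up):

```java
import java.io.*;
import java.util.*;

// Generic illustration: a transient field is skipped during serialization,
// so a serialize/deserialize "clone" comes back with that field null.
public class TransientCloneDemo {
    static class Plan implements Serializable {
        String name = "plan";
        transient Map<String, String> ctx = new HashMap<>(); // lost on clone
    }

    // Round-trip a Plan through an in-memory byte stream, like a deep clone.
    static Plan clone(Plan p) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        new ObjectOutputStream(bos).writeObject(p);
        ObjectInputStream in =
            new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()));
        return (Plan) in.readObject();
    }

    public static void main(String[] args) throws Exception {
        Plan copy = clone(new Plan());
        // The transient map is not restored; it comes back null.
        System.out.println("name=" + copy.name + " ctx=" + copy.ctx);
    }
}
```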

 Hive plan serialization is slow
 ---

 Key: HIVE-1511
 URL: https://issues.apache.org/jira/browse/HIVE-1511
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.7.0
Reporter: Ning Zhang
Assignee: Mohammad Kamrul Islam
 Attachments: HIVE-1511.4.patch, HIVE-1511.5.patch, HIVE-1511.6.patch, 
 HIVE-1511.7.patch, HIVE-1511.8.patch, HIVE-1511.patch, HIVE-1511-wip2.patch, 
 HIVE-1511-wip3.patch, HIVE-1511-wip4.patch, HIVE-1511-wip.patch


 As reported by Edward Capriolo:
 For reference I did this as a test case
 SELECT * FROM src where
 key=0 OR key=0 OR key=0 OR  key=0 OR key=0 OR key=0 OR key=0 OR key=0
 OR key=0 OR key=0 OR key=0 OR
 key=0 OR key=0 OR key=0 OR  key=0 OR key=0 OR key=0 OR key=0 OR key=0
 OR key=0 OR key=0 OR key=0 OR
 ...(100 more of these)
 No OOM but I gave up after the test case did not go anywhere for about
 2 minutes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-1511) Hive plan serialization is slow

2013-08-21 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13746496#comment-13746496
 ] 

Brock Noland commented on HIVE-1511:


I am leaving for vacation this afternoon so I won't be able to help with this 
effort for over a week. I've been working with the test 
bucketsortoptimize_insert_2.q which fails with KryoException: Encountered 
unregistered class ID: 20 during clone of the query plan. All other 
serialization seems to work fine.  I added some debug code to the Utilities 
class to help debug this issue. It appears that either the data written out is 
corrupt or it gets confused on read. Below I have the trace logs to show it.

Here are the write logs; I have placed a comment where the write and read logs 
start to differ:

{noformat}
00:42 TRACE: [kryo] Write field: rowSchema 
(org.apache.hadoop.hive.ql.parse.RowResolver) pos=873
00:42 TRACE: [kryo] Write class name reference 21: 
org.apache.hadoop.hive.ql.exec.RowSchema
00:42 TRACE: [kryo] setGenerics
00:42 TRACE: [kryo] Write initial object reference 1014: _col0: int_col1: 
string_col6: string)
00:42 DEBUG: [kryo] Write: _col0: int_col1: string_col6: string)
00:42 TRACE: [kryo] FieldSerializer.write fields of class 
org.apache.hadoop.hive.ql.exec.RowSchema
00:42 TRACE: [kryo] Write field: signature 
(org.apache.hadoop.hive.ql.exec.RowSchema) pos=876
00:42 TRACE: [kryo] Write class name reference 9: java.util.ArrayList
00:42 DEBUG: [kryo] Write object reference 625: [_col0: int, _col1: string, 
_col6: string]
00:42 TRACE: [kryo] Write field: rslvMap 
(org.apache.hadoop.hive.ql.parse.RowResolver) pos=880
00:42 TRACE: [kryo] Write class name reference 30: java.util.HashMap
00:42 TRACE: [kryo] Write initial object reference 1015: {b={value=_col6: 
string}, a={key=_col0: int, value=_col1: string}}
00:42 DEBUG: [kryo] Write: {b={value=_col6: string}, a={key=_col0: int, 
value=_col1: string}}
00:42 DEBUG: [kryo] Write object reference 436: b
00:42 TRACE: [kryo] Write class name reference 1: java.util.LinkedHashMap
00:42 TRACE: [kryo] Write initial object reference 1016: {value=_col6: string}
00:42 DEBUG: [kryo] Write: {value=_col6: string}
00:42 DEBUG: [kryo] Write object reference 479: value
# Here is where it gets confused on the read side ###
00:42 DEBUG: [kryo] Write object reference 628: _col6: string
00:42 DEBUG: [kryo] Write object reference 429: a
00:42 TRACE: [kryo] Write class name reference 1: java.util.LinkedHashMap
00:42 TRACE: [kryo] Write initial object reference 1017: {key=_col0: int, 
value=_col1: string}
00:42 DEBUG: [kryo] Write: {key=_col0: int, value=_col1: string}
00:42 TRACE: [kryo] Write class 1: String
00:42 DEBUG: [kryo] Write object reference 477: key
00:42 TRACE: [kryo] Write class name reference 22: 
org.apache.hadoop.hive.ql.exec.ColumnInfo
00:42 DEBUG: [kryo] Write object reference 626: _col0: int
00:42 TRACE: [kryo] Write class 1: String
00:42 DEBUG: [kryo] Write object reference 479: value
00:42 TRACE: [kryo] Write class name reference 22: 
org.apache.hadoop.hive.ql.exec.ColumnInfo
00:42 DEBUG: [kryo] Write object reference 627: _col1: string
00:42 TRACE: [kryo] Write class name reference 10: 
org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator
00:42 DEBUG: [kryo] Write object reference 372: MAPJOIN[12]
00:42 TRACE: [kryo] Write class name reference 43: 
org.apache.hadoop.hive.ql.parse.OpParseContext
{noformat}

As you can see above, after writing "Write object reference 479: value", Kryo 
writes "Write object reference 628: _col6: string". Below we are able to read 
"Read object reference 479: value", but then it starts reading junk and fails 
shortly thereafter.

{noformat}
00:45 TRACE: [kryo] Read field: rowSchema 
(org.apache.hadoop.hive.ql.parse.RowResolver) pos=873
00:45 TRACE: [kryo] Read class name reference 21: 
org.apache.hadoop.hive.ql.exec.RowSchema
00:45 TRACE: [kryo] setGenerics
00:45 TRACE: [kryo] Read initial object reference 1014: 
org.apache.hadoop.hive.ql.exec.RowSchema
00:45 TRACE: [kryo] Read field: signature 
(org.apache.hadoop.hive.ql.exec.RowSchema) pos=876
00:45 TRACE: [kryo] Read class name reference 9: java.util.ArrayList
00:45 DEBUG: [kryo] Read object reference 625: [_col0: int, _col1: string, 
_col6: string]
00:45 DEBUG: [kryo] Read: _col0: int_col1: string_col6: string)
00:45 TRACE: [kryo] Read field: rslvMap 
(org.apache.hadoop.hive.ql.parse.RowResolver) pos=880
00:45 TRACE: [kryo] Read class name reference 30: java.util.HashMap
00:45 TRACE: [kryo] Read initial object reference 1015: java.util.HashMap
00:45 DEBUG: [kryo] Read object reference 436: b
00:45 TRACE: [kryo] Read class name reference 1: java.util.LinkedHashMap
00:45 TRACE: [kryo] Read initial object reference 1016: java.util.LinkedHashMap
00:45 DEBUG: [kryo] Read object reference 479: value
### Here it appears to get confused #
00:45 TRACE: [kryo] Read: -216
00:45 DEBUG: 

Re: Last time request for cwiki update privileges

2013-08-21 Thread Stephen Sprague
Sanjay gets some love after all! :)


On Tue, Aug 20, 2013 at 4:00 PM, Sanjay Subramanian 
sanjay.subraman...@wizecommerce.com wrote:

 Thanks Ashutosh

 From: Ashutosh Chauhan hashut...@apache.org
 Reply-To: u...@hive.apache.org
 Date: Tuesday, August 20, 2013 3:13 PM
 To: u...@hive.apache.org
 Cc: dev@hive.apache.org
 Subject: Re: Last time request for cwiki update privileges

 Hi Sanjay,

 Really sorry for that. I apologize for the delay. You are added now. Feel
 free to make changes to make Hive even better!

 Thanks,
 Ashutosh


 On Tue, Aug 20, 2013 at 2:39 PM, Sanjay Subramanian 
 sanjay.subraman...@wizecommerce.com wrote:
 Hey guys

 I can only think of two reasons why my request has not yet been accepted:

 1. The admins don't want to give me access

 2. The admins have not seen my mail yet.

 This is the fourth and the LAST time I am requesting permission to edit
 wiki docs… Nobody likes being ignored, and that includes me.

 Meanwhile, to show my thankfulness to the Hive community, I shall continue
 to answer questions. There will be no change in that behavior.

 Regards

 sanjay




 From: Sanjay Subramanian sanjay.subraman...@wizecommerce.com
 Date: Wednesday, August 14, 2013 3:52 PM
 To: u...@hive.apache.org
 Cc: dev@hive.apache.org
 Subject: Re: Review Request (wikidoc): LZO Compression in Hive

 Once again, I am down on my knees, humbly calling upon the Hive Jedi
 Masters to please provide this padawan with cwiki update privileges

 May the Force be with u

 Thanks

 sanjay

 From: Sanjay Subramanian sanjay.subraman...@wizecommerce.com
 Reply-To: u...@hive.apache.org
 Date: Wednesday, July 31, 2013 9:38 AM
 To: u...@hive.apache.org
 Cc: dev@hive.apache.org
 Subject: Re: Review Request (wikidoc): LZO Compression in Hive

 Hi guys

 Any chance I could get cwiki update privileges today ?

 Thanks

 sanjay

 From: Sanjay Subramanian sanjay.subraman...@wizecommerce.com
 Date: Tuesday, July 30, 2013 4:26 PM
 To: u...@hive.apache.org
 Cc: dev@hive.apache.org
 Subject: Review Request (wikidoc): LZO Compression in Hive

 Hi

  Met with Lefty this afternoon and she was kind enough to spend time adding my
  documentation to the site - since I still don't have editing privileges :-)

 Please review the new wikidoc about LZO compression in the Hive language
 manual.  If anything is unclear or needs more information, you can email
 suggestions to this list or edit the wiki yourself (if you have editing
 privileges).  Here are the links:

   1.  Language Manual
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual (new
 bullet under File Formats)
   2.  LZO Compression
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LZO
   3.  CREATE TABLE
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTable
 (near end of section, pasted in here:)
 Use STORED AS TEXTFILE if the data needs to be stored as plain text files.
 Use STORED AS SEQUENCEFILE if the data needs to be compressed. Please read
 more about CompressedStorage
 https://cwiki.apache.org/confluence/display/Hive/CompressedStorage if
 you are planning to keep data compressed in your Hive tables. Use
 INPUTFORMAT and OUTPUTFORMAT to specify the name of a corresponding
 InputFormat and OutputFormat class as a string literal, e.g.,
 'org.apache.hadoop.hive.contrib.fileformat.base64.Base64TextInputFormat'.
 For LZO compression, the values to use are 'INPUTFORMAT
 com.hadoop.mapred.DeprecatedLzoTextInputFormat OUTPUTFORMAT
 org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' (see LZO
 Compression
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LZO).
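 As an illustrative sketch only (the table and column names below are hypothetical; the class names are the ones quoted above from the wikidoc), a CREATE TABLE statement for an LZO-compressed text table might look like:

```sql
-- Hypothetical table; the INPUTFORMAT/OUTPUTFORMAT class names are those
-- given in the LZO Compression wikidoc quoted above.
CREATE TABLE lzo_example (id INT, msg STRING)
STORED AS
  INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';
```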

 My cwiki id is
 https://cwiki.apache.org/confluence/display/~sanjaysubraman...@yahoo.com
 It would be great if I could get edit privileges

 Thanks
 sanjay

 CONFIDENTIALITY NOTICE
 ==
 This email message and any attachments are for the exclusive use of the
 intended recipient(s) and may contain confidential and privileged
 information. Any unauthorized review, use, disclosure 

[jira] [Commented] (HIVE-4963) Support in memory PTF partitions

2013-08-21 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13746530#comment-13746530
 ] 

Phabricator commented on HIVE-4963:
---

ashutoshc has commented on the revision HIVE-4963 [jira] Support in memory PTF 
partitions.

  Seems like there are more opportunities to make this efficient, but those can 
be dug into later. This patch is a step in the right direction by reusing 
existing infra. Any improvements we now make may benefit other spilling 
operators like join too. Really makes me happy : )
  Apart from the code comments, I will also request that you add a testcase which 
sets the config value (cachesize) to zero, so that it spills for every record 
and exercises all these new codepaths.

INLINE COMMENTS
  ql/src/java/org/apache/hadoop/hive/ql/exec/PTFPartition.java:89 I think you 
need to do this because current RowContainers can only hold plain Java objects. 
It seems we can improve this by writing a RowContainer which can hold 
writables, thus avoiding unnecessary deserialization and mem-cpy here. 
Something worth exploring as a follow-up issue.
  ql/src/java/org/apache/hadoop/hive/ql/exec/PTFPartition.java:57 this config 
should really govern how much memory we are willing to allocate (in bytes), not 
a number of rows, but that's a topic for another JIRA since you are reusing 
existing code.
  ql/src/java/org/apache/hadoop/hive/ql/exec/PTFPartition.java:148 This sanity 
check is in a tight loop. Ideally we should not have such checks in an inner loop, 
but let's leave it here till we get more confidence in the code. It will be good 
to add a note about what the assumption will be if we are to get rid of this 
check in the future.
  ql/src/java/org/apache/hadoop/hive/ql/exec/PTFPartition.java:137 Instead of 
try-catch-rethrow, shall we just add throws to the method signature? That makes 
the code more readable and arguably faster.
  ql/src/java/org/apache/hadoop/hive/ql/exec/PTFPartition.java:160 Similar 
comment about try-catch-rethrow.
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/PTFRowContainer.java:80 
Awesome comments!
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/PTFRowContainer.java:94 
If I get this right, this function will again do serialization before spilling, 
so in case of memory pressure we are doing a round trip of ser-deser without 
performing useful work. This ties back to my earlier comment on eager 
deserialization.
  This whole mechanism is worth exploring later.

REVISION DETAIL
  https://reviews.facebook.net/D12279

To: JIRA, ashutoshc, hbutani


 Support in memory PTF partitions
 

 Key: HIVE-4963
 URL: https://issues.apache.org/jira/browse/HIVE-4963
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Harish Butani
 Attachments: HIVE-4963.D11955.1.patch, HIVE-4963.D12279.1.patch, 
 HIVE-4963.D12279.2.patch, PTFRowContainer.patch


 PTF partitions apply the defensive mode of assuming that partitions will not 
 fit in memory. Because of this there is a significant deserialization 
 overhead when accessing elements. 
 Allow the user to specify that there is enough memory to hold partitions 
 through a 'hive.ptf.partition.fits.in.mem' option.  
 Savings depend on partition size and, in the case of windowing, on the number 
 of UDAFs and the window ranges. For example, for the following (admittedly 
 extreme) case the PTFOperator exec times went from 39 secs to 8 secs.
  
 {noformat}
 select t, s, i, b, f, d,
 min(t) over(partition by 1 rows between unbounded preceding and current row), 
 min(s) over(partition by 1 rows between unbounded preceding and current row), 
 min(i) over(partition by 1 rows between unbounded preceding and current row), 
 min(b) over(partition by 1 rows between unbounded preceding and current row) 
 from over10k
 {noformat}
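 As a sketch under the issue description's own assumptions, turning the feature on before running such a query would just be a session setting (the option name is the one quoted above; it is proposed in this issue, not necessarily in released Hive):

```sql
-- 'hive.ptf.partition.fits.in.mem' is the option name proposed in this issue.
SET hive.ptf.partition.fits.in.mem=true;
-- ...then run the windowing query shown above over the over10k table.
```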

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4963) Support in memory PTF partitions

2013-08-21 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo updated HIVE-4963:
--

Issue Type: New Feature  (was: Bug)

 Support in memory PTF partitions
 

 Key: HIVE-4963
 URL: https://issues.apache.org/jira/browse/HIVE-4963
 Project: Hive
  Issue Type: New Feature
  Components: PTF-Windowing
Reporter: Harish Butani
 Attachments: HIVE-4963.D11955.1.patch, HIVE-4963.D12279.1.patch, 
 HIVE-4963.D12279.2.patch, PTFRowContainer.patch


 PTF partitions apply the defensive mode of assuming that partitions will not 
 fit in memory. Because of this there is a significant deserialization 
 overhead when accessing elements. 
 Allow the user to specify that there is enough memory to hold partitions 
 through a 'hive.ptf.partition.fits.in.mem' option.  
 Savings depend on partition size and, in the case of windowing, on the number 
 of UDAFs and the window ranges. For example, for the following (admittedly 
 extreme) case the PTFOperator exec times went from 39 secs to 8 secs.
  
 {noformat}
 select t, s, i, b, f, d,
 min(t) over(partition by 1 rows between unbounded preceding and current row), 
 min(s) over(partition by 1 rows between unbounded preceding and current row), 
 min(i) over(partition by 1 rows between unbounded preceding and current row), 
 min(b) over(partition by 1 rows between unbounded preceding and current row) 
 from over10k
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4963) Support in memory PTF partitions

2013-08-21 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13746535#comment-13746535
 ] 

Edward Capriolo commented on HIVE-4963:
---

{quote}
ql/src/java/org/apache/hadoop/hive/ql/exec/PTFPartition.java:89 I think you 
need to do this because current RowContainers can only hold plain Java objects. 
It seems we can improve this by writing a RowContainer which can hold 
writables, thus avoiding unnecessary deserialization and mem-cpy here. 
Something worth exploring as a follow-up issue.
{quote}
Is it much more work to do this now? There are already a number of PTF 
to-be-cleaned-ups and I would hate to add more.

 Support in memory PTF partitions
 

 Key: HIVE-4963
 URL: https://issues.apache.org/jira/browse/HIVE-4963
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Harish Butani
 Attachments: HIVE-4963.D11955.1.patch, HIVE-4963.D12279.1.patch, 
 HIVE-4963.D12279.2.patch, PTFRowContainer.patch


 PTF partitions apply the defensive mode of assuming that partitions will not 
 fit in memory. Because of this there is a significant deserialization 
 overhead when accessing elements. 
 Allow the user to specify that there is enough memory to hold partitions 
 through a 'hive.ptf.partition.fits.in.mem' option.  
 Savings depend on partition size and, in the case of windowing, on the number 
 of UDAFs and the window ranges. For example, for the following (admittedly 
 extreme) case the PTFOperator exec times went from 39 secs to 8 secs.
  
 {noformat}
 select t, s, i, b, f, d,
 min(t) over(partition by 1 rows between unbounded preceding and current row), 
 min(s) over(partition by 1 rows between unbounded preceding and current row), 
 min(i) over(partition by 1 rows between unbounded preceding and current row), 
 min(b) over(partition by 1 rows between unbounded preceding and current row) 
 from over10k
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4963) Support in memory PTF partitions

2013-08-21 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13746554#comment-13746554
 ] 

Ashutosh Chauhan commented on HIVE-4963:


Yes, this is much more work to do. More importantly, it's not PTF specific 
either; it's in existing code which Harish has chosen to reuse. I don't think 
it's fair to hold on to this patch for this. It can be done in a follow-up.

 Support in memory PTF partitions
 

 Key: HIVE-4963
 URL: https://issues.apache.org/jira/browse/HIVE-4963
 Project: Hive
  Issue Type: New Feature
  Components: PTF-Windowing
Reporter: Harish Butani
 Attachments: HIVE-4963.D11955.1.patch, HIVE-4963.D12279.1.patch, 
 HIVE-4963.D12279.2.patch, PTFRowContainer.patch


 PTF partitions apply the defensive mode of assuming that partitions will not 
 fit in memory. Because of this there is a significant deserialization 
 overhead when accessing elements. 
 Allow the user to specify that there is enough memory to hold partitions 
 through a 'hive.ptf.partition.fits.in.mem' option.  
 Savings depend on partition size and, in the case of windowing, on the number 
 of UDAFs and the window ranges. For example, for the following (admittedly 
 extreme) case the PTFOperator exec times went from 39 secs to 8 secs.
  
 {noformat}
 select t, s, i, b, f, d,
 min(t) over(partition by 1 rows between unbounded preceding and current row), 
 min(s) over(partition by 1 rows between unbounded preceding and current row), 
 min(i) over(partition by 1 rows between unbounded preceding and current row), 
 min(b) over(partition by 1 rows between unbounded preceding and current row) 
 from over10k
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4963) Support in memory PTF partitions

2013-08-21 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13746557#comment-13746557
 ] 

Ashutosh Chauhan commented on HIVE-4963:


Harish, can you also get rid of the config variables in HiveConf which were 
about the size of the persistent byte list? Those will become irrelevant after 
this patch. Also, do you think we can word the title of this JIRA better so it 
helps folks understand this work?

 Support in memory PTF partitions
 

 Key: HIVE-4963
 URL: https://issues.apache.org/jira/browse/HIVE-4963
 Project: Hive
  Issue Type: New Feature
  Components: PTF-Windowing
Reporter: Harish Butani
 Attachments: HIVE-4963.D11955.1.patch, HIVE-4963.D12279.1.patch, 
 HIVE-4963.D12279.2.patch, PTFRowContainer.patch


 PTF partitions apply the defensive mode of assuming that partitions will not 
 fit in memory. Because of this there is a significant deserialization 
 overhead when accessing elements. 
 Allow the user to specify that there is enough memory to hold partitions 
 through a 'hive.ptf.partition.fits.in.mem' option.  
 Savings depend on partition size and, in the case of windowing, on the number 
 of UDAFs and the window ranges. For example, for the following (admittedly 
 extreme) case the PTFOperator exec times went from 39 secs to 8 secs.
  
 {noformat}
 select t, s, i, b, f, d,
 min(t) over(partition by 1 rows between unbounded preceding and current row), 
 min(s) over(partition by 1 rows between unbounded preceding and current row), 
 min(i) over(partition by 1 rows between unbounded preceding and current row), 
 min(b) over(partition by 1 rows between unbounded preceding and current row) 
 from over10k
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4963) Support in memory PTF partitions

2013-08-21 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4963:
---

Assignee: Harish Butani

 Support in memory PTF partitions
 

 Key: HIVE-4963
 URL: https://issues.apache.org/jira/browse/HIVE-4963
 Project: Hive
  Issue Type: New Feature
  Components: PTF-Windowing
Reporter: Harish Butani
Assignee: Harish Butani
 Attachments: HIVE-4963.D11955.1.patch, HIVE-4963.D12279.1.patch, 
 HIVE-4963.D12279.2.patch, PTFRowContainer.patch


 PTF partitions apply the defensive mode of assuming that partitions will not 
 fit in memory. Because of this there is a significant deserialization 
 overhead when accessing elements. 
 Allow the user to specify that there is enough memory to hold partitions 
 through a 'hive.ptf.partition.fits.in.mem' option.  
 Savings depend on partition size and, in the case of windowing, on the number 
 of UDAFs and the window ranges. For example, for the following (admittedly 
 extreme) case the PTFOperator exec times went from 39 secs to 8 secs.
  
 {noformat}
 select t, s, i, b, f, d,
 min(t) over(partition by 1 rows between unbounded preceding and current row), 
 min(s) over(partition by 1 rows between unbounded preceding and current row), 
 min(i) over(partition by 1 rows between unbounded preceding and current row), 
 min(b) over(partition by 1 rows between unbounded preceding and current row) 
 from over10k
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


HIVE-4568 Review Request

2013-08-21 Thread Xuefu Zhang
Hi all,

The patch for this JIRA has been pending for quite some time. Carl
expressed interest in reviewing it, but it seems he is not available to do
so. I'm wondering if any other committer can help on this. Thanks a lot.

https://reviews.apache.org/r/11334/

Regards,
Xuefu


[jira] [Commented] (HIVE-4963) Support in memory PTF partitions

2013-08-21 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13746617#comment-13746617
 ] 

Edward Capriolo commented on HIVE-4963:
---

{quote}
Yes, this is much more work to do. More importantly, it's not PTF specific 
either; it's in existing code which Harish has chosen to reuse. I don't think 
it's fair to hold on to this patch for this. It can be done in a follow-up.
{quote}
Agreed. If we are extending an existing component that already does it this 
way, changing both is out of scope.

 Support in memory PTF partitions
 

 Key: HIVE-4963
 URL: https://issues.apache.org/jira/browse/HIVE-4963
 Project: Hive
  Issue Type: New Feature
  Components: PTF-Windowing
Reporter: Harish Butani
Assignee: Harish Butani
 Attachments: HIVE-4963.D11955.1.patch, HIVE-4963.D12279.1.patch, 
 HIVE-4963.D12279.2.patch, PTFRowContainer.patch


 PTF partitions apply the defensive mode of assuming that partitions will not 
 fit in memory. Because of this there is a significant deserialization 
 overhead when accessing elements. 
 Allow the user to specify that there is enough memory to hold partitions 
 through a 'hive.ptf.partition.fits.in.mem' option.  
 Savings depend on partition size and, in the case of windowing, on the number 
 of UDAFs and the window ranges. For example, for the following (admittedly 
 extreme) case the PTFOperator exec times went from 39 secs to 8 secs.
  
 {noformat}
 select t, s, i, b, f, d,
 min(t) over(partition by 1 rows between unbounded preceding and current row), 
 min(s) over(partition by 1 rows between unbounded preceding and current row), 
 min(i) over(partition by 1 rows between unbounded preceding and current row), 
 min(b) over(partition by 1 rows between unbounded preceding and current row) 
 from over10k
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5107) Change hive's build to maven

2013-08-21 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13746626#comment-13746626
 ] 

Edward Capriolo commented on HIVE-5107:
---

I am hitting a weird blocker now with antlr generation of the ql/hive-exec 
project. My antlr+plugin combination was able to generate the hive-metastore .g 
files ok but is having issues with HiveLexer.g and HiveParser.g. That is my 
biggest blocker at the moment. If the issue keeps up I may switch to exec for 
the time being. 



 Change hive's build to maven
 

 Key: HIVE-5107
 URL: https://issues.apache.org/jira/browse/HIVE-5107
 Project: Hive
  Issue Type: Task
Reporter: Edward Capriolo
Assignee: Edward Capriolo

 I cannot cope with Hive's build infrastructure any more. I have started 
 working on porting the project to Maven. When I have some solid progress I 
 will put the entire thing on GitHub for review. Then we can talk about 
 switching the project somehow.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: HIVE-4568 Review Request

2013-08-21 Thread Edward Capriolo
I will look at it.


On Wed, Aug 21, 2013 at 1:05 PM, Xuefu Zhang xzh...@cloudera.com wrote:

 Hi all,

 The patch for this JIRA has been pending for quite some time. Carl
 expressed interest in reviewing it, but it seems he is not available to do
 so. I'm wondering if any other committer can help on this. Thanks a lot.

 https://reviews.apache.org/r/11334/

 Regards,
 Xuefu



Re: HIVE-4568 Review Request

2013-08-21 Thread Xuefu Zhang
Thank you, Edward!

--Xuefu


On Wed, Aug 21, 2013 at 10:46 AM, Edward Capriolo edlinuxg...@gmail.comwrote:

 I will look at it.


 On Wed, Aug 21, 2013 at 1:05 PM, Xuefu Zhang xzh...@cloudera.com wrote:

  Hi all,
 
  The patch for this JIRA has been pending for quite some time. Carl
  expressed interest in reviewing it, but it seems he is not available to
 do
  so. I'm wondering if any other committer can help on this. Thanks a lot.
 
  https://reviews.apache.org/r/11334/
 
  Regards,
  Xuefu
 



[jira] [Updated] (HIVE-5133) webhcat jobs that need to access metastore fails in secure mode

2013-08-21 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-5133:


Attachment: HIVE-5133.1.patch

 webhcat jobs that need to access metastore fails in secure mode
 ---

 Key: HIVE-5133
 URL: https://issues.apache.org/jira/browse/HIVE-5133
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.11.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-5133.1.patch


 Webhcat job submission requests result in the pig/hive/mr job being run from 
 a map task that it launches. In secure mode, for the pig/hive/mr job to be 
 authorized to perform actions on the metastore, it has to have the 
 delegation tokens from the hive metastore.
 In the case of a pig/MR job this is needed if hcatalog is being used in the 
 script/job.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5104) HCatStorer fails to store boolean type

2013-08-21 Thread Karl D. Gierach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl D. Gierach updated HIVE-5104:
--

Reproduced In: 0.11.0
   Status: Patch Available  (was: Open)

code modified:

1) 
hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hcatalog/pig/HCatBaseStorer.java
  a) getHCatFSFromPigFS(...)
  b) getJavaObj(...)

2) 
hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hcatalog/pig/TestHCatStorer.java
  a) testStoreFuncAllSimpleTypes()


 HCatStorer fails to store boolean type
 --

 Key: HIVE-5104
 URL: https://issues.apache.org/jira/browse/HIVE-5104
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.11.0
Reporter: Ron Frohock

 Unable to store boolean values to HCat table 
 Assume in Hive you have two tables...
 CREATE TABLE btest(test boolean);
 CREATE TABLE btest2(test boolean);
 Then in Pig 
 A = LOAD 'btest' USING org.apache.hcatalog.pig.HCatLoader();
 STORE A INTO 'btest2' USING org.apache.hcatalog.pig.HCatStorer();
 You will get an ERROR 115: Unsupported type 5: in Pig's Schema  
 Checking HCatBaseStorer.java, the case for data types doesn't check for 
 booleans.  Might have been overlooked in adding boolean to Pig in 0.10

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5104) HCatStorer fails to store boolean type

2013-08-21 Thread Karl D. Gierach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl D. Gierach updated HIVE-5104:
--

Attachment: HIVE-5104.patch

The patch, based off GitHub's branch-0.11 branch.

 HCatStorer fails to store boolean type
 --

 Key: HIVE-5104
 URL: https://issues.apache.org/jira/browse/HIVE-5104
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.11.0
Reporter: Ron Frohock
 Attachments: HIVE-5104.patch


 Unable to store boolean values to HCat table 
 Assume in Hive you have two tables...
 CREATE TABLE btest(test boolean);
 CREATE TABLE btest2(test boolean);
 Then in Pig 
 A = LOAD 'btest' USING org.apache.hcatalog.pig.HCatLoader();
 STORE A INTO 'btest2' USING org.apache.hcatalog.pig.HCatStorer();
 You will get an ERROR 115: Unsupported type 5: in Pig's Schema  
 Checking HCatBaseStorer.java, the case for data types doesn't check for 
 booleans.  Might have been overlooked in adding boolean to Pig in 0.10

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-1511) Hive plan serialization is slow

2013-08-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13746695#comment-13746695
 ] 

Hive QA commented on HIVE-1511:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12599211/HIVE-1511.8.patch

{color:red}ERROR:{color} -1 due to 415 failed/errored test(s), 2895 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_named_struct
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_show_indexes_edge_cases
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_map_keys
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_quote2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_auto
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_covar_pop
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_escape1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input18
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_date_4
org.apache.hive.jdbc.TestJdbcDriver2.testResultSetMetaData
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_escape
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_find_in_set
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lateral_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_compression
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_degrees
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union15
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_xpath_int
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_genericudaf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_case_thrift
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_num_op_type_conv
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_second
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_joins
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_compression
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nested_complex
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_rc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ctas_date
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnstats_tbllvl
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part14
org.apache.hadoop.hive.ql.parse.TestParse.testParse_udf1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_lateralview
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_unix_timestamp
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multiMapJoin2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_compute_stats_string
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_compute_stats_binary
org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_into5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_to_unix_timestamp
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppr_pushdown2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_field
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_virtual_column
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ptf_general_queries
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_filters
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_min
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part15
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_empty_files
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_compact_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_div
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_map
org.apache.hadoop.hive.ql.parse.TestParse.testParse_cast1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_literal_string
org.apache.hive.jdbc.TestJdbcDriver2.testNullType
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_skew_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_substr
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_dynamicserde
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_context_ngrams

[jira] [Commented] (HIVE-2599) Support Composit/Compound Keys with HBaseStorageHandler

2013-08-21 Thread Swarnim Kulkarni (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13746743#comment-13746743
 ] 

Swarnim Kulkarni commented on HIVE-2599:


This patch has been available for quite some time and also passes all tests on 
Hive QA. If someone gets a chance to review this, I would really appreciate it.

 Support Composit/Compound Keys with HBaseStorageHandler
 ---

 Key: HIVE-2599
 URL: https://issues.apache.org/jira/browse/HIVE-2599
 Project: Hive
  Issue Type: Improvement
  Components: HBase Handler
Affects Versions: 0.8.0
Reporter: Hans Uhlig
Assignee: Swarnim Kulkarni
 Attachments: HIVE-2599.1.patch.txt, HIVE-2599.2.patch.txt, 
 HIVE-2599.2.patch.txt


 It would be really nice for hive to be able to understand composite keys from 
 an underlying HBase schema. Currently we have to store key fields twice to be 
 able to both form the key and make the data available. I noticed John Sichi 
 mentioned in HIVE-1228 that this would be a separate issue but I can't find 
 any follow-up. How feasible is this in the HBaseStorageHandler?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3591) set hive.security.authorization.enabled can be executed by any user

2013-08-21 Thread Larry McCay (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13746744#comment-13746744
 ] 

Larry McCay commented on HIVE-3591:
---

What is the current status/thinking on this issue? Is it something we should 
be addressing, and are there any thoughts on how it should be 
prevented/restricted, etc.?

 set hive.security.authorization.enabled can be executed by any user
 ---

 Key: HIVE-3591
 URL: https://issues.apache.org/jira/browse/HIVE-3591
 Project: Hive
  Issue Type: Bug
  Components: Authorization, CLI, Clients, JDBC
Affects Versions: 0.7.1
 Environment: RHEL 5.6
 CDH U3
Reporter: Dev Gupta
  Labels: Authorization, Security

 The property hive.security.authorization.enabled can be set to true or false, 
 by any user on the CLI, thus circumventing any previously set grants and 
 authorizations. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: [ANNOUNCE] New Hive Committer - Thejas Nair

2013-08-21 Thread Lefty Leverenz
Way to go, Thejas!

To celebrate, I'm converting your WebHCat manual to wikidocs.

-- Lefty


[jira] [Commented] (HIVE-5112) Upgrade protobuf to 2.5 from 2.4

2013-08-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13746801#comment-13746801
 ] 

Hive QA commented on HIVE-5112:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12599001/HIVE-5112.D12429.1.patch

{color:green}SUCCESS:{color} +1 2895 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/496/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/496/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

 Upgrade protobuf to 2.5 from 2.4
 

 Key: HIVE-5112
 URL: https://issues.apache.org/jira/browse/HIVE-5112
 Project: Hive
  Issue Type: Improvement
Reporter: Brock Noland
Assignee: Owen O'Malley
 Attachments: HIVE-5112.D12429.1.patch


 Hadoop and Hbase have both upgraded protobuf. We should as well.



Re: Last time request for cwiki update privileges

2013-08-21 Thread Mikhail Antonov
Can I also get the edit privilege for wiki please?

I'd like to add some details about LDAP authentication.

Mikhail


2013/8/21 Stephen Sprague sprag...@gmail.com

 Sanjay gets some love after all! :)


 On Tue, Aug 20, 2013 at 4:00 PM, Sanjay Subramanian 
 sanjay.subraman...@wizecommerce.com wrote:

 Thanks Ashutosh

 From: Ashutosh Chauhan hashut...@apache.orgmailto:hashut...@apache.org
 
 Reply-To: u...@hive.apache.orgmailto:u...@hive.apache.org 
 u...@hive.apache.orgmailto:u...@hive.apache.org

 Date: Tuesday, August 20, 2013 3:13 PM
 To: u...@hive.apache.orgmailto:u...@hive.apache.org 
 u...@hive.apache.orgmailto:u...@hive.apache.org
 Cc: dev@hive.apache.orgmailto:dev@hive.apache.org 
 dev@hive.apache.orgmailto:dev@hive.apache.org

 Subject: Re: Last time request for cwiki update privileges

 Hi Sanjay,

 Really sorry for that. I apologize for the delay. You are added now. Feel
 free to make changes to make Hive even better!

 Thanks,
 Ashutosh


 On Tue, Aug 20, 2013 at 2:39 PM, Sanjay Subramanian 
 sanjay.subraman...@wizecommerce.commailto:
 sanjay.subraman...@wizecommerce.com wrote:
 Hey guys

 I can only think of two reasons for my request is not yet accepted

 1. The admins don't want to give me access

 2. The admins have not seen my mail yet.

 This is the fourth and the LAST time I am requesting permission to edit
 wiki docs…Nobody likes being ignored and that includes me.

 Meanwhile to show my thankfulness to the Hive community I shall continue
 to answer questions .There will be no change in that behavior

 Regards

 sanjay




 From: Sanjay Subramanian sanjay.subraman...@wizecommerce.commailto:
 sanjay.subraman...@wizecommerce.com

 Date: Wednesday, August 14, 2013 3:52 PM
 To: u...@hive.apache.orgmailto:u...@hive.apache.org 
 u...@hive.apache.orgmailto:u...@hive.apache.org
 Cc: dev@hive.apache.orgmailto:dev@hive.apache.org 
 dev@hive.apache.orgmailto:dev@hive.apache.org

 Subject: Re: Review Request (wikidoc): LZO Compression in Hive

 Once again, I am down on my knees humbling calling upon the Hive Jedi
 Masters to please provide this paadwaan  with cwiki update privileges

 May the Force be with u

 Thanks

 sanjay

 From: Sanjay Subramanian sanjay.subraman...@wizecommerce.commailto:
 sanjay.subraman...@wizecommerce.com
 Reply-To: u...@hive.apache.orgmailto:u...@hive.apache.org 
 u...@hive.apache.orgmailto:u...@hive.apache.org

 Date: Wednesday, July 31, 2013 9:38 AM
 To: u...@hive.apache.orgmailto:u...@hive.apache.org 
 u...@hive.apache.orgmailto:u...@hive.apache.org
 Cc: dev@hive.apache.orgmailto:dev@hive.apache.org 
 dev@hive.apache.orgmailto:dev@hive.apache.org

 Subject: Re: Review Request (wikidoc): LZO Compression in Hive

 Hi guys

 Any chance I could get cwiki update privileges today ?

 Thanks

 sanjay

 From: Sanjay Subramanian sanjay.subraman...@wizecommerce.commailto:
 sanjay.subraman...@wizecommerce.com

 Date: Tuesday, July 30, 2013 4:26 PM
 To: u...@hive.apache.orgmailto:u...@hive.apache.org 
 u...@hive.apache.orgmailto:u...@hive.apache.org
 Cc: dev@hive.apache.orgmailto:dev@hive.apache.org 
 dev@hive.apache.orgmailto:dev@hive.apache.org
 Subject: Review Request (wikidoc): LZO Compression in Hive

 Hi

 Met with Lefty this afternoon and she was kind to spend time to add my
 documentation to the site - since I still don't have editing privileges :-)

 Please review the new wikidoc about LZO compression in the Hive language
 manual.  If anything is unclear or needs more information, you can email
 suggestions to this list or edit the wiki yourself (if you have editing
 privileges).  Here are the links:

   1.  Language Manual
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual (new
 bullet under File Formats)
   2.  LZO Compression
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LZO
   3.  CREATE TABLE
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTable
 (near end of section, pasted in here:)
 Use STORED AS TEXTFILE if the data needs to be stored as plain text
 files. Use STORED AS SEQUENCEFILE if the data needs to be compressed.
 Please read more about CompressedStorage
 https://cwiki.apache.org/confluence/display/Hive/CompressedStorage if
 you are planning to keep data compressed in your Hive tables. Use
 INPUTFORMAT and OUTPUTFORMAT to specify the name of a corresponding
 InputFormat and OutputFormat class as a string literal, e.g.,
 'org.apache.hadoop.hive.contrib.fileformat.base64.Base64TextInputFormat'.
 For LZO compression, the values to use are 'INPUTFORMAT
 com.hadoop.mapred.DeprecatedLzoTextInputFormat OUTPUTFORMAT
 org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' (see LZO
 Compression
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LZO).


 My cwiki id is
 https://cwiki.apache.org/confluence/display/~sanjaysubraman...@yahoo.com
 It will be great if I could get edit privileges

 Thanks
 sanjay


LIKE filter pushdown for tables and partitions

2013-08-21 Thread Sergey Shelukhin
Hi.

I think there are issues with the way Hive currently does LIKE-operator
JDO pushdown, and the code should be removed for partitions and tables.
Are there objections to removing LIKE from Filter.g and related areas?
If not, I will file a JIRA and do it.

Details:
There's code in the metastore that is capable of pushing down LIKE
expressions into JDO for string partition keys, as well as tables.
The code for tables doesn't appear to be used, and the partition code
definitely doesn't run in Hive proper because the metastore client
doesn't send LIKE expressions to the server. It may be used in e.g.
HCat and other places, but after asking some people here, I found out
it probably isn't.
I was trying to make it run and noticed some problems:
1) For partitions, Hive sends SQL patterns in a filter for LIKE, e.g.
"%foo%", whereas the metastore passes them into the matches() JDOQL
method, which expects a Java regex.
2) Converting the pattern to a Java regex via the UDFLike method, I found
that not all regexes appear to work in DataNucleus. ".*foo" seems to work,
but anything complex (such as escaping the pattern using Pattern.quote,
which UDFLike does) breaks and no longer matches properly.
3) I tried to implement the common cases using the JDO methods
startsWith/endsWith/indexOf (I will file a JIRA), but when I run tests
on Derby, they also appear to have problems with some strings. For
example, a partition with a backslash in the name cannot be matched by
LIKE "%\%" (a single backslash in the string) after being converted to
.indexOf(param) where param is "\" (escaping the backslash once again
doesn't work either, and anyway there's no documented reason why it
shouldn't work properly), while other characters match correctly, even
e.g. "%".

For tables, there's no SQL LIKE; the code expects a Java regex, but I am
not convinced all Java regexes are going to work.

So, for the sake of future correctness, I think it's better to remove this code.
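The mismatch in point 1 can be illustrated with a small sketch (Python used for brevity; the metastore code in question is Java, and `like_to_regex` is a hypothetical stand-in for a UDFLike-style conversion): passing the raw SQL pattern to a regex matcher fails, while the converted pattern matches as LIKE intends.

```python
import re

def like_to_regex(pattern: str) -> str:
    """Translate a SQL LIKE pattern into an equivalent regex.

    Only the standard wildcards are handled: % -> .* and _ -> . ;
    every other character is escaped literally. This is the kind of
    conversion that must happen before handing the pattern to a
    regex-based matcher such as JDOQL's matches()."""
    out = []
    for ch in pattern:
        if ch == "%":
            out.append(".*")
        elif ch == "_":
            out.append(".")
        else:
            out.append(re.escape(ch))
    return "".join(out)

# The raw SQL pattern does not work as a regex:
assert re.fullmatch("%foo%", "myfoodir") is None
# After conversion, it matches as LIKE intends:
assert re.fullmatch(like_to_regex("%foo%"), "myfoodir") is not None
```

Note this sketch covers only the happy path; as point 3 above describes, escaping behavior in the actual JDO/Derby stack is where the approach breaks down.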

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.


[jira] [Commented] (HIVE-5112) Upgrade protobuf to 2.5 from 2.4

2013-08-21 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13746868#comment-13746868
 ] 

Gunther Hagleitner commented on HIVE-5112:
--

I believe updating the dependency while leaving the generated code won't work. 
The lib isn't backwards compatible with respect to old generated code. I think 
the only real guarantee around compatibility is on the wire protocol.

With that: I think waiting until 2.1.0-beta and switching to that with proto 
2.5 at the same time is still the best option.




Re: Last time request for cwiki update privileges

2013-08-21 Thread Ashutosh Chauhan
Hey Mikhail,

Sure. Whats ur cwiki id?

Thanks,
Ashutosh


On Wed, Aug 21, 2013 at 1:58 PM, Mikhail Antonov olorinb...@gmail.comwrote:

 Can I also get the edit privilege for wiki please?

 I'd like to add some details about LDAP authentication..

 Mikhail



Re: Last time request for cwiki update privileges

2013-08-21 Thread Mikhail Antonov
mantonov


2013/8/21 Ashutosh Chauhan hashut...@apache.org

 Hey Mikhail,

 Sure. Whats ur cwiki id?

 Thanks,
 Ashutosh



Re: Last time request for cwiki update privileges

2013-08-21 Thread Ashutosh Chauhan
Not able to find this id in cwiki. Did you create an account on
cwiki.apache.org?

On Wed, Aug 21, 2013 at 2:59 PM, Mikhail Antonov olorinb...@gmail.comwrote:

 mantonov


[jira] [Commented] (HIVE-4588) Support session level hooks for HiveServer2

2013-08-21 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13746916#comment-13746916
 ] 

Mikhail Antonov commented on HIVE-4588:
---

Looks good! Any plans to backport on 0.11 anytime soon?

 Support session level hooks for HiveServer2
 ---

 Key: HIVE-4588
 URL: https://issues.apache.org/jira/browse/HIVE-4588
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 0.11.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Fix For: 0.12.0

 Attachments: HIVE-4588-1.patch, HIVE-4588.3.patch


 Support session level hooks for HiveServer2. The configured hooks will get 
 executed at the beginning of each new session.
 This is useful for auditing connections, possibly tuning session-level 
 properties, etc.



[jira] [Commented] (HIVE-4588) Support session level hooks for HiveServer2

2013-08-21 Thread Prasad Mujumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13746922#comment-13746922
 ] 

Prasad Mujumdar commented on HIVE-4588:
---

[~navis] The patch is updated per your last suggestion. Would you like to take 
another look. Thanks!




[jira] [Commented] (HIVE-3591) set hive.security.authorization.enabled can be executed by any user

2013-08-21 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13746980#comment-13746980
 ] 

Thiruvel Thirumoolan commented on HIVE-3591:


[~lmccay] The first approach to authorization was client side. [~sushanth] has 
also enabled this on the server side (HCatalog/Metastore) through HIVE-3705.

We enable these features on our HCatalog deployments. Even if the user unsets 
these properties, the server-side checks still take effect and the user can't 
drop tables, etc. We have tested this for HDFS-based authorization. The 
properties we use on the HCatalog server are:

<property>
  <name>hive.security.metastore.authorization.manager</name>
  <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
</property>

<property>
  <name>hive.security.metastore.authenticator.manager</name>
  <value>org.apache.hadoop.hive.ql.security.HadoopDefaultMetastoreAuthenticator</value>
</property>

<property>
  <name>hive.metastore.pre.event.listeners</name>
  <value>org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener</value>
</property>




Re: [Discuss] project chop up

2013-08-21 Thread Thiruvel Thirumoolan
+1 Thanks Edward.

On 8/20/13 11:35 PM, amareshwari sriramdasu amareshw...@gmail.com
wrote:

Sounds great! Looking forward !


On Tue, Aug 20, 2013 at 7:58 PM, Edward Capriolo
edlinuxg...@gmail.comwrote:

 Just an update. This is going very well:

 [INFO] Nothing to compile - all classes are up to date
 [INFO]
 
 [INFO] Reactor Summary:
 [INFO]
 [INFO] Apache Hive ... SUCCESS [0.002s]
 [INFO] hive-shims-x .. SUCCESS [1.210s]
 [INFO] hive-shims-20 . SUCCESS [0.125s]
 [INFO] hive-common ... SUCCESS [0.082s]
 [INFO] hive-serde  SUCCESS [2.521s]
 [INFO] hive-metastore  SUCCESS [10.818s]
 [INFO] hive-exec . SUCCESS [4.521s]
 [INFO] hive-avro . SUCCESS [1.582s]
 [INFO] hive-zookeeper  SUCCESS [0.519s]
 [INFO]
 
 [INFO] BUILD SUCCESS
 [INFO]
 
 [INFO] Total time: 21.613s
 [INFO] Finished at: Tue Aug 20 10:23:34 EDT 2013
 [INFO] Final Memory: 39M/408M


 Though I took some shortcuts and disabled some tests, we can build Hive
 very fast, including incremental builds. We are also using Maven plugins to
 compile the ANTLR, Thrift, Protobuf, and DataNucleus sources, and building
 those every time.


 On Fri, Aug 16, 2013 at 11:16 PM, Xuefu Zhang xzh...@cloudera.com
wrote:

  Thanks, Edward.
 
  I'm big +1 to mavenize Hive. Hive has long reached a point where it's
 hard
  to manage its build using ant. I'd like to help on this too.
 
  Thanks,
  Xuefu
 
 
  On Fri, Aug 16, 2013 at 7:31 PM, Edward Capriolo
edlinuxg...@gmail.com
  wrote:
 
   For those interested in pitching in.
   https://github.com/edwardcapriolo/hive
  
  
  
   On Fri, Aug 16, 2013 at 11:58 AM, Edward Capriolo 
 edlinuxg...@gmail.com
   wrote:
  
Summary from hive-irc channel. Minor edits for spell
check/grammar.
   
The last 10 lines are a summary of the key points.
   
[10:59:17] ecapriolo noland: et all. Do you want to talk about
hive
  in
maven?
[11:10:04] noland ecapriolo: yeah that sounds good to me!
[11:10:22] noland I saw you created the jira but haven't had
time
 to
   look
[11:10:32] ecapriolo So I found a few things
 [11:10:49] ecapriolo In common there are one or two tests that actually fork a process :)
[11:10:56] ecapriolo and use build.test.resources
[11:11:12] ecapriolo Some serde, uses some methods from ql in
 testing
[11:11:27] ecapriolo and shims really needs a separate hadoop
test
  shim
[11:11:32] ecapriolo But that is all simple stuff
[11:11:47] ecapriolo The biggest problem is I do not know how to
  solve
shims with maven
[11:11:50] ecapriolo do you have any ideas
[11:11:52] ecapriolo ?
[11:13:00] noland That one is going to be a challenge. It might
be
  that
in that section we have to drop down to ant
[11:14:44] noland Is it a requirement that we build both the .20
 and
   .23
shims for a package as we do today?
[11:16:46] ecapriolo I was thinking we can do it like a JDBC
driver
 [11:16:59] ecapriolo We separate out the interface of shims
[11:17:22] ecapriolo And then at runtime we drop in a driver
   implementing
[11:17:36] ecapriolo That or we could use maven's profile system
[11:18:09] ecapriolo It seems that everything else can actually
 link
against hadoop-0.20.2 as a provided dependency
[11:18:37] noland Yeah either would work. The driver method
would
probably require use to use ant build both the drivers?
[11:18:44] noland I am a fan of mvn profiles
[11:19:05] ecapriolo I was thinking we kinda separate the shim
out
  into
its own project,, not a module
 [11:19:10] ecapriolo to achieve that JDBC thing
[11:19:27] ecapriolo But I do not have a solution yet, I was
 looking
  to
farm that out to someone smart...like you :)
[11:19:33] noland :)
[11:19:47] ecapriolo All I know is that we need a test shim
because
HadoopShim requires hadoop-test jars
[11:20:10] ecapriolo then the Mini stuff is only used in qtest
 anyway
[11:20:48] ecapriolo Is this something you want to help with? I
was
thinking of spinning up a github
[11:20:50] noland I think that the separate projects would work
and
perhaps nicely.
[11:21:01] noland Yeah I'd be interested in helping!

[jira] [Commented] (HIVE-3591) set hive.security.authorization.enabled can be executed by any user

2013-08-21 Thread Larry McCay (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13746991#comment-13746991
 ] 

Larry McCay commented on HIVE-3591:
---

Okay, so this is already resolved - correct?








[jira] [Commented] (HIVE-5124) group by without map aggregation lead to mapreduce exception

2013-08-21 Thread Steven Wong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747001#comment-13747001
 ] 

Steven Wong commented on HIVE-5124:
---

Would you please provide a repro case?

 group by without map aggregation lead to mapreduce exception
 

 Key: HIVE-5124
 URL: https://issues.apache.org/jira/browse/HIVE-5124
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: cyril liao
Assignee: Bing Li

 In my environment, the same query produces different results depending on 
 whether hive.map.aggr is set to true or false.
 With hive.map.aggr=false, the tasktracker reports the following exception:
 java.lang.RuntimeException: Error in configuring object
   at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
   at 
 org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
   at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
   at 
 org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:485)
   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
 Caused by: java.lang.reflect.InvocationTargetException
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
   ... 9 more
 Caused by: java.lang.RuntimeException: Reduce operator initialization failed
   at 
 org.apache.hadoop.hive.ql.exec.ExecReducer.configure(ExecReducer.java:160)
   ... 14 more
 Caused by: java.lang.RuntimeException: cannot find field value from [0:_col0, 
 1:_col1, 2:_col2, 3:_col3]
   at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:366)
   at 
 org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:143)
   at 
 org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:82)
   at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.initializeOp(GroupByOperator.java:299)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:451)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:407)
   at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:62)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:451)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:407)
   at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.initializeOp(GroupByOperator.java:438)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
   at 
 org.apache.hadoop.hive.ql.exec.ExecReducer.configure(ExecReducer.java:153)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3591) set hive.security.authorization.enabled can be executed by any user

2013-08-21 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747010#comment-13747010
 ] 

Sushanth Sowmyan commented on HIVE-3591:


Good spot, Larry. That's one more thing to address about client-side 
authorization, and much more basic than the issue of any user being able to 
grant themselves permissions for anything. :D

[~ashutoshc] mentions that we have a notion of restrict-lists for HiveServer2, 
which rejects attempts by users to run set commands on restricted config 
parameters; it might be a good idea to extend that notion to the hive client as 
well.

It still leaves open the case where the end user is able to edit their 
hive-site.xml to simply set the parameter there, rather than in-script or on 
the command line, but that is protectable by admin policies for deployments, 
and might be a reasonable compromise.

That said, all of these still leave open the possibility of being able to 
edit/recompile the hive sources, leaving out these protections on the client 
side; thus, your metadata is not truly secure (data can be secured by hdfs 
perms) unless you're using metastore-side authorization.
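As a sketch of the restrict-list idea mentioned above, HiveServer2 has a 
hive.conf.restricted.list property; a deployment could list the authorization 
toggle there so that set commands on it are rejected (the placement shown is 
illustrative, not a committed recommendation):

```xml
<!-- hive-site.xml: reject "set" commands on these parameters in HiveServer2 -->
<property>
  <name>hive.conf.restricted.list</name>
  <value>hive.security.authorization.enabled,hive.security.authorization.manager</value>
</property>
```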

 set hive.security.authorization.enabled can be executed by any user
 ---

 Key: HIVE-3591
 URL: https://issues.apache.org/jira/browse/HIVE-3591
 Project: Hive
  Issue Type: Bug
  Components: Authorization, CLI, Clients, JDBC
Affects Versions: 0.7.1
 Environment: RHEL 5.6
 CDH U3
Reporter: Dev Gupta
  Labels: Authorization, Security

 The property hive.security.authorization.enabled can be set to true or false, 
 by any user on the CLI, thus circumventing any previously set grants and 
 authorizations. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4617) ExecuteStatementAsync call to run a query in non-blocking mode

2013-08-21 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747014#comment-13747014
 ] 

Phabricator commented on HIVE-4617:
---

thejas has commented on the revision HIVE-4617 [jira] ExecuteStatementAsync 
call to run a query in non-blocking mode.

INLINE COMMENTS
  service/if/TCLIService.thrift:603 I think it is better to make use of a 
thrift optional parameter here, instead of creating a new function. Creating a 
new thrift function every time we add parameters would be bad.
  service/if/TCLIService.thrift:623 This is also not needed; the contents are 
the same as TExecuteStatementResp.
  service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java:68 A 
SessionState does not have to be stored here.
  It should just use SessionState.get() and use that to set the SessionState in 
the new thread.
  service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java:122 
add a SessionState ss = SessionState.get() and set it in the new thread.
  service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java:183 
this call would go away when the SessionState is not stored in SQLOperation.
  service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java:59 
since the state object in the operation is going to be accessed by multiple 
threads, I think it is safer to make the field volatile.
  service/src/test/org/apache/hive/service/cli/CLIServiceTest.java:140 can you 
add a max wait for this? If something goes wrong, the test will run forever.
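The optional-parameter suggestion above could look roughly like this in 
TCLIService.thrift (the field name, id, and struct shape shown here are a 
sketch, not the committed design):

```thrift
// Extend the existing request instead of adding a new ExecuteStatementAsync
// function; old clients simply omit the new optional field.
struct TExecuteStatementReq {
  1: required TSessionHandle sessionHandle
  2: required string statement
  3: optional map<string,string> confOverlay
  // Hypothetical new field: ask the server to run the query asynchronously.
  4: optional bool runAsync = false
}
```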

REVISION DETAIL
  https://reviews.facebook.net/D12417

To: JIRA, vaibhavgumashta
Cc: thejas


 ExecuteStatementAsync call to run a query in non-blocking mode
 --

 Key: HIVE-4617
 URL: https://issues.apache.org/jira/browse/HIVE-4617
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 0.11.0
Reporter: Jaideep Dhok
Assignee: Vaibhav Gumashta
 Attachments: HIVE-4617.D12417.1.patch


 Provide a way to run queries asynchronously. The current executeStatement 
 call blocks until the query run is complete.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-5134) add tests to partition filter JDO pushdown for like and make sure it works, or remove it

2013-08-21 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-5134:
--

 Summary: add tests to partition filter JDO pushdown for like and 
make sure it works, or remove it
 Key: HIVE-5134
 URL: https://issues.apache.org/jira/browse/HIVE-5134
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin


There's a mailing list thread. Partition filtering w/JDO pushdown using LIKE is 
not used by Hive due to a client-side check (in PartitionPruner); after enabling 
it, it seems to be broken. We need to fix and enable it, or remove it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3591) set hive.security.authorization.enabled can be executed by any user

2013-08-21 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747019#comment-13747019
 ] 

Sushanth Sowmyan commented on HIVE-3591:


[~lmccay]: I wouldn't say resolved, per se - the issue you bring up is a valid 
one, but one that does not fit the original hive security design (intended to 
prevent people from accidentally doing something dangerous, as opposed to 
preventing malicious users). For the security-conscious, there is currently a 
workaround (metastore-side security) for the intermediate case where stronger 
security is needed.

I think this is an important data point, though, for us to consider when trying 
to nail down hive security, and there is some intermediate work possible for 
this in the short run as well (the above restricted conf idea).

 set hive.security.authorization.enabled can be executed by any user
 ---

 Key: HIVE-3591
 URL: https://issues.apache.org/jira/browse/HIVE-3591
 Project: Hive
  Issue Type: Bug
  Components: Authorization, CLI, Clients, JDBC
Affects Versions: 0.7.1
 Environment: RHEL 5.6
 CDH U3
Reporter: Dev Gupta
  Labels: Authorization, Security

 The property hive.security.authorization.enabled can be set to true or false, 
 by any user on the CLI, thus circumventing any previously set grants and 
 authorizations. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4617) ExecuteStatementAsync call to run a query in non-blocking mode

2013-08-21 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747020#comment-13747020
 ] 

Phabricator commented on HIVE-4617:
---

thejas has commented on the revision HIVE-4617 [jira] ExecuteStatementAsync 
call to run a query in non-blocking mode.

INLINE COMMENTS
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:737 can you also 
add these, with descriptions, to conf/hive-default.xml.template?
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:738 As you 
mentioned offline, since the term task is used in other contexts in hive, it 
might be better to rename this. Maybe rename hive.server2.thrift.async.task* 
to hive.server2.thrift.asyncexec.*?

REVISION DETAIL
  https://reviews.facebook.net/D12417

To: JIRA, vaibhavgumashta
Cc: thejas


 ExecuteStatementAsync call to run a query in non-blocking mode
 --

 Key: HIVE-4617
 URL: https://issues.apache.org/jira/browse/HIVE-4617
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 0.11.0
Reporter: Jaideep Dhok
Assignee: Vaibhav Gumashta
 Attachments: HIVE-4617.D12417.1.patch


 Provide a way to run queries asynchronously. The current executeStatement 
 call blocks until the query run is complete.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4925) Modify Hive build to enable compiling and running Hive with JDK7

2013-08-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747046#comment-13747046
 ] 

Hive QA commented on HIVE-4925:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12598983/HIVE-4925.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 2895 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_script_broken_pipe1
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/497/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/497/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests failed with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

 Modify Hive build to enable compiling and running Hive with JDK7
 

 Key: HIVE-4925
 URL: https://issues.apache.org/jira/browse/HIVE-4925
 Project: Hive
  Issue Type: Sub-task
  Components: Build Infrastructure
Affects Versions: 0.11.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Fix For: 0.12.0

 Attachments: HIVE-4925.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3591) set hive.security.authorization.enabled can be executed by any user

2013-08-21 Thread Larry McCay (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747078#comment-13747078
 ] 

Larry McCay commented on HIVE-3591:
---

I was looking at the restrictList earlier for this. I'll look into it further. 
Thanks for the insight!

 set hive.security.authorization.enabled can be executed by any user
 ---

 Key: HIVE-3591
 URL: https://issues.apache.org/jira/browse/HIVE-3591
 Project: Hive
  Issue Type: Bug
  Components: Authorization, CLI, Clients, JDBC
Affects Versions: 0.7.1
 Environment: RHEL 5.6
 CDH U3
Reporter: Dev Gupta
  Labels: Authorization, Security

 The property hive.security.authorization.enabled can be set to true or false, 
 by any user on the CLI, thus circumventing any previously set grants and 
 authorizations. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5124) group by without map aggregation lead to mapreduce exception

2013-08-21 Thread cyril liao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747102#comment-13747102
 ] 

cyril liao commented on HIVE-5124:
--

The sql is:

SELECT channeled,
       max(VV)  AS vv,
       max(FUV) AS fuv,
       max(PV)  AS pv,
       max(UV)  AS uv
FROM (
  SELECT channeled,
         sum(CASE WHEN TYPE = 1 THEN a ELSE cast(0 AS bigint) END) AS VV,
         sum(CASE WHEN TYPE = 1 THEN b ELSE cast(0 AS bigint) END) AS FUV,
         sum(CASE WHEN TYPE = 2 THEN a ELSE cast(0 AS bigint) END) AS PV,
         sum(CASE WHEN TYPE = 2 THEN b ELSE cast(0 AS bigint) END) AS UV
  FROM (
    SELECT count(uid) AS a, count(DISTINCT uid) AS b, TYPE, channeled
    FROM (
      SELECT uid, channeled, TYPE
      FROM (
        SELECT uid,
               parse_url(url, 'QUERY', 'channeled') AS channeled,
               1 AS TYPE
        FROM t_html5_vv
        WHERE p_day = ${idate}
        UNION ALL
        SELECT uid,
               parse_url(url, 'QUERY', 'channeled') AS channeled,
               2 AS TYPE
        FROM t_html5_pv
        WHERE p_day = ${idate}
      ) tmp
      WHERE channeled IS NOT NULL AND channeled <> ''
    ) tmp2
    GROUP BY channeled, TYPE
  ) tmp3
  GROUP BY channeled
) tmp4
GROUP BY channeled

I want to get uv and fuv from different tables, t_html5_vv and t_html5_pv, and 
combine the results in one row. The default hive.map.aggr setting in 
hive-site.xml is true, and the sql runs perfectly. But the exception is thrown 
when I set hive.map.aggr=false.
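The behavior difference described above can be reduced to toggling one setting 
around the same statement (a sketch only; the inner query is abbreviated and 
may not reproduce the failure exactly):

```sql
-- With map-side partial aggregation (the default), the query succeeds:
set hive.map.aggr=true;
SELECT channeled, count(DISTINCT uid) FROM tmp2 GROUP BY channeled;

-- With aggregation done only in the reducer, the reported
-- "cannot find field value" initialization error appears:
set hive.map.aggr=false;
SELECT channeled, count(DISTINCT uid) FROM tmp2 GROUP BY channeled;
```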

 group by without map aggregation lead to mapreduce exception
 

 Key: HIVE-5124
 URL: https://issues.apache.org/jira/browse/HIVE-5124
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: cyril liao
Assignee: Bing Li

 In my environment, the same query produces different results depending on 
 whether hive.map.aggr is set to true or false.
 With hive.map.aggr=false, the tasktracker reports the following exception:
 java.lang.RuntimeException: Error in configuring object
   at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
   at 
 org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
   at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
   at 
 org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:485)
   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
 Caused by: java.lang.reflect.InvocationTargetException
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
   ... 9 more
 Caused by: java.lang.RuntimeException: Reduce operator initialization failed
   at 
 org.apache.hadoop.hive.ql.exec.ExecReducer.configure(ExecReducer.java:160)
   ... 14 more
 Caused by: java.lang.RuntimeException: cannot find field value from [0:_col0, 
 1:_col1, 2:_col2, 3:_col3]
   at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:366)
   at 
 org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:143)
   at 
 org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:82)
   at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.initializeOp(GroupByOperator.java:299)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:451)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:407)
   at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:62)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
   at 

[jira] [Created] (HIVE-5135) HCatalog test TestE2EScenarios fails with hadoop 2.x

2013-08-21 Thread Sushanth Sowmyan (JIRA)
Sushanth Sowmyan created HIVE-5135:
--

 Summary: HCatalog test TestE2EScenarios fails with hadoop 2.x
 Key: HIVE-5135
 URL: https://issues.apache.org/jira/browse/HIVE-5135
 Project: Hive
  Issue Type: Bug
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan


HIVE-4388 makes the first couple of changes needed to fix unit tests with hadoop 
2.x, and also modifies TestE2EScenarios to bring it up to date to use Shims, 
but TestE2EScenarios still fails because instantiating TaskAttemptID with no 
arguments fails under hadoop 2.x.

I'm attaching a patch here which sits on top of HIVE-4388 to fix the test under 
hadoop 2.x, but is a WIP. After HIVE-4388 gets committed, I will revisit this 
to check if we need to shim out TaskAttemptID or not, and test across versions.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5135) HCatalog test TestE2EScenarios fails with hadoop 2.x

2013-08-21 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-5135:
---

Attachment: e2e.wip.patch

 HCatalog test TestE2EScenarios fails with hadoop 2.x
 

 Key: HIVE-5135
 URL: https://issues.apache.org/jira/browse/HIVE-5135
 Project: Hive
  Issue Type: Bug
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: e2e.wip.patch


 HIVE-4388 makes the first couple of changes needed to fix unit tests with 
 hadoop 2.x, and also modifies TestE2EScenarios to bring it up to date to use 
 Shims, but TestE2EScenarios still fails because instantiating TaskAttemptID 
 with no arguments fails under hadoop 2.x.
 I'm attaching a patch here which sits on top of HIVE-4388 to fix the test 
 under hadoop 2.x, but is a WIP. After HIVE-4388 gets committed, I will 
 revisit this to check if we need to shim out TaskAttemptID or not, and test 
 across versions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5104) HCatStorer fails to store boolean type

2013-08-21 Thread Karl D. Gierach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl D. Gierach updated HIVE-5104:
--

Release Note: Saving records that contain Boolean types is now supported 
via Pig's HCatStorer.

 HCatStorer fails to store boolean type
 --

 Key: HIVE-5104
 URL: https://issues.apache.org/jira/browse/HIVE-5104
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.11.0
Reporter: Ron Frohock
 Attachments: HIVE-5104.patch


 Unable to store boolean values to HCat table 
 Assume in Hive you have two tables...
 CREATE TABLE btest(test BOOLEAN);
 CREATE TABLE btest2(test BOOLEAN);
 Then in Pig 
 A = LOAD 'btest' USING org.apache.hcatalog.pig.HCatLoader();
 STORE A INTO 'btest2' USING org.apache.hcatalog.pig.HCatStorer();
 You will get an ERROR 115: Unsupported type 5: in Pig's Schema  
 Checking HCatBaseStorer.java, the case for data types doesn't check for 
 booleans.  Might have been overlooked in adding boolean to Pig in 0.10

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5104) HCatStorer fails to store boolean type

2013-08-21 Thread Karl D. Gierach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl D. Gierach updated HIVE-5104:
--

Attachment: HIVE-5104.1.patch.txt

re-generated patch with correct git option --no-prefix.

 HCatStorer fails to store boolean type
 --

 Key: HIVE-5104
 URL: https://issues.apache.org/jira/browse/HIVE-5104
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.11.0
Reporter: Ron Frohock
 Attachments: HIVE-5104.1.patch.txt, HIVE-5104.patch


 Unable to store boolean values to HCat table 
 Assume in Hive you have two tables...
 CREATE TABLE btest(test BOOLEAN);
 CREATE TABLE btest2(test BOOLEAN);
 Then in Pig 
 A = LOAD 'btest' USING org.apache.hcatalog.pig.HCatLoader();
 STORE A INTO 'btest2' USING org.apache.hcatalog.pig.HCatStorer();
 You will get an ERROR 115: Unsupported type 5: in Pig's Schema  
 Checking HCatBaseStorer.java, the case for data types doesn't check for 
 booleans.  Might have been overlooked in adding boolean to Pig in 0.10

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5104) HCatStorer fails to store boolean type

2013-08-21 Thread Karl D. Gierach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl D. Gierach updated HIVE-5104:
--

Attachment: (was: HIVE-5104.patch)

 HCatStorer fails to store boolean type
 --

 Key: HIVE-5104
 URL: https://issues.apache.org/jira/browse/HIVE-5104
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.11.0
Reporter: Ron Frohock
 Attachments: HIVE-5104.1.patch.txt


 Unable to store boolean values to HCat table 
 Assume in Hive you have two tables...
 CREATE TABLE btest(test BOOLEAN);
 CREATE TABLE btest2(test BOOLEAN);
 Then in Pig 
 A = LOAD 'btest' USING org.apache.hcatalog.pig.HCatLoader();
 STORE A INTO 'btest2' USING org.apache.hcatalog.pig.HCatStorer();
 You will get an ERROR 115: Unsupported type 5: in Pig's Schema  
 Checking HCatBaseStorer.java, the case for data types doesn't check for 
 booleans.  Might have been overlooked in adding boolean to Pig in 0.10

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4961) Create bridge for custom UDFs to operate in vectorized mode

2013-08-21 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-4961:
--

Attachment: vectorUDF.4.patch

Attaching a mostly working version of the change for safekeeping.

 Create bridge for custom UDFs to operate in vectorized mode
 ---

 Key: HIVE-4961
 URL: https://issues.apache.org/jira/browse/HIVE-4961
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Eric Hanson
 Attachments: vectorUDF.4.patch


 Suppose you have a custom UDF myUDF() that you've created to extend hive. The 
 goal of this JIRA is to create a facility where if you run a query that uses 
 myUDF() in an expression, the query will run in vectorized mode.
 This would be a general-purpose bridge for custom UDFs that users add to 
 Hive. It would work with existing UDFs.
 I'm considering a separate JIRA for a new kind of custom UDF implementation 
 that is vectorized from the beginning, to optimize performance. That is not 
 covered by this JIRA.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5128) Direct SQL for view is failing

2013-08-21 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747145#comment-13747145
 ] 

Sergey Shelukhin commented on HIVE-5128:


let me check today/tomorrow

 Direct SQL for view is failing 
 ---

 Key: HIVE-5128
 URL: https://issues.apache.org/jira/browse/HIVE-5128
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Priority: Trivial

 I cannot be sure of this, but it happens when dropping views (it falls back 
 to JPA and works fine):
 {noformat}
 metastore.ObjectStore: Direct SQL failed, falling back to ORM
 MetaException(message:Unexpected null for one of the IDs, SD null, column 
 null, serde null)
   at 
 org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:195)
   at 
 org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilter(MetaStoreDirectSql.java:98)
   at 
 org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:1758)
 ...
 {noformat}
 Should it be disabled for views, or can it be fixed?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5134) add tests to partition filter JDO pushdown for like and make sure it works, or remove it

2013-08-21 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-5134:
---

Attachment: HIVE-5134-does-not-work.patch

Here's an example test (with some implementation), on top of HIVE-4914, that is 
not going to work.

If I remove the custom implementation and switch to 
UDFLike::likePatternToRegExp (moved into common to access from metastore), even 
fewer patterns from this test work.

We should be able to add this test and make it work in some way, or remove the 
code.

 add tests to partition filter JDO pushdown for like and make sure it works, 
 or remove it
 

 Key: HIVE-5134
 URL: https://issues.apache.org/jira/browse/HIVE-5134
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-5134-does-not-work.patch


 There's a mailing list thread. Partition filtering w/JDO pushdown using LIKE 
 is not used by Hive due to a client-side check (in PartitionPruner); after 
 enabling it, it seems to be broken. We need to fix and enable it, or remove it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5091) ORC files should have an option to pad stripes to the HDFS block boundaries

2013-08-21 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747154#comment-13747154
 ] 

Phabricator commented on HIVE-5091:
---

hagleitn has commented on the revision HIVE-5091 [jira] ORC files should have 
an option to pad stripes to the HDFS block boundaries.

  LGTM. I like the new WriterOptions. Nice and clean. +1

REVISION DETAIL
  https://reviews.facebook.net/D12249

To: JIRA, omalley
Cc: hagleitn


 ORC files should have an option to pad stripes to the HDFS block boundaries
 ---

 Key: HIVE-5091
 URL: https://issues.apache.org/jira/browse/HIVE-5091
 Project: Hive
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: HIVE-5091.D12249.1.patch


 With ORC stripes being large, if a stripe straddles an HDFS block, the 
 locality of read is suboptimal. It would be good to add padding to ensure 
 that stripes don't straddle HDFS blocks.
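The padding decision described above can be sketched as follows (illustrative 
only; this is not the actual ORC writer logic or API, and the method and class 
names are hypothetical):

```java
// Illustrative sketch: compute how many filler bytes to write so the next
// stripe does not straddle an HDFS block boundary.
public class StripePadding {
    // Returns the number of padding bytes to insert before the next stripe.
    static long paddingFor(long offset, long stripeSize, long blockSize) {
        long end = offset + stripeSize;
        // Stripe fits entirely inside the current block: no padding needed.
        if (offset / blockSize == (end - 1) / blockSize) {
            return 0;
        }
        // Otherwise pad to the next block boundary so the stripe starts there.
        long remainder = offset % blockSize;
        return remainder == 0 ? 0 : blockSize - remainder;
    }
}
```

For example, a 100-byte stripe starting at offset 200 in a 256-byte block would 
cross the boundary, so 56 bytes of padding move it to the next block.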

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5129) Multiple table insert fails on count(distinct)

2013-08-21 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-5129:
-

Attachment: HIVE-5129.2.WIP.patch.txt

[~navis] I think this patch addresses what you intended. Could you please take 
a look? I ran some tests on it and it looks ok. I will include your query above 
in the test and upload once I have your feedback.

Thanks
Vikram.

 Multiple table insert fails on count(distinct)
 --

 Key: HIVE-5129
 URL: https://issues.apache.org/jira/browse/HIVE-5129
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: aggrTestMultiInsertData1.txt, 
 aggrTestMultiInsertData.txt, aggrTestMultiInsert.q, HIVE-5129.1.patch.txt, 
 HIVE-5129.2.WIP.patch.txt


 Hive fails with a class cast exception on queries of the form:
 {noformat}
 from studenttab10k
 insert overwrite table multi_insert_2_1
 select name, avg(age) as avgage
 group by name
 insert overwrite table multi_insert_2_2
 select name, age, sum(gpa) as sumgpa
 group by name, age
 insert overwrite table multi_insert_2_3
 select name, count(distinct age) as distage
 group by name;
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4914) filtering via partition name should be done inside metastore server (implementation)

2013-08-21 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-4914:
---

Attachment: HIVE-4914.patch

I have moved LIKE pushdown handling into a separate JIRA because it takes an 
inordinate amount of effort and delays this needlessly.
The virtual columns issue was also fixed in a separate JIRA (it caused ppd_vc to 
fail in the last run of this one).
Now it should work.

Note that a new client cannot talk to an old metastore server because the API is 
not available... I am not certain whether backward compatibility of that sort is 
needed; I can add it by re-adding some old code in deprecated form.

RB also updated. Most of the patch is still generated code.

 filtering via partition name should be done inside metastore server 
 (implementation)
 

 Key: HIVE-4914
 URL: https://issues.apache.org/jira/browse/HIVE-4914
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-4914-only-no-gen.patch, HIVE-4914-only.patch, 
 HIVE-4914.patch, HIVE-4914.patch, HIVE-4914.patch


 Currently, if the filter pushdown is impossible (which is most cases), the 
 client gets all partition names from metastore, filters them, and asks for 
 partitions by names for the filtered set.
 Metastore server code should do that instead; it should check if pushdown is 
 possible and do it if so; otherwise it should do name-based filtering.
 Saves the roundtrip with all partition names from the server to client, and 
 also removes the need to have pushdown viability checking on both sides.
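The move described above can be sketched as follows. This is a hypothetical illustration (the method and class names are assumptions, not the actual metastore API): the server filters partition names itself when expression pushdown is impossible, instead of shipping the full name list to the client.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical sketch of server-side, name-based partition filtering.
// Not Hive metastore code: illustrates why doing this on the server saves
// the round trip that ships every partition name to the client.
public final class NameFilterSketch {

    // Runs on the server: only matching names (and then their partitions)
    // ever cross the wire back to the client.
    static List<String> filterPartitionNames(List<String> allNames,
                                             Predicate<String> filter) {
        return allNames.stream().filter(filter).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names =
                List.of("ds=2013-08-20", "ds=2013-08-21", "ds=2013-08-22");
        // The client would previously fetch all three names and filter locally;
        // here the server returns only the match.
        System.out.println(filterPartitionNames(names, n -> n.endsWith("21")));
    }
}
```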



Re: Review Request 13697: HIVE-5129: Multiple table insert fails on count(distinct)

2013-08-21 Thread Vikram Dixit Kumaraswamy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13697/
---

(Updated Aug. 22, 2013, 2:16 a.m.)


Review request for hive and Navis Ryu.


Changes
---

Made changes according to Navis' suggestion.


Bugs: HIVE-5129
https://issues.apache.org/jira/browse/HIVE-5129


Repository: hive-git


Description
---

Hive fails with a class cast exception on a multiple table insert with 
count(distinct).


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java a39fd21 

Diff: https://reviews.apache.org/r/13697/diff/


Testing
---

Runs the test from HIVE-4692 successfully.


Thanks,

Vikram Dixit Kumaraswamy



[jira] [Updated] (HIVE-5129) Multiple table insert fails on count(distinct)

2013-08-21 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-5129:
-

Status: Patch Available  (was: Open)

 Multiple table insert fails on count(distinct)
 --

 Key: HIVE-5129
 URL: https://issues.apache.org/jira/browse/HIVE-5129
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: aggrTestMultiInsertData1.txt, 
 aggrTestMultiInsertData.txt, aggrTestMultiInsert.q, HIVE-5129.1.patch.txt, 
 HIVE-5129.2.WIP.patch.txt


 Hive fails with a class cast exception on queries of the form:
 {noformat}
 from studenttab10k
 insert overwrite table multi_insert_2_1
 select name, avg(age) as avgage
 group by name
 insert overwrite table multi_insert_2_2
 select name, age, sum(gpa) as sumgpa
 group by name, age
 insert overwrite table multi_insert_2_3
 select name, count(distinct age) as distage
 group by name;
 {noformat}



[jira] [Updated] (HIVE-5129) Multiple table insert fails on count(distinct)

2013-08-21 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-5129:
-

Status: Open  (was: Patch Available)

 Multiple table insert fails on count(distinct)
 --

 Key: HIVE-5129
 URL: https://issues.apache.org/jira/browse/HIVE-5129
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: aggrTestMultiInsertData1.txt, 
 aggrTestMultiInsertData.txt, aggrTestMultiInsert.q, HIVE-5129.1.patch.txt, 
 HIVE-5129.2.WIP.patch.txt


 Hive fails with a class cast exception on queries of the form:
 {noformat}
 from studenttab10k
 insert overwrite table multi_insert_2_1
 select name, avg(age) as avgage
 group by name
 insert overwrite table multi_insert_2_2
 select name, age, sum(gpa) as sumgpa
 group by name, age
 insert overwrite table multi_insert_2_3
 select name, count(distinct age) as distage
 group by name;
 {noformat}



[jira] [Commented] (HIVE-5091) ORC files should have an option to pad stripes to the HDFS block boundaries

2013-08-21 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747165#comment-13747165
 ] 

Gunther Hagleitner commented on HIVE-5091:
--

Looked at the failing tests. The problem is the *NOT*USED* value, which gets 
passed as the desired version and leads to an exception. [~owen.omalley]: I 
think you want to change that to null. Other than that, looks good.

 ORC files should have an option to pad stripes to the HDFS block boundaries
 ---

 Key: HIVE-5091
 URL: https://issues.apache.org/jira/browse/HIVE-5091
 Project: Hive
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: HIVE-5091.D12249.1.patch


 With ORC stripes being large, if a stripe straddles an HDFS block, the 
 locality of read is suboptimal. It would be good to add padding to ensure 
 that stripes don't straddle HDFS blocks.
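The padding idea can be sketched as follows. This is a minimal illustration (the helper name is hypothetical, not Hive's ORC writer code): before starting a new stripe, check whether it would cross an HDFS block boundary, and if so pad to the boundary so the whole stripe is served by one block.

```java
// Hypothetical sketch of stripe padding for HDFS block alignment.
// Not the actual ORC writer: shows only the boundary arithmetic.
public final class StripePadding {

    // Bytes of padding needed so a stripe of stripeSize bytes starting at
    // currentPos does not straddle an HDFS block boundary.
    static long paddingFor(long currentPos, long stripeSize, long blockSize) {
        long blockRemaining = blockSize - (currentPos % blockSize);
        // Stripe fits in the rest of the current block: no padding needed.
        if (stripeSize <= blockRemaining) {
            return 0;
        }
        // Otherwise pad to the next block boundary and start the stripe there.
        return blockRemaining;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // 256 MB block, 200 MB stripe starting at offset 100 MB: it would
        // straddle the boundary, so pad the remaining 156 MB of the block.
        System.out.println(paddingFor(100 * mb, 200 * mb, 256 * mb) / mb);
        // Starting at a block boundary, the stripe fits: no padding.
        System.out.println(paddingFor(0, 200 * mb, 256 * mb));
    }
}
```

The trade-off is wasted space inside the file in exchange for read locality, which is why the issue proposes it as an option rather than default-only behavior.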



[jira] [Commented] (HIVE-5128) Direct SQL for view is failing

2013-08-21 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747169#comment-13747169
 ] 

Sergey Shelukhin commented on HIVE-5128:


It may be bad code in the sense that we expect these things to be set, and they 
are correctly not set for views. Maybe it should handle that.

 Direct SQL for view is failing 
 ---

 Key: HIVE-5128
 URL: https://issues.apache.org/jira/browse/HIVE-5128
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Priority: Trivial

 I cannot be sure of this, but it happens when dropping views (it rolls back to 
 JPA and works fine):
 {noformat}
 metastore.ObjectStore: Direct SQL failed, falling back to ORM
 MetaException(message:Unexpected null for one of the IDs, SD null, column 
 null, serde null)
   at 
 org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:195)
   at 
 org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilter(MetaStoreDirectSql.java:98)
   at 
 org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:1758)
 ...
 {noformat}
 Should it be disabled for views or can be fixed?



[jira] [Commented] (HIVE-4214) OVER accepts general expression instead of just function

2013-08-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747174#comment-13747174
 ] 

Hudson commented on HIVE-4214:
--

FAILURE: Integrated in Hive-trunk-hadoop2-ptest #67 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/67/])
HIVE-4214 : OVER accepts general expression instead of just function (Ashutosh 
Chauhan Reviewed by Harish Butani) (hashutosh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1516180)
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SelectClauseParser.g
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
* 
/hive/trunk/ql/src/test/queries/clientnegative/ptf_negative_HavingLeadWithNoGBYNoWindowing.q
* 
/hive/trunk/ql/src/test/queries/clientnegative/ptf_negative_HavingLeadWithPTF.q
* 
/hive/trunk/ql/src/test/queries/clientnegative/ptf_negative_InvalidValueBoundary.q
* 
/hive/trunk/ql/src/test/queries/clientnegative/ptf_negative_WhereWithRankCond.q
* /hive/trunk/ql/src/test/queries/clientnegative/windowing_leadlag_in_udaf.q
* /hive/trunk/ql/src/test/queries/clientnegative/windowing_ll_no_over.q
* /hive/trunk/ql/src/test/queries/clientpositive/ctas_colname.q
* /hive/trunk/ql/src/test/queries/clientpositive/ptf.q
* /hive/trunk/ql/src/test/queries/clientpositive/windowing.q
* /hive/trunk/ql/src/test/queries/clientpositive/windowing_expressions.q
* /hive/trunk/ql/src/test/queries/clientpositive/windowing_windowspec.q
* 
/hive/trunk/ql/src/test/results/clientnegative/ptf_negative_HavingLeadWithNoGBYNoWindowing.q.out
* 
/hive/trunk/ql/src/test/results/clientnegative/ptf_negative_WhereWithRankCond.q.out
* /hive/trunk/ql/src/test/results/clientnegative/windowing_leadlag_in_udaf.q.out
* /hive/trunk/ql/src/test/results/clientnegative/windowing_ll_no_over.q.out
* /hive/trunk/ql/src/test/results/clientpositive/correlationoptimizer12.q.out
* /hive/trunk/ql/src/test/results/clientpositive/ctas_colname.q.out
* /hive/trunk/ql/src/test/results/clientpositive/ptf.q.out
* /hive/trunk/ql/src/test/results/clientpositive/windowing.q.out
* /hive/trunk/ql/src/test/results/clientpositive/windowing_expressions.q.out
* /hive/trunk/ql/src/test/results/clientpositive/windowing_windowspec.q.out


 OVER accepts general expression instead of just function
 

 Key: HIVE-4214
 URL: https://issues.apache.org/jira/browse/HIVE-4214
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Affects Versions: 0.11.0
Reporter: Alan Gates
Assignee: Ashutosh Chauhan
 Fix For: 0.12.0

 Attachments: HIVE-4214.1.patch, HIVE-4214.3.patch, HIVE-4214.patch


 The query:
 select s, i, avg(d) / 10.0 over (partition by s order by i) from over100k;
 runs (and produces meaningless output).
 Over should not allow the arithmetic expression.  Only a UDAF or PTF function 
 should be valid there.  The correct way to write this query should be 
 select s, i, avg(d) over (partition by s order by i) / 10.0 from over100k;



[jira] [Commented] (HIVE-5129) Multiple table insert fails on count(distinct)

2013-08-21 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747173#comment-13747173
 ] 

Navis commented on HIVE-5129:
-

It seems the current distinct implementation has a flaw.

{noformat}
select key, count(distinct key) + count(distinct value) from src tablesample 
(10 ROWS) group by key;

100 1
val_100 1
165 1
val_165 1
238 1
val_238 1
255 1
val_255 1
27  1
val_27  1
278 1
val_278 1
311 1
val_311 1
409 1
val_409 1
86  1
val_86  1
98  1
val_98  1
{noformat}

This does not make sense.

 Multiple table insert fails on count(distinct)
 --

 Key: HIVE-5129
 URL: https://issues.apache.org/jira/browse/HIVE-5129
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: aggrTestMultiInsertData1.txt, 
 aggrTestMultiInsertData.txt, aggrTestMultiInsert.q, HIVE-5129.1.patch.txt, 
 HIVE-5129.2.WIP.patch.txt


 Hive fails with a class cast exception on queries of the form:
 {noformat}
 from studenttab10k
 insert overwrite table multi_insert_2_1
 select name, avg(age) as avgage
 group by name
 insert overwrite table multi_insert_2_2
 select name, age, sum(gpa) as sumgpa
 group by name, age
 insert overwrite table multi_insert_2_3
 select name, count(distinct age) as distage
 group by name;
 {noformat}



[jira] [Created] (HIVE-5136) HCatalog HBase Storage handler fails test with protobuf 2.5

2013-08-21 Thread Sushanth Sowmyan (JIRA)
Sushanth Sowmyan created HIVE-5136:
--

 Summary: HCatalog HBase Storage handler fails test with protobuf 2.5
 Key: HIVE-5136
 URL: https://issues.apache.org/jira/browse/HIVE-5136
 Project: Hive
  Issue Type: Bug
Reporter: Sushanth Sowmyan


With HIVE-5112 updating protobuf to 2.5, RevisionManagerEndpointProtos.java in 
HCat needs to be updated and recompiled with protobuf 2.5.



[jira] [Updated] (HIVE-5136) HCatalog HBase Storage handler fails test with protobuf 2.5

2013-08-21 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-5136:
---

Description: With HIVE-5112 updating protobuf to 2.5, 
RevisionManagerEndpointProtos.java brought in by HIVE-4388 in HCat needs to be 
updated and recompiled with protobuf 2.5  (was: With HIVE-5112 updating protobuf 
to 2.5, RevisionManagerEndpointProtos.java in HCat needs to be updated and 
recompiled with protobuf 2.5)

 HCatalog HBase Storage handler fails test with protobuf 2.5
 -

 Key: HIVE-5136
 URL: https://issues.apache.org/jira/browse/HIVE-5136
 Project: Hive
  Issue Type: Bug
Reporter: Sushanth Sowmyan

 With HIVE-5112 updating protobuf to 2.5, RevisionManagerEndpointProtos.java 
 brought in by HIVE-4388 in HCat needs to be updated and recompiled with 
 protobuf 2.5.



[jira] [Updated] (HIVE-5136) HCatalog HBase Storage handler fails test with protobuf 2.5

2013-08-21 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-5136:
---

Attachment: hbase-protobuf-update.patch

 HCatalog HBase Storage handler fails test with protobuf 2.5
 -

 Key: HIVE-5136
 URL: https://issues.apache.org/jira/browse/HIVE-5136
 Project: Hive
  Issue Type: Bug
Reporter: Sushanth Sowmyan
 Attachments: hbase-protobuf-update.patch


 With HIVE-5112 updating protobuf to 2.5, RevisionManagerEndpointProtos.java 
 brought in by HIVE-4388 in HCat needs to be updated and recompiled with 
 protobuf 2.5.



[jira] [Commented] (HIVE-4588) Support session level hooks for HiveServer2

2013-08-21 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747206#comment-13747206
 ] 

Navis commented on HIVE-4588:
-

[~prasadm] Please make an RB or Phabricator entry for review.

HiveSessionHook: extend org.apache.hadoop.hive.ql.hooks.Hook?
HiveSessionHookContext.getSessionHandle(): would it be better to return 
SessionHandle rather than SessionHandle.toString()?
SessionManager: how about extracting the method Driver.getHooks() into some 
utility class (JavaUtil?) and using that?

 Support session level hooks for HiveServer2
 ---

 Key: HIVE-4588
 URL: https://issues.apache.org/jira/browse/HIVE-4588
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 0.11.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Fix For: 0.12.0

 Attachments: HIVE-4588-1.patch, HIVE-4588.3.patch


 Support session level hooks for HiveServer2. The configured hooks will get 
 executed at the beginning of each new session.
 This is useful for auditing connections, possibly tuning session-level 
 properties, etc.
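A session hook of the kind described above might look like the following sketch. The interface and method names are assumptions drawn from the review comments in this thread (HiveSessionHook, a session handle), not HiveServer2's actual API.

```java
// Hypothetical session-hook interface, modeled on the names discussed in
// this thread; not the real HiveServer2 API.
interface HiveSessionHookSketch {
    // Called once when a new session opens; returns the audit entry written.
    String run(String user, String sessionHandle);
}

// Example hook implementing the auditing use case from the issue description.
public final class AuditSessionHook implements HiveSessionHookSketch {
    @Override
    public String run(String user, String sessionHandle) {
        String entry = "session opened: user=" + user
                + " handle=" + sessionHandle;
        System.out.println(entry); // a real hook might log or tune conf here
        return entry;
    }

    public static void main(String[] args) {
        new AuditSessionHook().run("sanjay", "abc-123");
    }
}
```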



[jira] [Commented] (HIVE-1719) Move RegexSerDe out of hive-contrib and over to hive-serde

2013-08-21 Thread efan lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747211#comment-13747211
 ] 

efan lee commented on HIVE-1719:


Why does RegexSerDe.java still exist in the contrib directory? Isn't it 
necessary to remove the file from contrib?

 Move RegexSerDe out of hive-contrib and over to hive-serde
 --

 Key: HIVE-1719
 URL: https://issues.apache.org/jira/browse/HIVE-1719
 Project: Hive
  Issue Type: Task
  Components: Serializers/Deserializers
Reporter: Carl Steinbach
Assignee: Shreepadma Venugopalan
 Fix For: 0.10.0

 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-1719.D3051.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-1719.D3051.2.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-1719.D3141.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-1719.D3249.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-1719.D3249.2.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-1719.D3249.3.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-1719.D3249.4.patch, HIVE-1719.3.patch, 
 HIVE-1719.D3249.1.patch


 RegexSerDe is as much a part of the standard Hive distribution as the other 
 SerDes
 currently in hive-serde. I think we should move it over to the hive-serde 
 module so that
 users don't have to go to the added effort of manually registering the 
 contrib jar before
 using it.



[jira] [Commented] (HIVE-4214) OVER accepts general expression instead of just function

2013-08-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747219#comment-13747219
 ] 

Hudson commented on HIVE-4214:
--

SUCCESS: Integrated in Hive-trunk-hadoop1-ptest #135 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/135/])
HIVE-4214 : OVER accepts general expression instead of just function (Ashutosh 
Chauhan Reviewed by Harish Butani) (hashutosh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1516180)
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SelectClauseParser.g
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
* 
/hive/trunk/ql/src/test/queries/clientnegative/ptf_negative_HavingLeadWithNoGBYNoWindowing.q
* 
/hive/trunk/ql/src/test/queries/clientnegative/ptf_negative_HavingLeadWithPTF.q
* 
/hive/trunk/ql/src/test/queries/clientnegative/ptf_negative_InvalidValueBoundary.q
* 
/hive/trunk/ql/src/test/queries/clientnegative/ptf_negative_WhereWithRankCond.q
* /hive/trunk/ql/src/test/queries/clientnegative/windowing_leadlag_in_udaf.q
* /hive/trunk/ql/src/test/queries/clientnegative/windowing_ll_no_over.q
* /hive/trunk/ql/src/test/queries/clientpositive/ctas_colname.q
* /hive/trunk/ql/src/test/queries/clientpositive/ptf.q
* /hive/trunk/ql/src/test/queries/clientpositive/windowing.q
* /hive/trunk/ql/src/test/queries/clientpositive/windowing_expressions.q
* /hive/trunk/ql/src/test/queries/clientpositive/windowing_windowspec.q
* 
/hive/trunk/ql/src/test/results/clientnegative/ptf_negative_HavingLeadWithNoGBYNoWindowing.q.out
* 
/hive/trunk/ql/src/test/results/clientnegative/ptf_negative_WhereWithRankCond.q.out
* /hive/trunk/ql/src/test/results/clientnegative/windowing_leadlag_in_udaf.q.out
* /hive/trunk/ql/src/test/results/clientnegative/windowing_ll_no_over.q.out
* /hive/trunk/ql/src/test/results/clientpositive/correlationoptimizer12.q.out
* /hive/trunk/ql/src/test/results/clientpositive/ctas_colname.q.out
* /hive/trunk/ql/src/test/results/clientpositive/ptf.q.out
* /hive/trunk/ql/src/test/results/clientpositive/windowing.q.out
* /hive/trunk/ql/src/test/results/clientpositive/windowing_expressions.q.out
* /hive/trunk/ql/src/test/results/clientpositive/windowing_windowspec.q.out


 OVER accepts general expression instead of just function
 

 Key: HIVE-4214
 URL: https://issues.apache.org/jira/browse/HIVE-4214
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Affects Versions: 0.11.0
Reporter: Alan Gates
Assignee: Ashutosh Chauhan
 Fix For: 0.12.0

 Attachments: HIVE-4214.1.patch, HIVE-4214.3.patch, HIVE-4214.patch


 The query:
 select s, i, avg(d) / 10.0 over (partition by s order by i) from over100k;
 runs (and produces meaningless output).
 Over should not allow the arithmetic expression.  Only a UDAF or PTF function 
 should be valid there.  The correct way to write this query should be 
 select s, i, avg(d) over (partition by s order by i) / 10.0 from over100k;



[jira] [Commented] (HIVE-4779) Enhance coverage of package org.apache.hadoop.hive.ql.udf

2013-08-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747220#comment-13747220
 ] 

Hudson commented on HIVE-4779:
--

SUCCESS: Integrated in Hive-trunk-hadoop1-ptest #135 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/135/])
HIVE-4779 : Enhance coverage of package org.apache.hadoop.hive.ql.udf (Ivan 
Veselovsky via Ashutosh Chauhan) (hashutosh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1515946)
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFBaseCompare.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToDate.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFUnixTimeStamp.java
* /hive/trunk/ql/src/test/queries/clientpositive/create_udaf.q
* /hive/trunk/ql/src/test/queries/clientpositive/udf4.q
* /hive/trunk/ql/src/test/queries/clientpositive/udf_pmod.q
* /hive/trunk/ql/src/test/queries/clientpositive/udf_to_boolean.q
* /hive/trunk/ql/src/test/queries/clientpositive/udf_to_byte.q
* /hive/trunk/ql/src/test/queries/clientpositive/udf_to_double.q
* /hive/trunk/ql/src/test/queries/clientpositive/udf_to_float.q
* /hive/trunk/ql/src/test/queries/clientpositive/udf_to_long.q
* /hive/trunk/ql/src/test/queries/clientpositive/udf_to_short.q
* /hive/trunk/ql/src/test/queries/clientpositive/udf_to_string.q
* /hive/trunk/ql/src/test/results/clientpositive/create_udaf.q.out
* /hive/trunk/ql/src/test/results/clientpositive/udf4.q.out
* /hive/trunk/ql/src/test/results/clientpositive/udf_pmod.q.out
* /hive/trunk/ql/src/test/results/clientpositive/udf_to_boolean.q.out
* /hive/trunk/ql/src/test/results/clientpositive/udf_to_byte.q.out
* /hive/trunk/ql/src/test/results/clientpositive/udf_to_double.q.out
* /hive/trunk/ql/src/test/results/clientpositive/udf_to_float.q.out
* /hive/trunk/ql/src/test/results/clientpositive/udf_to_long.q.out
* /hive/trunk/ql/src/test/results/clientpositive/udf_to_short.q.out
* /hive/trunk/ql/src/test/results/clientpositive/udf_to_string.q.out


 Enhance coverage of package org.apache.hadoop.hive.ql.udf
 -

 Key: HIVE-4779
 URL: https://issues.apache.org/jira/browse/HIVE-4779
 Project: Hive
  Issue Type: Test
Affects Versions: 0.12.0
Reporter: Ivan A. Veselovsky
Assignee: Ivan A. Veselovsky
 Fix For: 0.12.0

 Attachments: HIVE-4779.patch, HIVE-4779-trunk--N1.patch


 Enhance coverage of package org.apache.hadoop.hive.ql.udf up to 80%.



[jira] [Commented] (HIVE-4214) OVER accepts general expression instead of just function

2013-08-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747235#comment-13747235
 ] 

Hudson commented on HIVE-4214:
--

FAILURE: Integrated in Hive-trunk-h0.21 #2282 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2282/])
HIVE-4214 : OVER accepts general expression instead of just function (Ashutosh 
Chauhan Reviewed by Harish Butani) (hashutosh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1516180)
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SelectClauseParser.g
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
* 
/hive/trunk/ql/src/test/queries/clientnegative/ptf_negative_HavingLeadWithNoGBYNoWindowing.q
* 
/hive/trunk/ql/src/test/queries/clientnegative/ptf_negative_HavingLeadWithPTF.q
* 
/hive/trunk/ql/src/test/queries/clientnegative/ptf_negative_InvalidValueBoundary.q
* 
/hive/trunk/ql/src/test/queries/clientnegative/ptf_negative_WhereWithRankCond.q
* /hive/trunk/ql/src/test/queries/clientnegative/windowing_leadlag_in_udaf.q
* /hive/trunk/ql/src/test/queries/clientnegative/windowing_ll_no_over.q
* /hive/trunk/ql/src/test/queries/clientpositive/ctas_colname.q
* /hive/trunk/ql/src/test/queries/clientpositive/ptf.q
* /hive/trunk/ql/src/test/queries/clientpositive/windowing.q
* /hive/trunk/ql/src/test/queries/clientpositive/windowing_expressions.q
* /hive/trunk/ql/src/test/queries/clientpositive/windowing_windowspec.q
* 
/hive/trunk/ql/src/test/results/clientnegative/ptf_negative_HavingLeadWithNoGBYNoWindowing.q.out
* 
/hive/trunk/ql/src/test/results/clientnegative/ptf_negative_WhereWithRankCond.q.out
* /hive/trunk/ql/src/test/results/clientnegative/windowing_leadlag_in_udaf.q.out
* /hive/trunk/ql/src/test/results/clientnegative/windowing_ll_no_over.q.out
* /hive/trunk/ql/src/test/results/clientpositive/correlationoptimizer12.q.out
* /hive/trunk/ql/src/test/results/clientpositive/ctas_colname.q.out
* /hive/trunk/ql/src/test/results/clientpositive/ptf.q.out
* /hive/trunk/ql/src/test/results/clientpositive/windowing.q.out
* /hive/trunk/ql/src/test/results/clientpositive/windowing_expressions.q.out
* /hive/trunk/ql/src/test/results/clientpositive/windowing_windowspec.q.out


 OVER accepts general expression instead of just function
 

 Key: HIVE-4214
 URL: https://issues.apache.org/jira/browse/HIVE-4214
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Affects Versions: 0.11.0
Reporter: Alan Gates
Assignee: Ashutosh Chauhan
 Fix For: 0.12.0

 Attachments: HIVE-4214.1.patch, HIVE-4214.3.patch, HIVE-4214.patch


 The query:
 select s, i, avg(d) / 10.0 over (partition by s order by i) from over100k;
 runs (and produces meaningless output).
 Over should not allow the arithmetic expression.  Only a UDAF or PTF function 
 should be valid there.  The correct way to write this query should be 
 select s, i, avg(d) over (partition by s order by i) / 10.0 from over100k;



[jira] [Updated] (HIVE-4904) A little more CP crossing RS boundaries

2013-08-21 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4904:
--

Attachment: HIVE-4904.D11757.2.patch

navis updated the revision HIVE-4904 [jira] A little more CP crossing RS 
boundaries.

  Rebased to trunk

Reviewers: JIRA

REVISION DETAIL
  https://reviews.facebook.net/D11757

CHANGE SINCE LAST DIFF
  https://reviews.facebook.net/D11757?vs=35979&id=38589#toc

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPruner.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcCtx.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
  ql/src/test/results/clientpositive/auto_join18.q.out
  ql/src/test/results/clientpositive/auto_join18_multi_distinct.q.out
  ql/src/test/results/clientpositive/auto_join27.q.out
  ql/src/test/results/clientpositive/auto_join30.q.out
  ql/src/test/results/clientpositive/auto_join31.q.out
  ql/src/test/results/clientpositive/auto_join32.q.out
  ql/src/test/results/clientpositive/auto_sortmerge_join_10.q.out
  ql/src/test/results/clientpositive/count.q.out
  ql/src/test/results/clientpositive/groupby2_map.q.out
  ql/src/test/results/clientpositive/groupby2_map_multi_distinct.q.out
  ql/src/test/results/clientpositive/groupby2_map_skew.q.out
  ql/src/test/results/clientpositive/groupby3_map.q.out
  ql/src/test/results/clientpositive/groupby3_map_multi_distinct.q.out
  ql/src/test/results/clientpositive/groupby3_map_skew.q.out
  ql/src/test/results/clientpositive/groupby_cube1.q.out
  ql/src/test/results/clientpositive/groupby_distinct_samekey.q.out
  ql/src/test/results/clientpositive/groupby_map_ppr.q.out
  ql/src/test/results/clientpositive/groupby_map_ppr_multi_distinct.q.out
  ql/src/test/results/clientpositive/groupby_multi_insert_common_distinct.q.out
  ql/src/test/results/clientpositive/groupby_multi_single_reducer3.q.out
  ql/src/test/results/clientpositive/groupby_position.q.out
  ql/src/test/results/clientpositive/groupby_rollup1.q.out
  ql/src/test/results/clientpositive/groupby_sort_11.q.out
  ql/src/test/results/clientpositive/groupby_sort_8.q.out
  ql/src/test/results/clientpositive/join18.q.out
  ql/src/test/results/clientpositive/join18_multi_distinct.q.out
  ql/src/test/results/clientpositive/metadataonly1.q.out
  ql/src/test/results/clientpositive/multi_insert_gby2.q.out
  ql/src/test/results/clientpositive/multi_insert_lateral_view.q.out
  ql/src/test/results/clientpositive/nullgroup.q.out
  ql/src/test/results/clientpositive/nullgroup2.q.out
  ql/src/test/results/clientpositive/nullgroup4.q.out
  ql/src/test/results/clientpositive/nullgroup4_multi_distinct.q.out
  ql/src/test/results/clientpositive/ql_rewrite_gbtoidx.q.out
  ql/src/test/results/clientpositive/reduce_deduplicate_extended.q.out
  ql/src/test/results/clientpositive/udf_count.q.out
  ql/src/test/results/clientpositive/union11.q.out
  ql/src/test/results/clientpositive/union14.q.out
  ql/src/test/results/clientpositive/union15.q.out
  ql/src/test/results/clientpositive/union16.q.out
  ql/src/test/results/clientpositive/union2.q.out
  ql/src/test/results/clientpositive/union25.q.out
  ql/src/test/results/clientpositive/union28.q.out
  ql/src/test/results/clientpositive/union3.q.out
  ql/src/test/results/clientpositive/union30.q.out
  ql/src/test/results/clientpositive/union31.q.out
  ql/src/test/results/clientpositive/union5.q.out
  ql/src/test/results/clientpositive/union7.q.out
  ql/src/test/results/clientpositive/union9.q.out
  ql/src/test/results/clientpositive/union_view.q.out
  ql/src/test/results/compiler/plan/groupby1.q.xml
  ql/src/test/results/compiler/plan/groupby2.q.xml
  ql/src/test/results/compiler/plan/groupby3.q.xml
  ql/src/test/results/compiler/plan/groupby5.q.xml

To: JIRA, navis


 A little more CP crossing RS boundaries
 ---

 Key: HIVE-4904
 URL: https://issues.apache.org/jira/browse/HIVE-4904
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-4904.D11757.1.patch, HIVE-4904.D11757.2.patch


 Currently, CP context cannot be propagated over RS except for JOIN/EXT. A 
 little more CP is possible.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4904) A little more CP crossing RS boundaries

2013-08-21 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747256#comment-13747256
 ] 

Navis commented on HIVE-4904:
-

[~yhuai] Sorry, I missed your message. 
RS is column-pruned (CPed) only when its child is a JOIN operator, and SELECT 
skips CP if any of its children is FS/UNION/UDTF, etc.
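The rule described above can be modeled as a small sketch. This is not Hive's actual ColumnPrunerProcFactory code, just an illustrative decision function over hypothetical operator names:

```python
# Toy model (not Hive source) of the column-pruning (CP) rule described
# above: a ReduceSink (RS) is pruned only when its child is a JOIN, and a
# SELECT skips pruning when any of its children is FS/UNION/UDTF.

def rs_is_pruned(rs_children):
    """RS is column-pruned only when its sole child is a JOIN operator."""
    return rs_children == ["JOIN"]

def select_skips_cp(select_children):
    """SELECT skips CP if any child is a FileSink, UNION, or UDTF."""
    return any(c in {"FS", "UNION", "UDTF"} for c in select_children)

if __name__ == "__main__":
    print(rs_is_pruned(["JOIN"]))       # True: pruning crosses this RS
    print(rs_is_pruned(["GROUPBY"]))    # False: the limitation HIVE-4904 relaxes
    print(select_skips_cp(["UDTF"]))    # True
    print(select_skips_cp(["FILTER"]))  # False
```

HIVE-4904's improvement is precisely about loosening the first check so CP context can propagate over RS in more cases than JOIN/EXT.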



[jira] [Updated] (HIVE-4293) Predicates following UDTF operator are removed by PPD

2013-08-21 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4293:
--

Attachment: HIVE-4293.D9933.5.patch

navis updated the revision HIVE-4293 [jira] Predicates following UDTF operator 
are removed by PPD.

  Rebased to trunk

Reviewers: JIRA

REVISION DETAIL
  https://reviews.facebook.net/D9933

CHANGE SINCE LAST DIFF
  https://reviews.facebook.net/D9933?vs=35949&id=38595#toc

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/exec/LateralViewJoinOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/LateralViewJoinDesc.java
  ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerInfo.java
  ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java
  ql/src/java/org/apache/hadoop/hive/ql/ppd/PredicatePushDown.java
  ql/src/test/queries/clientpositive/lateral_view_ppd.q
  ql/src/test/queries/clientpositive/ppd_udtf.q
  ql/src/test/results/clientpositive/cluster.q.out
  ql/src/test/results/clientpositive/ctas_colname.q.out
  ql/src/test/results/clientpositive/lateral_view_ppd.q.out
  ql/src/test/results/clientpositive/ppd2.q.out
  ql/src/test/results/clientpositive/ppd_gby.q.out
  ql/src/test/results/clientpositive/ppd_gby2.q.out
  ql/src/test/results/clientpositive/ppd_udtf.q.out
  ql/src/test/results/clientpositive/udtf_json_tuple.q.out
  ql/src/test/results/clientpositive/udtf_parse_url_tuple.q.out
  ql/src/test/results/compiler/plan/join1.q.xml
  ql/src/test/results/compiler/plan/join2.q.xml
  ql/src/test/results/compiler/plan/join3.q.xml
  ql/src/test/results/compiler/plan/join4.q.xml
  ql/src/test/results/compiler/plan/join5.q.xml
  ql/src/test/results/compiler/plan/join6.q.xml
  ql/src/test/results/compiler/plan/join7.q.xml
  ql/src/test/results/compiler/plan/join8.q.xml

To: JIRA, navis


 Predicates following UDTF operator are removed by PPD
 -

 Key: HIVE-4293
 URL: https://issues.apache.org/jira/browse/HIVE-4293
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
 Attachments: HIVE-4293.D9933.1.patch, HIVE-4293.D9933.2.patch, 
 HIVE-4293.D9933.3.patch, HIVE-4293.D9933.4.patch, HIVE-4293.D9933.5.patch


 For example, 
 {noformat}
 explain SELECT value from (
   select explode(array(key, value)) as (value) from (
 select * FROM src WHERE key > 200
   ) A
  ) B WHERE value > 300
 ;
 {noformat}
 This produces a plan like the following, with the last predicate removed:
 {noformat}
   TableScan
 alias: src
 Filter Operator
   predicate:
    expr: (key > 200.0)
   type: boolean
   Select Operator
 expressions:
   expr: array(key,value)
    type: array<string>
 outputColumnNames: _col0
 UDTF Operator
   function name: explode
   Select Operator
 expressions:
   expr: col
   type: string
 outputColumnNames: _col0
 File Output Operator
   compressed: false
   GlobalTableId: 0
   table:
   input format: org.apache.hadoop.mapred.TextInputFormat
   output format: 
 org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
 {noformat}
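 The problem is visible in the plan above: the outer predicate on {{value}} has 
 vanished entirely. A toy model (not Hive code, and using integer keys/values 
 instead of src's strings) shows that the dropped filter is not a no-op, so PPD 
 must keep it above the UDTF rather than discard it:

 {noformat}
# Hypothetical mini-model of the query above: explode the array, then
# apply the outer predicate. Dropping the outer filter, as the buggy
# plan does, changes the result set.

def explode(rows):
    # UDTF: emit one output row per array element
    for key, value in rows:
        yield key
        yield value

def query(src):
    inner = [(k, v) for k, v in src if k > 200]   # WHERE key > 200
    exploded = explode(inner)                     # explode(array(key, value))
    return [v for v in exploded if v > 300]       # outer WHERE value > 300

src = [(100, 150), (250, 280), (400, 450)]
print(query(src))  # [400, 450]; without the outer filter: [250, 280, 400, 450]
 {noformat}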
