[jira] [Commented] (HIVE-6157) Fetching column stats slower than the 101 during rush hour

2014-01-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878388#comment-13878388
 ] 

Hive QA commented on HIVE-6157:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12624267/HIVE-6157.01.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 4943 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_metadataonly1
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/980/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/980/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12624267

 Fetching column stats slower than the 101 during rush hour
 --

 Key: HIVE-6157
 URL: https://issues.apache.org/jira/browse/HIVE-6157
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Gunther Hagleitner
Assignee: Sergey Shelukhin
 Attachments: HIVE-6157.01.patch, HIVE-6157.01.patch, 
 HIVE-6157.nogen.patch, HIVE-6157.prelim.patch


 hive.stats.fetch.column.stats controls whether the column stats for a table 
 are fetched during explain (in Tez: during query planning). On my setup (1 
 table, 4000 partitions, 24 columns) the time spent in semantic analysis goes 
 from ~1 second to ~66 seconds when turning the flag on. 65 seconds spent 
 fetching column stats...
 The reason is probably that the APIs force you to make separate metastore 
 calls for each column in each partition. That's probably the first thing that 
 has to change. The question is whether, in addition to that, we need to cache 
 this in the client or store the stats as a single blob in the database to 
 further cut down on the time. However, the way things stand right now, column 
 stats seem unusable.
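
To make the cost model above concrete, here is a minimal Java sketch of the difference between per-column, per-partition stats fetches and a single batched request. The MetastoreStats interface and its methods are hypothetical stand-ins for illustration, not the actual Hive metastore API.

{code}
import java.util.List;
import java.util.Map;

// Hypothetical client-side view of the two access patterns described above.
interface MetastoreStats {
    // One RPC per (partition, column): 4000 partitions x 24 columns = 96,000 calls.
    Object getColumnStats(String table, String partition, String column);

    // One RPC that returns stats for every requested partition/column pair.
    Map<String, Object> getColumnStatsBatch(String table,
                                            List<String> partitions,
                                            List<String> columns);
}

class StatsFetchSketch {
    static void fetchPerColumn(MetastoreStats client, String table,
                               List<String> partitions, List<String> columns) {
        // The pattern the description complains about:
        // total latency ~ partitions * columns * one metastore round trip.
        for (String p : partitions) {
            for (String c : columns) {
                client.getColumnStats(table, p, c);
            }
        }
    }

    static void fetchBatched(MetastoreStats client, String table,
                             List<String> partitions, List<String> columns) {
        // A single round trip; the server does the fan-out.
        client.getColumnStatsBatch(table, partitions, columns);
    }
}
{code}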



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-4293) Predicates following UDTF operator are removed by PPD

2014-01-22 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-4293:


Attachment: HIVE-4293.9.patch.txt

 Predicates following UDTF operator are removed by PPD
 -

 Key: HIVE-4293
 URL: https://issues.apache.org/jira/browse/HIVE-4293
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Critical
 Attachments: D9933.6.patch, HIVE-4293.7.patch.txt, 
 HIVE-4293.8.patch.txt, HIVE-4293.9.patch.txt, HIVE-4293.D9933.1.patch, 
 HIVE-4293.D9933.2.patch, HIVE-4293.D9933.3.patch, HIVE-4293.D9933.4.patch, 
 HIVE-4293.D9933.5.patch


 For example, 
 {noformat}
 explain SELECT value from (
   select explode(array(key, value)) as (value) from (
 select * FROM src WHERE key > 200
   ) A
 ) B WHERE value > 300
 ;
 {noformat}
 Makes a plan like the following, removing the last predicate:
 {noformat}
   TableScan
 alias: src
 Filter Operator
   predicate:
    expr: (key > 200.0)
   type: boolean
   Select Operator
 expressions:
   expr: array(key,value)
    type: array<string>
 outputColumnNames: _col0
 UDTF Operator
   function name: explode
   Select Operator
 expressions:
   expr: col
   type: string
 outputColumnNames: _col0
 File Output Operator
   compressed: false
   GlobalTableId: 0
   table:
   input format: org.apache.hadoop.mapred.TextInputFormat
   output format: 
 org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 16281: Predicates following UDTF operator are removed by PPD

2014-01-22 Thread Navis Ryu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16281/
---

(Updated Jan. 22, 2014, 8:31 a.m.)


Review request for hive.


Changes
---

Rebased to trunk


Bugs: HIVE-4293
https://issues.apache.org/jira/browse/HIVE-4293


Repository: hive-git


Description
---

For example, 
{noformat}
explain SELECT value from (
  select explode(array(key, value)) as (value) from (
select * FROM src WHERE key > 200
  ) A
) B WHERE value > 300
;
{noformat}

Makes a plan like the following, removing the last predicate:
{noformat}
  TableScan
alias: src
Filter Operator
  predicate:
  expr: (key > 200.0)
  type: boolean
  Select Operator
expressions:
  expr: array(key,value)
  type: array<string>
outputColumnNames: _col0
UDTF Operator
  function name: explode
  Select Operator
expressions:
  expr: col
  type: string
outputColumnNames: _col0
File Output Operator
  compressed: false
  GlobalTableId: 0
  table:
  input format: org.apache.hadoop.mapred.TextInputFormat
  output format: 
org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
{noformat}


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/LateralViewJoinOperator.java 
2fbb81b 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java c378dc7 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java 
0798470 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 1f7aae0 
  ql/src/java/org/apache/hadoop/hive/ql/plan/LateralViewJoinDesc.java ebfcfc8 
  ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerInfo.java 6a3dd99 
  ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java 40298e1 
  ql/src/java/org/apache/hadoop/hive/ql/ppd/PredicatePushDown.java cd5ae51 
  ql/src/test/queries/clientpositive/lateral_view_ppd.q 7be86a6 
  ql/src/test/queries/clientpositive/ppd_udtf.q PRE-CREATION 
  ql/src/test/results/clientpositive/cluster.q.out 8d14a1d 
  ql/src/test/results/clientpositive/ctas_colname.q.out a15b698 
  ql/src/test/results/clientpositive/lateral_view_ppd.q.out f54c809 
  ql/src/test/results/clientpositive/ppd2.q.out f6af8f8 
  ql/src/test/results/clientpositive/ppd_gby.q.out 5908450 
  ql/src/test/results/clientpositive/ppd_gby2.q.out bdd7e89 
  ql/src/test/results/clientpositive/ppd_udtf.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/udtf_json_tuple.q.out 1a480b6 
  ql/src/test/results/clientpositive/udtf_parse_url_tuple.q.out a38b31b 
  ql/src/test/results/compiler/plan/join1.q.xml dbb8ca9 
  ql/src/test/results/compiler/plan/join2.q.xml d13890e 
  ql/src/test/results/compiler/plan/join3.q.xml 81ce3e2 
  ql/src/test/results/compiler/plan/join4.q.xml 116f2ad 
  ql/src/test/results/compiler/plan/join5.q.xml 9dd4af5 
  ql/src/test/results/compiler/plan/join6.q.xml 7134e08 
  ql/src/test/results/compiler/plan/join7.q.xml 9b7103e 
  ql/src/test/results/compiler/plan/join8.q.xml 7e2834f 

Diff: https://reviews.apache.org/r/16281/diff/


Testing
---


Thanks,

Navis Ryu



[jira] [Commented] (HIVE-6137) Hive should report that the file/path doesn’t exist when it doesn’t (it now reports SocketTimeoutException)

2014-01-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878471#comment-13878471
 ] 

Hive QA commented on HIVE-6137:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12624242/HIVE-6137.1.patch

{color:green}SUCCESS:{color} +1 4943 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/981/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/981/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12624242

 Hive should report that the file/path doesn’t exist when it doesn’t (it now 
 reports SocketTimeoutException)
 ---

 Key: HIVE-6137
 URL: https://issues.apache.org/jira/browse/HIVE-6137
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-6137.1.patch


 Hive should report that the file/path doesn’t exist when it doesn’t (it now 
 reports SocketTimeoutException):
 Execute a Hive DDL query with a reference to a non-existent blob (such as 
 CREATE EXTERNAL TABLE...) and check Hive logs (stderr):
 FAILED: Error in metadata: org.apache.thrift.transport.TTransportException: 
 java.net.SocketTimeoutException: Read timed out
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask
 This error message is not intuitive. If a file doesn't exist, Hive should 
 report FileNotFoundException



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6243) error in high-precision division for Decimal128

2014-01-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878473#comment-13878473
 ] 

Hive QA commented on HIVE-6243:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12624250/HIVE-6243.01.patch

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/983/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/983/console

Messages:
{noformat}
 This message was trimmed, see log for full details 
Decision can match input such as LPAREN KW_CASE KW_NOT using multiple 
alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): IdentifiersParser.g:68:4: 
Decision can match input such as LPAREN LPAREN KW_NOT using multiple 
alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): IdentifiersParser.g:68:4: 
Decision can match input such as LPAREN KW_NULL KW_IN using multiple 
alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): IdentifiersParser.g:68:4: 
Decision can match input such as LPAREN KW_CASE StringLiteral using multiple 
alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): IdentifiersParser.g:68:4: 
Decision can match input such as LPAREN KW_NULL LSQUARE using multiple 
alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): IdentifiersParser.g:108:5: 
Decision can match input such as KW_ORDER KW_BY LPAREN using multiple 
alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): IdentifiersParser.g:121:5: 
Decision can match input such as KW_CLUSTER KW_BY LPAREN using multiple 
alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): IdentifiersParser.g:133:5: 
Decision can match input such as KW_PARTITION KW_BY LPAREN using multiple 
alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): IdentifiersParser.g:144:5: 
Decision can match input such as KW_DISTRIBUTE KW_BY LPAREN using multiple 
alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): IdentifiersParser.g:155:5: 
Decision can match input such as KW_SORT KW_BY LPAREN using multiple 
alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): IdentifiersParser.g:172:7: 
Decision can match input such as STAR using multiple alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input
warning(200): IdentifiersParser.g:185:5: 
Decision can match input such as KW_ARRAY using multiple alternatives: 2, 6

As a result, alternative(s) 6 were disabled for that input
warning(200): IdentifiersParser.g:185:5: 
Decision can match input such as KW_STRUCT using multiple alternatives: 4, 6

As a result, alternative(s) 6 were disabled for that input
warning(200): IdentifiersParser.g:185:5: 
Decision can match input such as KW_UNIONTYPE using multiple alternatives: 5, 
6

As a result, alternative(s) 6 were disabled for that input
warning(200): IdentifiersParser.g:267:5: 
Decision can match input such as KW_NULL using multiple alternatives: 1, 8

As a result, alternative(s) 8 were disabled for that input
warning(200): IdentifiersParser.g:267:5: 
Decision can match input such as KW_TRUE using multiple alternatives: 3, 8

As a result, alternative(s) 8 were disabled for that input
warning(200): IdentifiersParser.g:267:5: 
Decision can match input such as KW_FALSE using multiple alternatives: 3, 8

As a result, alternative(s) 8 were disabled for that input
warning(200): IdentifiersParser.g:267:5: 
Decision can match input such as KW_DATE StringLiteral using multiple 
alternatives: 2, 3

As a result, alternative(s) 3 were disabled for that input
warning(200): IdentifiersParser.g:399:5: 
Decision can match input such as KW_BETWEEN KW_MAP LPAREN using multiple 
alternatives: 8, 9

As a result, alternative(s) 9 were disabled for that input
warning(200): IdentifiersParser.g:399:5: 
Decision can match input such as {KW_LIKE, KW_REGEXP, KW_RLIKE} KW_DISTRIBUTE 
KW_BY using multiple alternatives: 2, 9

As a result, alternative(s) 9 were disabled for that input
warning(200): IdentifiersParser.g:399:5: 
Decision can match input such as {KW_LIKE, KW_REGEXP, KW_RLIKE} KW_INSERT 
KW_INTO using multiple alternatives: 2, 9

As a result, alternative(s) 9 were disabled for that input
warning(200): IdentifiersParser.g:399:5: 
Decision can match input such as {KW_LIKE, KW_REGEXP, KW_RLIKE} KW_INSERT 
KW_OVERWRITE using multiple alternatives: 2, 9

As a result, alternative(s) 9 were disabled for that input
warning(200): IdentifiersParser.g:399:5: 
Decision can match input such as {KW_LIKE, KW_REGEXP, KW_RLIKE} KW_SORT KW_BY 
using 

[jira] [Commented] (HIVE-6229) Stats are missing sometimes (regression from HIVE-5936)

2014-01-22 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878510#comment-13878510
 ] 

Lefty Leverenz commented on HIVE-6229:
--

This adds hive.stats.key.prefix.reserve.length to HiveConf.java and 
hive-default.xml.template.

I can document it in the wiki (for Hive 0.13.0) along with 
hive.stats.key.prefix.max.length, but hive.stats.key.prefix has an "internal 
usage only" comment, so are these other two hive.stats.key.prefix.* config 
params also internal use only?

 Stats are missing sometimes (regression from HIVE-5936)
 ---

 Key: HIVE-6229
 URL: https://issues.apache.org/jira/browse/HIVE-6229
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Reporter: Navis
Assignee: Navis
 Fix For: 0.13.0

 Attachments: HIVE-6229.1.patch.txt, HIVE-6229.2.patch.txt


 If the prefix length is smaller than hive.stats.key.prefix.max.length but the 
 length of prefix + postfix is bigger than that, stats are missed.
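
A minimal sketch of the length condition described above, with made-up names (buggyKey, maxPrefixLength); it is not the actual stats publishing code. If the limit is checked against the prefix alone, a key whose prefix fits but whose prefix + postfix does not still ends up over the limit, so the writing and reading sides can disagree on the key.

{code}
// Illustrative only: checking the limit on the prefix alone misses keys that
// exceed the limit once the postfix is appended, which is the case above.
class StatsKeySketch {
    static String buggyKey(String prefix, String postfix, int maxPrefixLength) {
        String key = prefix.length() <= maxPrefixLength ? prefix : shorten(prefix);
        return key + postfix; // may still exceed maxPrefixLength
    }

    static String fixedKey(String prefix, String postfix, int maxPrefixLength) {
        // Reserve room for the postfix before deciding whether to shorten.
        String key = prefix.length() + postfix.length() <= maxPrefixLength
                ? prefix : shorten(prefix);
        return key + postfix;
    }

    static String shorten(String s) {
        return Integer.toHexString(s.hashCode()); // stand-in for the real shortening
    }
}
{code}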



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-3552) HIVE-3552 performant manner for performing cubes/rollups/grouping sets for a high number of grouping set keys

2014-01-22 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878542#comment-13878542
 ] 

Lefty Leverenz commented on HIVE-3552:
--

This adds hive.new.job.grouping.set.cardinality to HiveConf.java and 
hive-default.xml.template.

Also documented in the wiki, with a link to this JIRA ticket:  [Query Execution 
|https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryExecution]
 (search for grouping).

 HIVE-3552 performant manner for performing cubes/rollups/grouping sets for a 
 high number of grouping set keys
 -

 Key: HIVE-3552
 URL: https://issues.apache.org/jira/browse/HIVE-3552
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Fix For: 0.11.0

 Attachments: hive.3552.1.patch, hive.3552.10.patch, 
 hive.3552.11.patch, hive.3552.12.patch, hive.3552.2.patch, hive.3552.3.patch, 
 hive.3552.4.patch, hive.3552.5.patch, hive.3552.6.patch, hive.3552.7.patch, 
 hive.3552.8.patch, hive.3552.9.patch


 This is a follow-up for HIVE-3433.
 Had an offline discussion with Sambavi - she pointed out a scenario where the
 implementation in HIVE-3433 will not scale. Assume that the user is performing
 a cube on many columns, say 8 columns. Each row would then generate 256 rows
 for the hash table, which may kill the current group-by implementation.
 A better implementation would be to add an additional MR job: in the first
 MR job, perform the group by assuming there was no cube. Then add another MR
 job where you would perform the cube. The assumption is that the group by
 would have decreased the output data significantly, and the rows would appear
 in the order of the grouping keys, which has a higher probability of hitting
 the hash table.
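
As a quick check on the numbers above: a cube over n grouping columns expands every input row into 2^n grouping-set rows, so 8 columns give 256. The snippet below just enumerates the grouping sets by bitmask; the column names are made up for illustration.

{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CubeFanoutSketch {
    public static void main(String[] args) {
        List<String> cols = Arrays.asList("c1", "c2", "c3"); // 8 columns in the scenario above
        int n = cols.size();
        System.out.println("rows per input row: " + (1 << n)); // 2^n; 256 when n = 8

        // Each bitmask selects the subset of columns kept in one grouping set.
        for (int mask = 0; mask < (1 << n); mask++) {
            List<String> groupingSet = new ArrayList<String>();
            for (int i = 0; i < n; i++) {
                if ((mask & (1 << i)) != 0) {
                    groupingSet.add(cols.get(i));
                }
            }
            System.out.println(groupingSet);
        }
    }
}
{code}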



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6205) alter table partition column throws NPE in authorization

2014-01-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878544#comment-13878544
 ] 

Hive QA commented on HIVE-6205:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12624251/HIVE-6205.4.patch.txt

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 4943 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.security.authorization.plugin.TestHiveOperationType.checkHiveOperationTypeMatch
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/984/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/984/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12624251

 alter table partition column throws NPE in authorization
 --

 Key: HIVE-6205
 URL: https://issues.apache.org/jira/browse/HIVE-6205
 Project: Hive
  Issue Type: Bug
  Components: Authorization
Reporter: Navis
Assignee: Navis
 Attachments: HIVE-6205.1.patch.txt, HIVE-6205.2.patch.txt, 
 HIVE-6205.3.patch.txt, HIVE-6205.4.patch.txt


 alter table alter_coltype partition column (dt int);
 {noformat}
 2014-01-15 15:53:40,364 ERROR ql.Driver (SessionState.java:printError(457)) - 
 FAILED: NullPointerException null
 java.lang.NullPointerException
   at org.apache.hadoop.hive.ql.Driver.doAuthorization(Driver.java:599)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:479)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:340)
   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:996)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1039)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:932)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:922)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:424)
   at 
 org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:792)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
 {noformat}
 Operation for TOK_ALTERTABLE_ALTERPARTS is not defined.
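
The stack trace points at Driver.doAuthorization dereferencing an operation that was never registered for TOK_ALTERTABLE_ALTERPARTS. The sketch below is only a generic illustration of that failure mode and of a defensive lookup; the types (OperationRegistry, Op) are made up and this is not the actual Driver code or the attached fix.

{code}
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a token-to-operation lookup that returns null for an
// unregistered token (leading to a later NPE), and a checked variant that
// fails with a clear message instead.
class OperationRegistry {
    enum Op { ALTERTABLE_ADDCOLS, ALTERTABLE_RENAME /* ... */ }

    private final Map<String, Op> byToken = new HashMap<String, Op>();

    Op lookupUnchecked(String token) {
        return byToken.get(token); // null for an undefined token, e.g. TOK_ALTERTABLE_ALTERPARTS
    }

    Op lookupChecked(String token) {
        Op op = byToken.get(token);
        if (op == null) {
            throw new IllegalStateException("No operation defined for " + token);
        }
        return op;
    }
}
{code}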



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-3552) HIVE-3552 performant manner for performing cubes/rollups/grouping sets for a high number of grouping set keys

2014-01-22 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878581#comment-13878581
 ] 

Lefty Leverenz commented on HIVE-3552:
--

The main wikidoc for this is here: 

* [Enhanced Aggregation, Cube, Grouping and Rollup 
|https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup]

 HIVE-3552 performant manner for performing cubes/rollups/grouping sets for a 
 high number of grouping set keys
 -

 Key: HIVE-3552
 URL: https://issues.apache.org/jira/browse/HIVE-3552
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Fix For: 0.11.0

 Attachments: hive.3552.1.patch, hive.3552.10.patch, 
 hive.3552.11.patch, hive.3552.12.patch, hive.3552.2.patch, hive.3552.3.patch, 
 hive.3552.4.patch, hive.3552.5.patch, hive.3552.6.patch, hive.3552.7.patch, 
 hive.3552.8.patch, hive.3552.9.patch


 This is a follow-up for HIVE-3433.
 Had an offline discussion with Sambavi - she pointed out a scenario where the
 implementation in HIVE-3433 will not scale. Assume that the user is performing
 a cube on many columns, say 8 columns. Each row would then generate 256 rows
 for the hash table, which may kill the current group-by implementation.
 A better implementation would be to add an additional MR job: in the first
 MR job, perform the group by assuming there was no cube. Then add another MR
 job where you would perform the cube. The assumption is that the group by
 would have decreased the output data significantly, and the rows would appear
 in the order of the grouping keys, which has a higher probability of hitting
 the hash table.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6144) Implement non-staged MapJoin

2014-01-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878613#comment-13878613
 ] 

Hive QA commented on HIVE-6144:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12624256/HIVE-6144.6.patch.txt

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 4944 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucket_num_reducers
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_dyn_part
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_deletejar
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/985/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/985/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12624256

 Implement non-staged MapJoin
 

 Key: HIVE-6144
 URL: https://issues.apache.org/jira/browse/HIVE-6144
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-6144.1.patch.txt, HIVE-6144.2.patch.txt, 
 HIVE-6144.3.patch.txt, HIVE-6144.4.patch.txt, HIVE-6144.5.patch.txt, 
 HIVE-6144.6.patch.txt


 For map join, all data in the small aliases is hashed and stored into a 
 temporary file in MapRedLocalTask. But for some aliases without a filter or 
 projection, it does not seem necessary to do that. For example:
 {noformat}
 select a.* from src a join src b on a.key=b.key;
 {noformat}
 makes a plan like this.
 {noformat}
 STAGE PLANS:
   Stage: Stage-4
 Map Reduce Local Work
   Alias -> Map Local Tables:
 a 
   Fetch Operator
 limit: -1
   Alias -> Map Local Operator Tree:
 a 
   TableScan
 alias: a
 HashTable Sink Operator
   condition expressions:
 0 {key} {value}
 1 
   handleSkewJoin: false
   keys:
 0 [Column[key]]
 1 [Column[key]]
   Position of Big Table: 1
   Stage: Stage-3
 Map Reduce
   Alias -> Map Operator Tree:
 b 
   TableScan
 alias: b
 Map Join Operator
   condition map:
Inner Join 0 to 1
   condition expressions:
 0 {key} {value}
 1 
   handleSkewJoin: false
   keys:
 0 [Column[key]]
 1 [Column[key]]
   outputColumnNames: _col0, _col1
   Position of Big Table: 1
   Select Operator
 File Output Operator
   Local Work:
 Map Reduce Local Work
   Stage: Stage-0
 Fetch Operator
 {noformat}
 table src (alias a) is fetched and stored as-is in the MapRedLocalTask. With 
 this patch, the plan can be like the one below.
 {noformat}
   Stage: Stage-3
 Map Reduce
   Alias -> Map Operator Tree:
 b 
   TableScan
 alias: b
 Map Join Operator
   condition map:
Inner Join 0 to 1
   condition expressions:
 0 {key} {value}
 1 
   handleSkewJoin: false
   keys:
 0 [Column[key]]
 1 [Column[key]]
   outputColumnNames: _col0, _col1
   Position of Big Table: 1
   Select Operator
   File Output Operator
   Local Work:
 Map Reduce Local Work
   Alias -> Map Local Tables:
 a 
   Fetch Operator
 limit: -1
   Alias -> Map Local Operator Tree:
 a 
   TableScan
 alias: a
   Has Any Stage Alias: false
   Stage: Stage-0
 Fetch Operator
 {noformat}
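
A rough sketch of the decision implied above: stage a small alias through the local task only if its operator tree actually does some work (a filter or a projection); otherwise let the map join fetch it directly, as in the second plan. The needsStaging check and the operator labels are hypothetical, not the actual Hive planner code.

{code}
import java.util.Arrays;
import java.util.List;

// Illustrative only: decide whether a small alias must be staged through the
// MapRedLocalTask. A bare table scan (no filter, no projection) can be fetched
// directly by the map join.
public class NonStagedMapJoinSketch {
    static boolean needsStaging(List<String> operatorsOnSmallAlias) {
        for (String op : operatorsOnSmallAlias) {
            if (op.equals("FIL") || op.equals("SEL")) { // filter or projection present
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // "select a.* from src a join src b on a.key=b.key": alias a is a bare scan.
        System.out.println(needsStaging(Arrays.asList("TS")));        // false
        System.out.println(needsStaging(Arrays.asList("TS", "FIL"))); // true
    }
}
{code}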



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6209) 'LOAD DATA INPATH ... OVERWRITE ..' doesn't overwrite current data

2014-01-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878685#comment-13878685
 ] 

Hive QA commented on HIVE-6209:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12624268/HIVE-6209.1.patch

{color:green}SUCCESS:{color} +1 4944 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/987/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/987/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12624268

 'LOAD DATA INPATH ... OVERWRITE ..' doesn't overwrite current data
 --

 Key: HIVE-6209
 URL: https://issues.apache.org/jira/browse/HIVE-6209
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Szehon Ho
Assignee: Szehon Ho
 Attachments: HIVE-6209.1.patch, HIVE-6209.patch


 When a user loads data into a table with OVERWRITE using a different file, 
 the existing data is not being overwritten.
 {code}
 $ hdfs dfs -cat /tmp/data
 aaa
 bbb
 ccc
 $ hdfs dfs -cat /tmp/data2
 ddd
 eee
 fff
 $ hive
 hive> create table test (id string);
 hive> load data inpath '/tmp/data' overwrite into table test;
 hive> select * from test;
 aaa
 bbb
 ccc
 hive> load data inpath '/tmp/data2' overwrite into table test;
 hive> select * from test;
 aaa
 bbb
 ccc
 ddd
 eee
 fff
 {code}
 It seems this was broken by HIVE-3756, which added another condition on 
 whether rmr should be run on the old directory, and the removal is skipped in 
 this case.
 There is a workaround of set fs.hdfs.impl.disable.cache=true; 
 which sabotages this condition, but this condition should be removed in the 
 long term.
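
For reference, a minimal sketch of what the expected OVERWRITE semantics amount to at the filesystem level, using the standard Hadoop FileSystem API: remove the table directory's existing contents before moving the new file in. The paths are made up; this is an illustration of the expected behavior, not the Hive loader code.

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Illustration of LOAD DATA ... OVERWRITE at the HDFS level: old contents of
// the table directory are removed before the new file is moved into place.
public class OverwriteLoadSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        Path tableDir = new Path("/user/hive/warehouse/test"); // example path
        Path newData = new Path("/tmp/data2");                 // example path

        if (fs.exists(tableDir)) {
            fs.delete(tableDir, true); // recursive delete of the old contents
        }
        fs.mkdirs(tableDir);
        fs.rename(newData, new Path(tableDir, newData.getName()));
    }
}
{code}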



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-5799) session/operation timeout for hiveserver2

2014-01-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878792#comment-13878792
 ] 

Hive QA commented on HIVE-5799:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12624287/HIVE-5799.6.patch.txt

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 4943 tests executed
*Failed tests:*
{noformat}
org.apache.hcatalog.hbase.snapshot.lock.TestWriteLock.testRun
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/988/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/988/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12624287

 session/operation timeout for hiveserver2
 -

 Key: HIVE-5799
 URL: https://issues.apache.org/jira/browse/HIVE-5799
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-5799.1.patch.txt, HIVE-5799.2.patch.txt, 
 HIVE-5799.3.patch.txt, HIVE-5799.4.patch.txt, HIVE-5799.5.patch.txt, 
 HIVE-5799.6.patch.txt


 Need some timeout facility for preventing resource leaks caused by unstable 
 or bad clients.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-5002) Loosen readRowIndex visibility in ORC's RecordReaderImpl to package private

2014-01-22 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-5002:


   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

I just committed this. Thanks for the review, [~hagleitn]

 Loosen readRowIndex visibility in ORC's RecordReaderImpl to package private
 ---

 Key: HIVE-5002
 URL: https://issues.apache.org/jira/browse/HIVE-5002
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.12.0
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Fix For: 0.13.0

 Attachments: HIVE-5002.2.patch, HIVE-5002.D12015.1.patch, 
 h-5002.patch, h-5002.patch


 Some users want to be able to access the rowIndexes directly from ORC reader 
 extensions.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-5843) Transaction manager for Hive

2014-01-22 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878842#comment-13878842
 ] 

Alan Gates commented on HIVE-5843:
--

bq. Therefore do we see these methods being called in tight loops? It seems 
that the IO in this situation would dominate any unwrapping of objects?
I don't see this in tight loops, but whenever this is turned on we are adding a 
round trip to the metastore on every read query (for lock acquisition) and two 
for inserts (txn open plus lock acquisition).  I'd just like to keep those as 
fast as possible.  I'll do some micro-benchmarking and if I can't show that it 
costs something extra I'll move the single integer values to a struct.  A 
question though.  Can thrift handle adding new parameters to functions?  It 
would seem a natural thing to support given thrift's version independence, but 
I couldn't find anything one way or another in the docs.

bq. Additionally I would not throw an exceptions as they are problematic as well
What's wrong with exceptions in thrift?

 Transaction manager for Hive
 

 Key: HIVE-5843
 URL: https://issues.apache.org/jira/browse/HIVE-5843
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.12.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.13.0

 Attachments: HIVE-5843-src-only.patch, HIVE-5843.2.patch, 
 HIVE-5843.3-src.path, HIVE-5843.3.patch, HIVE-5843.patch, 
 HiveTransactionManagerDetailedDesign (1).pdf


 As part of the ACID work proposed in HIVE-5317 a transaction manager is 
 required.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6209) 'LOAD DATA INPATH ... OVERWRITE ..' doesn't overwrite current data

2014-01-22 Thread Prasad Mujumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878857#comment-13878857
 ] 

Prasad Mujumdar commented on HIVE-6209:
---

Thanks for addressing the suggestions. Looks fine to me.

+1


 'LOAD DATA INPATH ... OVERWRITE ..' doesn't overwrite current data
 --

 Key: HIVE-6209
 URL: https://issues.apache.org/jira/browse/HIVE-6209
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Szehon Ho
Assignee: Szehon Ho
 Attachments: HIVE-6209.1.patch, HIVE-6209.patch


 When a user loads data into a table with OVERWRITE using a different file, 
 the existing data is not being overwritten.
 {code}
 $ hdfs dfs -cat /tmp/data
 aaa
 bbb
 ccc
 $ hdfs dfs -cat /tmp/data2
 ddd
 eee
 fff
 $ hive
 hive> create table test (id string);
 hive> load data inpath '/tmp/data' overwrite into table test;
 hive> select * from test;
 aaa
 bbb
 ccc
 hive> load data inpath '/tmp/data2' overwrite into table test;
 hive> select * from test;
 aaa
 bbb
 ccc
 ddd
 eee
 fff
 {code}
 It seems this was broken by HIVE-3756, which added another condition on 
 whether rmr should be run on the old directory, and the removal is skipped in 
 this case.
 There is a workaround of set fs.hdfs.impl.disable.cache=true; 
 which sabotages this condition, but this condition should be removed in the 
 long term.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-5728) Make ORC InputFormat/OutputFormat usable outside Hive

2014-01-22 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-5728:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

I committed this. Thanks, Daniel!

 Make ORC InputFormat/OutputFormat usable outside Hive
 -

 Key: HIVE-5728
 URL: https://issues.apache.org/jira/browse/HIVE-5728
 Project: Hive
  Issue Type: Improvement
  Components: File Formats
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.13.0

 Attachments: HIVE-5728-1.patch, HIVE-5728-2.patch, HIVE-5728-3.patch, 
 HIVE-5728-4.patch, HIVE-5728-5.patch, HIVE-5728-6.patch, HIVE-5728-7.patch, 
 HIVE-5728-8.patch


 ORC InputFormat/OutputFormat is currently not usable outside Hive. There are 
 several issues that need to be solved:
 1. Several classes are not public, e.g. OrcStruct
 2. There is no InputFormat/OutputFormat for the new API (some tools such as 
 Pig need the new API)
 3. There is no way to push WriteOption to the OutputFormat outside Hive



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-4367) enhance TRUNCATE syntex to drop data of external table

2014-01-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878877#comment-13878877
 ] 

Hive QA commented on HIVE-4367:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12624284/HIVE-4367.2.patch.txt

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 4943 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_truncate_table_failure1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_truncate_table_failure3
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_truncate_table_failure4
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/989/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/989/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12624284

 enhance  TRUNCATE syntex  to drop data of external table
 

 Key: HIVE-4367
 URL: https://issues.apache.org/jira/browse/HIVE-4367
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: caofangkun
Assignee: caofangkun
Priority: Minor
 Attachments: HIVE-4367-1.patch, HIVE-4367.2.patch.txt


 In my use case, I sometimes have to remove data of external tables to free up 
 storage space on the cluster.
 So it is necessary to enhance the syntax, e.g.
 TRUNCATE TABLE srcpart_truncate PARTITION (dt='201130412') FORCE;
 to remove data from an EXTERNAL table.
 And I add a configuration property to enable removing data to the Trash:
 <property>
   <name>hive.truncate.skiptrash</name>
   <value>false</value>
   <description>
     if true will remove data to trash, else false drop data immediately
   </description>
 </property>
 For example:
 hive (default)> TRUNCATE TABLE external1 partition (ds='11'); 
 FAILED: Error in semantic analysis: Cannot truncate non-managed table 
 external1
 hive (default)> TRUNCATE TABLE external1 partition (ds='11') FORCE;
 [2013-04-16 17:15:52]: Compile Start 
 [2013-04-16 17:15:52]: Compile End
 [2013-04-16 17:15:52]: OK
 [2013-04-16 17:15:52]: Time taken: 0.413 seconds
 hive (default)> set hive.truncate.skiptrash;
 hive.truncate.skiptrash=false
 hive (default)> set hive.truncate.skiptrash=true; 
 hive (default)> TRUNCATE TABLE external1 partition (ds='12') FORCE;
 [2013-04-16 17:16:21]: Compile Start 
 [2013-04-16 17:16:21]: Compile End
 [2013-04-16 17:16:21]: OK
 [2013-04-16 17:16:21]: Time taken: 0.143 seconds
 hive (default)> dfs -ls /user/test/.Trash/Current/; 
 Found 1 items
 drwxr-xr-x   - test supergroup  0 2013-04-16 17:06 /user/test/.Trash/Current/ds=11



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-5155) Support secure proxy user access to HiveServer2

2014-01-22 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878955#comment-13878955
 ] 

Owen O'Malley commented on HIVE-5155:
-

Since this patch is adding proxy users for HiveServer2, it really should 
include the same ability to limit the authority of the proxy users that the 
other Hadoop tools have. To make it consistent with Hadoop, the configuration 
would look like:

{code}
 <property>
   <name>hive.server2.proxyuser.HS.hosts</name>
   <value>host1,host2</value>
 </property>
 <property>
   <name>hive.server2.proxyuser.HS.groups</name>
   <value>group1,group2</value>
 </property>
{code}

which configures HS as a hive server2 proxy user and limits it to working on a 
specified set of hosts (or * for all) and impersonating a specified group of 
users (or * for all).

 Support secure proxy user access to HiveServer2
 ---

 Key: HIVE-5155
 URL: https://issues.apache.org/jira/browse/HIVE-5155
 Project: Hive
  Issue Type: Improvement
  Components: Authentication, HiveServer2, JDBC
Affects Versions: 0.12.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Attachments: HIVE-5155-1-nothrift.patch, HIVE-5155-noThrift.2.patch, 
 HIVE-5155-noThrift.4.patch, HIVE-5155-noThrift.5.patch, 
 HIVE-5155-noThrift.6.patch, HIVE-5155.1.patch, HIVE-5155.2.patch, 
 HIVE-5155.3.patch, ProxyAuth.java, ProxyAuth.out, TestKERBEROS_Hive_JDBC.java


 HiveServer2 can authenticate a client via Kerberos and impersonate the 
 connecting user on the underlying secure Hadoop cluster. This becomes a 
 gateway for a remote client to access a secure Hadoop cluster. This works 
 fine when the client obtains a Kerberos ticket and connects directly to 
 HiveServer2. There's another big use case for middleware tools where the end 
 user wants to access Hive via another server, for example an Oozie action, 
 Hue submitting queries, or a BI tool server accessing HiveServer2. In these 
 cases, the third-party server doesn't have the end user's Kerberos 
 credentials and hence can't submit queries to HiveServer2 on behalf of the 
 end user.
 This ticket is for enabling proxy access to HiveServer2 for third party tools 
 on behalf of end users. There are two parts of the solution proposed in this 
 ticket:
 1) Delegation token based connection for Oozie (OOZIE-1457)
 This is the common mechanism for Hadoop ecosystem components. Hive Remote 
 Metastore and HCatalog already support this. This is suitable for tool like 
 Oozie that submits the MR jobs as actions on behalf of its client. Oozie 
 already uses similar mechanism for Metastore/HCatalog access.
 2) Direct proxy access for privileged hadoop users
 The delegation token implementation can be a challenge for non-hadoop 
 (especially non-java) components. This second part enables a privileged user 
 to directly specify an alternate session user during the connection. If the 
 connecting user has hadoop level privilege to impersonate the requested 
 userid, then HiveServer2 will run the session as that requested user. For 
 example, user Hue is allowed to impersonate user Bob (via core-site.xml proxy 
 user configuration). Then user Hue can connect to HiveServer2 and specify Bob 
 as session user via a session property. HiveServer2 will verify Hue's proxy 
 user privilege and then impersonate user Bob instead of Hue. This will enable 
 any third party tool to impersonate alternate userid without having to 
 implement delegation token connection.
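
For the second approach, a client-side sketch of what a proxy connection could look like over JDBC. The session property name (hive.server2.proxy.user), the host, and the principal are assumptions for illustration; the real property is whatever the patch under review defines.

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Sketch of a privileged user ("hue") opening a HiveServer2 session that runs
// as another user ("bob"). The proxy-user session property below is an
// assumption for illustration; see the patch for the actual property name.
public class ProxyUserJdbcSketch {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        String url = "jdbc:hive2://hs2-host:10000/default;principal=hive/_HOST@EXAMPLE.COM"
                + ";hive.server2.proxy.user=bob"; // hypothetical session property
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("select 1")) {
            while (rs.next()) {
                System.out.println(rs.getInt(1)); // session should run as bob, not hue
            }
        }
    }
}
{code}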



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6181) support grant/revoke on views - parser changes

2014-01-22 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6181:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Committed to trunk.

 support grant/revoke on views - parser changes
 --

 Key: HIVE-6181
 URL: https://issues.apache.org/jira/browse/HIVE-6181
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Ashutosh Chauhan
 Fix For: 0.13.0

 Attachments: HIVE-6181.patch

   Original Estimate: 24h
  Remaining Estimate: 24h

 Support grant/revoke statements on views. Includes parser changes.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-5843) Transaction manager for Hive

2014-01-22 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878974#comment-13878974
 ] 

Brock Noland commented on HIVE-5843:


bq. I'll do some micro-benchmarking and if I can't show that it costs something 
extra I'll move the single integer values to a struct.

Awesome!

bq. Can thrift handle adding new parameters to functions? It would seem a 
natural thing to support given thrift's version independence, but I couldn't 
find anything one way or another in the docs.

Agreed, you would think that it would be a priority...but AFAIK that is an 
issue. Thus the use of a single Request object per method call.

bq. What's wrong with exceptions in thrift?

I suppose the method signatures throw TException which is the parent class of 
all generated thrift exceptions?  If so, then I don't suppose there is an issue.
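
A small Java sketch of the "single Request object per method call" pattern mentioned above, which is how Thrift-style APIs usually stay evolvable: new optional fields can be added to the request without changing the method signature. The names (LockRequest, LockService, acquireLock) are made up for illustration and are not the actual metastore API.

{code}
// Illustrative only: a request object lets the API grow new optional fields
// (e.g. a transaction id added later) without changing the method signature,
// which addresses the concern above about adding parameters to thrift functions.
class LockRequest {
    private final String user;
    private Long txnId; // added later as an optional field; old callers are unaffected

    LockRequest(String user) { this.user = user; }

    LockRequest withTxnId(long txnId) { this.txnId = txnId; return this; }

    String getUser() { return user; }
    Long getTxnId() { return txnId; }
}

interface LockService {
    // The signature stays stable even as LockRequest gains fields over time.
    long acquireLock(LockRequest request);
}
{code}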

 Transaction manager for Hive
 

 Key: HIVE-5843
 URL: https://issues.apache.org/jira/browse/HIVE-5843
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.12.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.13.0

 Attachments: HIVE-5843-src-only.patch, HIVE-5843.2.patch, 
 HIVE-5843.3-src.path, HIVE-5843.3.patch, HIVE-5843.patch, 
 HiveTransactionManagerDetailedDesign (1).pdf


 As part of the ACID work proposed in HIVE-5317 a transaction manager is 
 required.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6183) Implement vectorized type cast from/to decimal(p, s)

2014-01-22 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6183:
--

Attachment: HIVE-6183.09.patch

Addressed all Jitendra's comments.

 Implement vectorized type cast from/to decimal(p, s)
 

 Key: HIVE-6183
 URL: https://issues.apache.org/jira/browse/HIVE-6183
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.13.0
Reporter: Eric Hanson
Assignee: Eric Hanson
 Attachments: HIVE-6183.07.patch, HIVE-6183.08.patch, 
 HIVE-6183.09.patch


 Add support for all the supported type casts to/from decimal(p,s) in 
 vectorized mode.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6183) Implement vectorized type cast from/to decimal(p, s)

2014-01-22 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878985#comment-13878985
 ] 

Eric Hanson commented on HIVE-6183:
---

https://reviews.apache.org/r/17194/

 Implement vectorized type cast from/to decimal(p, s)
 

 Key: HIVE-6183
 URL: https://issues.apache.org/jira/browse/HIVE-6183
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.13.0
Reporter: Eric Hanson
Assignee: Eric Hanson
 Attachments: HIVE-6183.07.patch, HIVE-6183.08.patch, 
 HIVE-6183.09.patch


 Add support for all the supported type casts to/from decimal(p,s) in 
 vectorized mode.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-5728) Make ORC InputFormat/OutputFormat usable outside Hive

2014-01-22 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13879006#comment-13879006
 ] 

Daniel Dai commented on HIVE-5728:
--

Thanks [~owen.omalley]!

 Make ORC InputFormat/OutputFormat usable outside Hive
 -

 Key: HIVE-5728
 URL: https://issues.apache.org/jira/browse/HIVE-5728
 Project: Hive
  Issue Type: Improvement
  Components: File Formats
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.13.0

 Attachments: HIVE-5728-1.patch, HIVE-5728-2.patch, HIVE-5728-3.patch, 
 HIVE-5728-4.patch, HIVE-5728-5.patch, HIVE-5728-6.patch, HIVE-5728-7.patch, 
 HIVE-5728-8.patch


 ORC InputFormat/OutputFormat is currently not usable outside Hive. There are 
 several issues that need to be solved:
 1. Several classes are not public, e.g. OrcStruct
 2. There is no InputFormat/OutputFormat for the new API (some tools such as 
 Pig need the new API)
 3. There is no way to push WriteOption to the OutputFormat outside Hive



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6259) Support truncate for non-native tables

2014-01-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13879042#comment-13879042
 ] 

Hive QA commented on HIVE-6259:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12624285/HIVE-6259.1.patch.txt

{color:red}ERROR:{color} -1 due to 23 failed/errored test(s), 4948 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_union
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_date2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_truncate_column
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_truncate_column_merge
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_truncate_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_top_level
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucket_num_reducers
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_truncate_column_buckets
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_truncate_bucketed_column
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_truncate_column_indexed_table
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_truncate_column_list_bucketing
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_truncate_column_seqfile
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_truncate_nonexistant_column
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_truncate_partition_column
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_truncate_partition_column2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_truncate_table_failure1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_truncate_table_failure2
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/990/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/990/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 23 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12624285

 Support truncate for non-native tables
 --

 Key: HIVE-6259
 URL: https://issues.apache.org/jira/browse/HIVE-6259
 Project: Hive
  Issue Type: Bug
  Components: StorageHandler
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-6259.1.patch.txt


 Tables on HBase might be truncated by a similar method in the HBase shell.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6068) HiveServer2 client on windows does not handle the non-ascii characters properly

2014-01-22 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-6068:
---

Summary: HiveServer2 client on windows does not handle the non-ascii 
characters properly  (was: HiveServer2 beeline client on windows does not 
handle the non-ascii characters properly)

 HiveServer2 client on windows does not handle the non-ascii characters 
 properly
 ---

 Key: HIVE-6068
 URL: https://issues.apache.org/jira/browse/HIVE-6068
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, JDBC
Affects Versions: 0.13.0
 Environment: Windows 
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0


 When running a select query against a table which contains rows with 
 non-ASCII characters, the HiveServer2 Beeline client returns them 
 incorrectly. Example:
 {noformat}
 738;Garçu, Le (1995);Drama
 741;Ghost in the Shell (Kôkaku kidôtai) (1995);Animation|Sci-Fi
 {noformat}
 come out from a HiveServer2 beeline client as:
 {noformat}
 '738' 'Gar?u, Le (1995)'  'Drama'
 '741' 'Ghost in the Shell (K?kaku kid?tai) (1995)''Animation|Sci-Fi'
 {noformat}
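
The usual mechanism behind this kind of '?' substitution is a character-set round trip through an encoding that cannot represent the character. The snippet below only demonstrates that general effect with java.nio charsets; it is not a claim about where exactly the HiveServer2 client loses the characters.

{code}
import java.nio.charset.StandardCharsets;

// Demonstrates how a lossy charset round trip turns non-ASCII characters into
// '?', the symptom shown above. General illustration, not the confirmed root
// cause in the HiveServer2 client.
public class CharsetRoundTripSketch {
    public static void main(String[] args) {
        String title = "Ghost in the Shell (Kôkaku kidôtai) (1995)";

        // Encoding to a charset without 'ô' replaces it with '?'.
        byte[] ascii = title.getBytes(StandardCharsets.US_ASCII);
        System.out.println(new String(ascii, StandardCharsets.US_ASCII));
        // -> Ghost in the Shell (K?kaku kid?tai) (1995)

        // Keeping UTF-8 on both ends preserves the characters.
        byte[] utf8 = title.getBytes(StandardCharsets.UTF_8);
        System.out.println(new String(utf8, StandardCharsets.UTF_8));
    }
}
{code}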



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HIVE-6264) Unbalanced number of HiveParser msgs.push/msgs.pop calls when doing lookahead

2014-01-22 Thread Jason Dere (JIRA)
Jason Dere created HIVE-6264:


 Summary: Unbalanced number of HiveParser msgs.push/msgs.pop calls 
when doing lookahead
 Key: HIVE-6264
 URL: https://issues.apache.org/jira/browse/HIVE-6264
 Project: Hive
  Issue Type: Bug
Reporter: Jason Dere
Assignee: Jason Dere


HiveParser pushes/pops messages describing the current parse rule like so:
{noformat}
joinSource
@init { gParent.msgs.push("join source"); }
@after { gParent.msgs.pop(); }
...
{noformat}

The ANTLR generated code for the init/after actions looks like this:

{noformat}
 gParent.msgs.push("join source"); 
...
if ( state.backtracking==0 ) { gParent.msgs.pop(); }
{noformat}

If we have a parse rule that does some lookahead, the message is always pushed 
onto the message stack since the init action has no check of 
state.backtracking.  But that message is never popped because the after action 
does check state.backtracking. As a result there can be a bunch of parser 
context messages added to the stack which are never taken off.
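
A sketch of the idea behind the pushMsg()/popMsg() helpers proposed later in this thread: guard both the push and the pop with the backtracking depth so the message stack stays balanced during lookahead. The class below is a simplified stand-in for the generated parser, not the literal patch.

{code}
import java.util.ArrayDeque;
import java.util.Deque;

// Simplified stand-in for the parser's message stack: push and pop are both
// guarded by the backtracking depth, so they stay balanced during lookahead.
class ParseMsgSketch {
    private final Deque<String> msgs = new ArrayDeque<String>();
    private int backtracking = 0; // stand-in for state.backtracking

    void pushMsg(String msg) {
        if (backtracking == 0) { // same guard the generated @after action uses
            msgs.push(msg);
        }
    }

    void popMsg() {
        if (backtracking == 0) {
            msgs.pop();
        }
    }
}
{code}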




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6264) Unbalanced number of HiveParser msgs.push/msgs.pop calls when doing lookahead

2014-01-22 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13879050#comment-13879050
 ] 

Jason Dere commented on HIVE-6264:
--

Ran into this while trying some parser changes.  Thanks to [~rhbutani] for 
finding the issue.

 Unbalanced number of HiveParser msgs.push/msgs.pop calls when doing lookahead
 -

 Key: HIVE-6264
 URL: https://issues.apache.org/jira/browse/HIVE-6264
 Project: Hive
  Issue Type: Bug
Reporter: Jason Dere
Assignee: Jason Dere

 HiveParser pushes/pops messages describing the current parse rule like so:
 {noformat}
 joinSource
 @init { gParent.msgs.push("join source"); }
 @after { gParent.msgs.pop(); }
 ...
 {noformat}
 The ANTLR generated code for the init/after actions looks like this:
 {noformat}
  gParent.msgs.push("join source"); 
 ...
 if ( state.backtracking==0 ) { gParent.msgs.pop(); }
 {noformat}
 If we have a parse rule that does some lookahead, the message is always 
 pushed onto the message stack since the init action has no check of 
 state.backtracking.  But that message is never popped because the after 
 action does check state.backtracking. As a result there can be a bunch of 
 parser context messages added to the stack which are never taken off.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6264) Unbalanced number of HiveParser msgs.push/msgs.pop calls when doing lookahead

2014-01-22 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-6264:
-

Attachment: HIVE-6264.1.patch

Patch v1. Add new pushMsg()/popMsg() methods in HiveParser which will check 
state.backtracking.

 Unbalanced number of HiveParser msgs.push/msgs.pop calls when doing lookahead
 -

 Key: HIVE-6264
 URL: https://issues.apache.org/jira/browse/HIVE-6264
 Project: Hive
  Issue Type: Bug
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-6264.1.patch


 HiveParser pushes/pops messages describing the current parse rule like so:
 {noformat}
 joinSource
 @init { gParent.msgs.push("join source"); }
 @after { gParent.msgs.pop(); }
 ...
 {noformat}
 The ANTLR generated code for the init/after actions looks like this:
 {noformat}
  gParent.msgs.push("join source"); 
 ...
 if ( state.backtracking==0 ) { gParent.msgs.pop(); }
 {noformat}
 If we have a parse rule that does some lookahead, the message is always 
 pushed onto the message stack since the init action has no check of 
 state.backtracking.  But that message is never popped because the after 
 action does check state.backtracking. As a result there can be a bunch of 
 parser context messages added to the stack which are never taken off.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6264) Unbalanced number of HiveParser msgs.push/msgs.pop calls when doing lookahead

2014-01-22 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-6264:
-

Status: Patch Available  (was: Open)

 Unbalanced number of HiveParser msgs.push/msgs.pop calls when doing lookahead
 -

 Key: HIVE-6264
 URL: https://issues.apache.org/jira/browse/HIVE-6264
 Project: Hive
  Issue Type: Bug
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-6264.1.patch


 HiveParser pushes/pops messages describing the current parse rule like so:
 {noformat}
 joinSource
 @init { gParent.msgs.push(join source); }
 @after { gParent.msgs.pop(); }
 ...
 {noformat}
 The ANTLR generated code for the init/after actions looks like this:
 {noformat}
  gParent.msgs.push(join source); 
 ...
 if ( state.backtracking==0 ) { gParent.msgs.pop(); }
 {noformat}
 If we have a parse rule that does some lookahead, the message is always 
 pushed onto the message stack since the init action has no check of 
 state.backtracking.  But that message is never popped because the after 
 action does check state.backtracking. As a result there can be a bunch of 
 parser context messages added to the stack which are never taken off.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Review Request 17200: HIVE-6264: Unbalanced number of HiveParser msgs.push/msgs.pop calls when doing lookahead

2014-01-22 Thread Jason Dere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17200/
---

Review request for hive.


Bugs: HIVE-6264
https://issues.apache.org/jira/browse/HIVE-6264


Repository: hive-git


Description
---

What looks like an ANTLR bug causes @after actions to be wrapped in a backtracking 
check, but the @init action does not have such a check. This results in an 
unbalanced number of push/pop calls to the parser message stack.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/FromClauseParser.g 2adefcb 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g c15c4b5 
  ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 4147503 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SelectClauseParser.g f4de252 

Diff: https://reviews.apache.org/r/17200/diff/


Testing
---


Thanks,

Jason Dere



[jira] [Commented] (HIVE-6264) Unbalanced number of HiveParser msgs.push/msgs.pop calls when doing lookahead

2014-01-22 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13879079#comment-13879079
 ] 

Jason Dere commented on HIVE-6264:
--

https://reviews.apache.org/r/17200/

 Unbalanced number of HiveParser msgs.push/msgs.pop calls when doing lookahead
 -

 Key: HIVE-6264
 URL: https://issues.apache.org/jira/browse/HIVE-6264
 Project: Hive
  Issue Type: Bug
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-6264.1.patch


 HiveParser pushes/pops messages describing the current parse rule like so:
 {noformat}
 joinSource
 @init { gParent.msgs.push("join source"); }
 @after { gParent.msgs.pop(); }
 ...
 {noformat}
 The ANTLR generated code for the init/after actions looks like this:
 {noformat}
  gParent.msgs.push("join source"); 
 ...
 if ( state.backtracking==0 ) { gParent.msgs.pop(); }
 {noformat}
 If we have a parse rule that does some lookahead, the message is always 
 pushed onto the message stack since the init action has no check of 
 state.backtracking.  But that message is never popped because the after 
 action does check state.backtracking. As a result there can be a bunch of 
 parser context messages added to the stack which are never taken off.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 17163: HIVE-5929 - SQL std auth - Access control statement updates

2014-01-22 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17163/#review32517
---



ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAuthorizationPluginException.java
https://reviews.apache.org/r/17163/#comment61384

If we make it extend HiveException then we can avoid all those try-catches 
which catch this exception and rethrow HiveException.



ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAuthorizationValidator.java
https://reviews.apache.org/r/17163/#comment61385

This can be removed.



ql/src/test/queries/clientpositive/authorization_1_sql_std.q
https://reviews.apache.org/r/17163/#comment61392

If you don't intend to have this in the test case, it's better to just delete it.



ql/src/test/queries/clientpositive/authorization_1_sql_std.q
https://reviews.apache.org/r/17163/#comment61393

Same as above.


- Ashutosh Chauhan


On Jan. 22, 2014, 12:17 a.m., Thejas Nair wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/17163/
 ---
 
 (Updated Jan. 22, 2014, 12:17 a.m.)
 
 
 Review request for hive and Ashutosh Chauhan.
 
 
 Bugs: HIVE-5929
 https://issues.apache.org/jira/browse/HIVE-5929
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Subtask for sql standard based auth, for performing the updates to metastore 
 from newly supported access  control statements .
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/Driver.java bd95161 
   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 92ed55b 
   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveUtils.java c65bf28 
   
 ql/src/java/org/apache/hadoop/hive/ql/parse/authorization/HiveAuthorizationTaskFactoryImpl.java
  c41cd0f 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/AuthorizationUtils.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/DefaultHiveAuthorizerFactory.java
  7470e9d 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAccessController.java
  8e4114f 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAuthorizationPluginException.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAuthorizationValidator.java
  63046f5 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAuthorizer.java
  c10a2ac 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAuthorizerImpl.java
  ca95bfc 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HivePrivilegeInfo.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAccessController.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAuthorizationValidator.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAuthorizerFactory.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 08f7fec 
   ql/src/test/queries/clientpositive/authorization_1_sql_std.q PRE-CREATION 
   ql/src/test/queries/clientpositive/authorization_role_grant1.q 1a375a5 
   ql/src/test/results/clientpositive/authorization_1_sql_std.q.out 
 PRE-CREATION 
   ql/src/test/results/clientpositive/authorization_role_grant1.q.out 496687c 
 
 Diff: https://reviews.apache.org/r/17163/diff/
 
 
 Testing
 ---
 
 .q tests included.
 
 
 Thanks,
 
 Thejas Nair
 




[jira] [Commented] (HIVE-5929) SQL std auth - Access control statement updates

2014-01-22 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13879104#comment-13879104
 ] 

Ashutosh Chauhan commented on HIVE-5929:


Left some comments on RB. Also, you may want to mark it Patch Available to get 
Hive QA run on it.

 SQL std auth - Access control statement updates
 ---

 Key: HIVE-5929
 URL: https://issues.apache.org/jira/browse/HIVE-5929
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
 Attachments: HIVE-5929.1.patch, HIVE-5929.2.patch

   Original Estimate: 48h
  Remaining Estimate: 48h

 Subtask for sql standard based auth, for performing the updates to metastore 
 from newly supported access  control statements .



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-5929) SQL std auth - Access control statement updates

2014-01-22 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-5929:


Status: Patch Available  (was: Open)

 SQL std auth - Access control statement updates
 ---

 Key: HIVE-5929
 URL: https://issues.apache.org/jira/browse/HIVE-5929
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
 Attachments: HIVE-5929.1.patch, HIVE-5929.2.patch

   Original Estimate: 48h
  Remaining Estimate: 48h

 Subtask for sql standard based auth, for performing the updates to metastore 
 from newly supported access  control statements .



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 17163: HIVE-5929 - SQL std auth - Access control statement updates

2014-01-22 Thread Thejas Nair


 On Jan. 22, 2014, 7:58 p.m., Ashutosh Chauhan wrote:
  ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAuthorizationPluginException.java,
   line 27
  https://reviews.apache.org/r/17163/diff/2/?file=433855#file433855line27
 
  If we make it extend HiveException then we can avoid all those 
  try-catches which catch this exception and rethrow HiveException.

If we extend HiveException, HiveException has to become a public API, and so 
does ErrorMsg. It didn't seem like a good idea to make ErrorMsg public. Let me 
know if you think otherwise.


 On Jan. 22, 2014, 7:58 p.m., Ashutosh Chauhan wrote:
  ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAuthorizationValidator.java,
   line 32
  https://reviews.apache.org/r/17163/diff/2/?file=433861#file433861line32
 
  This can be removed.

fixing


 On Jan. 22, 2014, 7:58 p.m., Ashutosh Chauhan wrote:
  ql/src/test/queries/clientpositive/authorization_1_sql_std.q, line 13
  https://reviews.apache.org/r/17163/diff/2/?file=433864#file433864line13
 
  If you don't intend to have this in the test case, it's better to just delete it.

fixing


 On Jan. 22, 2014, 7:58 p.m., Ashutosh Chauhan wrote:
  ql/src/test/queries/clientpositive/authorization_1_sql_std.q, line 27
  https://reviews.apache.org/r/17163/diff/2/?file=433864#file433864line27
 
  Same as above.

fixing


- Thejas


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17163/#review32517
---


On Jan. 22, 2014, 12:17 a.m., Thejas Nair wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/17163/
 ---
 
 (Updated Jan. 22, 2014, 12:17 a.m.)
 
 
 Review request for hive and Ashutosh Chauhan.
 
 
 Bugs: HIVE-5929
 https://issues.apache.org/jira/browse/HIVE-5929
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Subtask for sql standard based auth, for performing the updates to metastore 
 from newly supported access  control statements .
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/Driver.java bd95161 
   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 92ed55b 
   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveUtils.java c65bf28 
   
 ql/src/java/org/apache/hadoop/hive/ql/parse/authorization/HiveAuthorizationTaskFactoryImpl.java
  c41cd0f 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/AuthorizationUtils.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/DefaultHiveAuthorizerFactory.java
  7470e9d 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAccessController.java
  8e4114f 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAuthorizationPluginException.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAuthorizationValidator.java
  63046f5 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAuthorizer.java
  c10a2ac 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAuthorizerImpl.java
  ca95bfc 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HivePrivilegeInfo.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAccessController.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAuthorizationValidator.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAuthorizerFactory.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 08f7fec 
   ql/src/test/queries/clientpositive/authorization_1_sql_std.q PRE-CREATION 
   ql/src/test/queries/clientpositive/authorization_role_grant1.q 1a375a5 
   ql/src/test/results/clientpositive/authorization_1_sql_std.q.out 
 PRE-CREATION 
   ql/src/test/results/clientpositive/authorization_role_grant1.q.out 496687c 
 
 Diff: https://reviews.apache.org/r/17163/diff/
 
 
 Testing
 ---
 
 .q tests included.
 
 
 Thanks,
 
 Thejas Nair
 




[jira] [Updated] (HIVE-5929) SQL std auth - Access control statement updates

2014-01-22 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-5929:


Attachment: HIVE-5929.3.patch

 SQL std auth - Access control statement updates
 ---

 Key: HIVE-5929
 URL: https://issues.apache.org/jira/browse/HIVE-5929
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
 Attachments: HIVE-5929.1.patch, HIVE-5929.2.patch, HIVE-5929.3.patch

   Original Estimate: 48h
  Remaining Estimate: 48h

 Subtask for sql standard based auth, for performing the updates to metastore 
 from newly supported access  control statements .



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6122) Implement show grant on resource

2014-01-22 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13879125#comment-13879125
 ] 

Ashutosh Chauhan commented on HIVE-6122:


Are you concerned about backward compat? That we show all privileges for a user 
with {{ show grant user hive_test_user }}, as opposed to only user-level privileges 
earlier? In HIVE-5928 a new plugin API is implemented, so we can make the backward 
incompatible change on the new API, instead of modifying the behavior of the existing 
syntax of the old API. Also, HIVE-5929 has an impl of that API.
In other words, {{ show grant user hive_test_user }} will show all privileges for the 
user only with {{ set 
hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory
 }}; otherwise the old behavior applies.

 Implement show grant on resource
 --

 Key: HIVE-6122
 URL: https://issues.apache.org/jira/browse/HIVE-6122
 Project: Hive
  Issue Type: Improvement
  Components: Authorization
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-6122.1.patch.txt, HIVE-6122.2.patch.txt, 
 HIVE-6122.3.patch.txt


 Currently, Hive shows privileges owned by a principal. A reverse API is also 
 needed, which shows all principals for a resource. 
 {noformat}
 show grant user hive_test_user on database default;
 show grant user hive_test_user on table dummy;
 show grant user hive_test_user on all;
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 17163: HIVE-5929 - SQL std auth - Access control statement updates

2014-01-22 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17163/
---

(Updated Jan. 22, 2014, 8:19 p.m.)


Review request for hive and Ashutosh Chauhan.


Bugs: HIVE-5929
https://issues.apache.org/jira/browse/HIVE-5929


Repository: hive-git


Description
---

Subtask for sql standard based auth, for performing the updates to metastore 
from newly supported access  control statements .


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java bd95161 
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 92ed55b 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveUtils.java c65bf28 
  
ql/src/java/org/apache/hadoop/hive/ql/parse/authorization/HiveAuthorizationTaskFactoryImpl.java
 c41cd0f 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/AuthorizationUtils.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/DefaultHiveAuthorizerFactory.java
 7470e9d 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAccessController.java
 8e4114f 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAuthorizationPluginException.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAuthorizationValidator.java
 63046f5 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAuthorizer.java
 c10a2ac 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAuthorizerImpl.java
 ca95bfc 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HivePrivilegeInfo.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAccessController.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAuthorizationValidator.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAuthorizerFactory.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 08f7fec 
  ql/src/test/queries/clientpositive/authorization_1_sql_std.q PRE-CREATION 
  ql/src/test/queries/clientpositive/authorization_role_grant1.q 1a375a5 
  ql/src/test/results/clientpositive/authorization_1_sql_std.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/authorization_role_grant1.q.out 496687c 

Diff: https://reviews.apache.org/r/17163/diff/


Testing
---

.q tests included.


Thanks,

Thejas Nair



[jira] [Commented] (HIVE-6260) Compress plan when sending via RPC (Tez)

2014-01-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13879159#comment-13879159
 ] 

Hive QA commented on HIVE-6260:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12624290/HIVE-6260.1.patch

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 4949 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_union
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/992/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/992/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12624290

 Compress plan when sending via RPC (Tez)
 

 Key: HIVE-6260
 URL: https://issues.apache.org/jira/browse/HIVE-6260
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-6260.1.patch


 When trying to send plan via RPC it's helpful to compress the payload. That 
 way more potential plans can be sent (size limit).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6137) Hive should report that the file/path doesn’t exist when it doesn’t (it now reports SocketTimeoutException)

2014-01-22 Thread Shuaishuai Nie (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13879193#comment-13879193
 ] 

Shuaishuai Nie commented on HIVE-6137:
--

Agree with [~thejas]. In this scenario, this fix may produce a misleading error 
message. I think the right fix should be to find the location where Hive checks 
the existence of the external table location and pass a FileNotFound exception 
from there. In addition, does this problem also exist for external partitions? 
If so, we should keep the behavior consistent in both cases.

 Hive should report that the file/path doesn’t exist when it doesn’t (it now 
 reports SocketTimeoutException)
 ---

 Key: HIVE-6137
 URL: https://issues.apache.org/jira/browse/HIVE-6137
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-6137.1.patch


 Hive should report that the file/path doesn’t exist when it doesn’t (it now 
 reports SocketTimeoutException):
 Execute a Hive DDL query with a reference to a non-existent blob (such as 
 CREATE EXTERNAL TABLE...) and check Hive logs (stderr):
 FAILED: Error in metadata: org.apache.thrift.transport.TTransportException: 
 java.net.SocketTimeoutException: Read timed out
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask
 This error message is not intuitive. If a file doesn't exist, Hive should 
 report FileNotFoundException



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6260) Compress plan when sending via RPC (Tez)

2014-01-22 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13879206#comment-13879206
 ] 

Gunther Hagleitner commented on HIVE-6260:
--

Failures are unrelated. That seems to be a problem with the build. Tests failed 
on precommit runs of other jiras, and they also fail locally with or without the 
patch.

 Compress plan when sending via RPC (Tez)
 

 Key: HIVE-6260
 URL: https://issues.apache.org/jira/browse/HIVE-6260
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-6260.1.patch


 When trying to send plan via RPC it's helpful to compress the payload. That 
 way more potential plans can be sent (size limit).
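The general idea is just to shrink the already-serialized plan bytes before handing them to the RPC layer. A minimal sketch under assumptions (gzip as the codec, a byte[] plan, and the class/method names here are all illustrative, not the actual HIVE-6260 change):

{code}
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;

// Sketch only: the real patch may use a different codec and plumbing.
public final class PlanCompressor {
  static byte[] compress(byte[] serializedPlan) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    GZIPOutputStream gzip = new GZIPOutputStream(bos);
    try {
      gzip.write(serializedPlan);   // compress the serialized (e.g. kryo) plan bytes
    } finally {
      gzip.close();                 // flushes and finishes the gzip stream
    }
    return bos.toByteArray();       // smaller payload is more likely to fit the RPC size limit
  }
}
{code}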



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6261) Update metadata.q.out file for tez (after change to .q file)

2014-01-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13879269#comment-13879269
 ] 

Hive QA commented on HIVE-6261:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12624293/HIVE-6261.1.patch

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 4949 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_union
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/993/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/993/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12624293

 Update metadata.q.out file for tez (after change to .q file)
 

 Key: HIVE-6261
 URL: https://issues.apache.org/jira/browse/HIVE-6261
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-6261.1.patch






--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 16951: HIVE-6109: Support customized location for EXTERNAL tables created by Dynamic Partitioning

2014-01-22 Thread Sushanth Sowmyan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16951/#review32561
---


Hi, sorry for the delay. I thought I'd published this review over the weekend, 
but Review Board was unresponsive and it looks like it didn't save.


hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputCommitterContainer.java
https://reviews.apache.org/r/16951/#comment61449

_DYN is already defined in FosterStorageHandler; it needs to be defined in one 
place. I'm okay with it being defined here if the FosterStorageHandler constant 
is removed and references to it are changed to reference this one.



hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputCommitterContainer.java
https://reviews.apache.org/r/16951/#comment61451

Whitespace errors - git reports a bunch of these throughout the patch when 
we try to apply it; please correct them for the final patch upload.



hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputCommitterContainer.java
https://reviews.apache.org/r/16951/#comment61453

A bit about code readability - if we add a special case, it makes sense to add 
the special case as an else, rather than as an if - that way, the default 
behaviour is visible first, and then the special case. Please swap this around 
so that this is an if (!customDynamicLocationUsed) structure.



hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputCommitterContainer.java
https://reviews.apache.org/r/16951/#comment61452

This is now a significant amount of code repetition from lines 720-741 above; 
please see if we can refactor this into a separate method.



hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/OutputJobInfo.java
https://reviews.apache.org/r/16951/#comment61455

This becomes the primary API point with this change, wherein a user that 
is using HCatOutputFormat will generate an OutputJobInfo and then call 
setCustomDynamicLocation on it. This is fine for M/R users of HCat, but is 
something that will wind up having to be implemented for each M/R user. It 
might have been better to define a constant in HCatConstants, say 
"hcat.dynamic.partitioning.custom.pattern", and to use that as a JobInfo 
parameter. That makes it easier for other tools to integrate with this feature. 
For example, with your patch, we still do not support the ability for the 
HCatStorer from pig to be able to write to custom dynamic partitions, while we 
do want to keep feature parity where possible between HCatOutputFormat and 
HCatStorer.

In fact, as a design goal for HCat, we're trying to move away from 
letting (requiring) users explicitly muck around with OutputJobInfo and 
InputJobInfo, and stick to static calls to HCatInputFormat/HCatOutputFormat.

I would like to see this call be something HCatOutputFormat automatically 
makes if a jobConf parameter (as above) is set. That way, we can solve Pig 
compatibility easily as well.
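A rough sketch of that suggestion (the property name is the one proposed above, and whether it lives in HCatConstants is still hypothetical; setCustomDynamicLocation is the method added by this patch; the class and method names below are illustrative, not actual HCatalog code):

{code}
import org.apache.hadoop.conf.Configuration;

import org.apache.hive.hcatalog.mapreduce.OutputJobInfo;

// Sketch only: HCatOutputFormat would apply the job-conf property itself, so
// Pig's HCatStorer gets the feature without each M/R user touching OutputJobInfo.
public final class CustomDynamicLocationSupport {
  static final String HCAT_DYNAMIC_CUSTOM_PATTERN =
      "hcat.dynamic.partitioning.custom.pattern";

  static void applyIfSet(Configuration conf, OutputJobInfo outputJobInfo) {
    String pattern = conf.get(HCAT_DYNAMIC_CUSTOM_PATTERN);
    if (pattern != null) {
      // reuse the OutputJobInfo API added by this patch
      outputJobInfo.setCustomDynamicLocation(pattern);
    }
  }
}
{code}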



- Sushanth Sowmyan


On Jan. 16, 2014, 12:09 p.m., Satish Mittal wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/16951/
 ---
 
 (Updated Jan. 16, 2014, 12:09 p.m.)
 
 
 Review request for hive and Sushanth Sowmyan.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 - Attaching the patch that implements the functionality to support custom 
 location for external tables in dynamic partitioning.
 
 
 Diffs
 -
 
   
 hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputCommitterContainer.java
  a5ae1be 
   
 hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FosterStorageHandler.java
  288b7a3 
   
 hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatFileUtil.java
  PRE-CREATION 
   
 hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/OutputJobInfo.java
  b63bdc2 
   
 hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/HCatMapReduceTest.java
  77bdb9d 
   
 hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatDynamicPartitioned.java
  d8b69c2 
   
 hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatExternalDynamicPartitioned.java
  36c7945 
 
 Diff: https://reviews.apache.org/r/16951/diff/
 
 
 Testing
 ---
 
 - Added unit test.
 - Tested the functionality through a sample MR program that uses 
 HCatOutputFormat interface configured with the new custom dynamic location.
 
 
 Thanks,
 
 Satish Mittal
 




[jira] [Commented] (HIVE-6245) HS2 creates DBs/Tables with wrong ownership when HMS setugi is true

2014-01-22 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13879282#comment-13879282
 ] 

Chaoyu Tang commented on HIVE-6245:
---

[~thejas] Could you take a look at this JIRA and comment? I noticed that the 
change in HIVE-4356 (remove duplicate impersonation parameters for hiveserver2) 
made HS2 doAs (impersonation) work only in a Kerberos environment. Thanks

 HS2 creates DBs/Tables with wrong ownership when HMS setugi is true
 ---

 Key: HIVE-6245
 URL: https://issues.apache.org/jira/browse/HIVE-6245
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Chaoyu Tang
Assignee: Chaoyu Tang
 Attachments: HIVE-6245.patch


 The case with the following settings is valid but does not work correctly in 
 the current HS2:
 ==
 hive.server2.authentication=NONE (or LDAP)
 hive.server2.enable.doAs= true
 hive.metastore.sasl.enabled=false
 hive.metastore.execute.setugi=true
 ==
 Ideally, HS2 should be able to impersonate the logged-in user (from Beeline or a 
 JDBC application) and create DBs/Tables with that user's ownership.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HIVE-6265) dedup Metastore data structures or at least protocol

2014-01-22 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-6265:
--

 Summary: dedup Metastore data structures or at least protocol
 Key: HIVE-6265
 URL: https://issues.apache.org/jira/browse/HIVE-6265
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin


The metastore currently stores an SD per partition, and a column schema/serde/... per SD.
Most of the time all the partitions in a table have the same setup, the only 
thing that differs in the SD/CD/... being the location. In such cases, we don't need 
to store these separately and send them to the client when many partitions are 
retrieved for a large table. While storage changes may be too complex wrt 
backward compat, especially with DataNucleus being in the picture and 
controlling the db schema/persistence, at least we can avoid sending lots of 
duplicate data to the client over the network; the thrift protocol can be modified to 
omit duplicate data in a backward-compatible manner.
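As a rough illustration of the saving (a client-side sketch only, not the proposed protocol change; the class and method names are made up), identical column schemas could be shared by reference once partitions are fetched:

{code}
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.hadoop.hive.metastore.api.FieldSchema;
import org.apache.hadoop.hive.metastore.api.Partition;

// Sketch only: dedup identical column lists across the SDs of fetched partitions.
public final class ColumnSchemaInterner {
  static void intern(List<Partition> partitions) {
    Map<List<FieldSchema>, List<FieldSchema>> seen =
        new HashMap<List<FieldSchema>, List<FieldSchema>>();
    for (Partition p : partitions) {
      List<FieldSchema> cols = p.getSd().getCols();
      List<FieldSchema> canonical = seen.get(cols);
      if (canonical == null) {
        seen.put(cols, cols);           // first time this schema is seen
      } else {
        p.getSd().setCols(canonical);   // share the earlier copy instead of a duplicate
      }
    }
  }
}
{code}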



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6265) dedup Metastore data structures or at least protocol

2014-01-22 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-6265:
---

Component/s: Metastore

 dedup Metastore data structures or at least protocol
 

 Key: HIVE-6265
 URL: https://issues.apache.org/jira/browse/HIVE-6265
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Sergey Shelukhin

 Metastore currently stores SD per partition, and column schema/serde/... per 
 SD.
 Most of the time all the partitions have the same setup in a table, the only 
 different things in SD/CD/... being the location. In such cases, we don't 
 need to store these separately and send them to client when many partitions 
 are retrieved for a large table. While storage changes may be too complex wrt 
 backward compat, as well as with DataNucleus being in the picture and 
 controlling the db schema/persistence, at least we can avoid sending lots of 
 duplicate data to the client on the network; thrift protocol can be modified 
 to omit duplicate data in a backward compatible manner.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 17163: HIVE-5929 - SQL std auth - Access control statement updates

2014-01-22 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17163/
---

(Updated Jan. 22, 2014, 10:53 p.m.)


Review request for hive and Ashutosh Chauhan.


Changes
---

Also changing the plugin exception to inherit from HiveException. On second 
thought, the change does not really expose HiveException APIs to the user.
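A sketch of what the exception class looks like after this change (the constructor set shown here is an assumption, not necessarily the final patch):

{code}
import org.apache.hadoop.hive.ql.metadata.HiveException;

// Sketch of the change described above: extending HiveException removes the
// catch-and-rethrow boilerplate without exposing ErrorMsg.
public class HiveAuthorizationPluginException extends HiveException {
  public HiveAuthorizationPluginException(String message) {
    super(message);
  }

  public HiveAuthorizationPluginException(String message, Throwable cause) {
    super(message, cause);
  }
}
{code}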


Bugs: HIVE-5929
https://issues.apache.org/jira/browse/HIVE-5929


Repository: hive-git


Description
---

Subtask for sql standard based auth, for performing the updates to metastore 
from newly supported access  control statements .


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java bd95161 
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 92ed55b 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveUtils.java c65bf28 
  
ql/src/java/org/apache/hadoop/hive/ql/parse/authorization/HiveAuthorizationTaskFactoryImpl.java
 c41cd0f 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/AuthorizationUtils.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/DefaultHiveAuthorizerFactory.java
 7470e9d 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAccessController.java
 8e4114f 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAuthorizationPluginException.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAuthorizationValidator.java
 63046f5 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAuthorizer.java
 c10a2ac 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAuthorizerImpl.java
 ca95bfc 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HivePrivilegeInfo.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAccessController.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAuthorizationValidator.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAuthorizerFactory.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 08f7fec 
  ql/src/test/queries/clientpositive/authorization_1_sql_std.q PRE-CREATION 
  ql/src/test/queries/clientpositive/authorization_role_grant1.q 1a375a5 
  ql/src/test/results/clientpositive/authorization_1_sql_std.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/authorization_role_grant1.q.out 496687c 

Diff: https://reviews.apache.org/r/17163/diff/


Testing
---

.q tests included.


Thanks,

Thejas Nair



[jira] [Updated] (HIVE-5929) SQL std auth - Access control statement updates

2014-01-22 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-5929:


Attachment: HIVE-5929.4.patch

 SQL std auth - Access control statement updates
 ---

 Key: HIVE-5929
 URL: https://issues.apache.org/jira/browse/HIVE-5929
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
 Attachments: HIVE-5929.1.patch, HIVE-5929.2.patch, HIVE-5929.3.patch, 
 HIVE-5929.4.patch

   Original Estimate: 48h
  Remaining Estimate: 48h

 Subtask for sql standard based auth, for performing the updates to metastore 
 from newly supported access  control statements .



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6251) Add ability to specify delimiter in HCatalog Java API to create tables - HCatCreateTableDesc

2014-01-22 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6251:
-

Description: 
Per 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Create/Drop/TruncateTablerow_format,
 the following is supported when creating a table.

{code}
  : DELIMITED [FIELDS TERMINATED BY char [ESCAPED BY char]] [COLLECTION ITEMS 
TERMINATED BY char]
[MAP KEYS TERMINATED BY char] [LINES TERMINATED BY char]
[NULL DEFINED AS char] (Note: Only available starting with Hive 0.13)
  | SERDE serde_name [WITH SERDEPROPERTIES (property_name=property_value, 
property_name=property_value, ...)]
{code}

Need to add support for specifying 4 delimiters plus escape and NULL in the 
HCatCreateTableDesc.create(String dbName, String tableName, 
List<HCatFieldSchema> columns) API.

  was:
Per 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Create/Drop/TruncateTablerow_format,
 the following is supported when creating a table.

{code}
  : DELIMITED [FIELDS TERMINATED BY char [ESCAPED BY char]] [COLLECTION ITEMS 
TERMINATED BY char]
[MAP KEYS TERMINATED BY char] [LINES TERMINATED BY char]
[NULL DEFINED AS char] (Note: Only available starting with Hive 0.13)
  | SERDE serde_name [WITH SERDEPROPERTIES (property_name=property_value, 
property_name=property_value, ...)]
{code}

Need to add support for specifying delimiters in the 
HCatCreateTableDesc.create(String dbName, String tableName, List<HCatFieldSchema> columns) API.


 Add ability to specify delimiter in HCatalog Java API to create tables - 
 HCatCreateTableDesc
 

 Key: HIVE-6251
 URL: https://issues.apache.org/jira/browse/HIVE-6251
 Project: Hive
  Issue Type: Bug
  Components: HCatalog, WebHCat
Affects Versions: 0.13.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman

 Per 
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Create/Drop/TruncateTablerow_format,
  the following is supported when creating a table.
 {code}
   : DELIMITED [FIELDS TERMINATED BY char [ESCAPED BY char]] [COLLECTION ITEMS 
 TERMINATED BY char]
 [MAP KEYS TERMINATED BY char] [LINES TERMINATED BY char]
 [NULL DEFINED AS char] (Note: Only available starting with Hive 0.13)
   | SERDE serde_name [WITH SERDEPROPERTIES (property_name=property_value, 
 property_name=property_value, ...)]
 {code}
 Need to add support for specifying 4 delimiters plus escape and NULL in the 
 HCatCreateTableDesc.create(String dbName, String tableName, 
 List<HCatFieldSchema> columns) API.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-5929) SQL std auth - Access control statement updates

2014-01-22 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13879348#comment-13879348
 ] 

Ashutosh Chauhan commented on HIVE-5929:


+1 pending tests

 SQL std auth - Access control statement updates
 ---

 Key: HIVE-5929
 URL: https://issues.apache.org/jira/browse/HIVE-5929
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
 Attachments: HIVE-5929.1.patch, HIVE-5929.2.patch, HIVE-5929.3.patch, 
 HIVE-5929.4.patch

   Original Estimate: 48h
  Remaining Estimate: 48h

 Subtask for sql standard based auth, for performing the updates to metastore 
 from newly supported access  control statements .



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 17163: HIVE-5929 - SQL std auth - Access control statement updates

2014-01-22 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17163/#review32579
---

Ship it!


Ship It!

- Ashutosh Chauhan


On Jan. 22, 2014, 10:53 p.m., Thejas Nair wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/17163/
 ---
 
 (Updated Jan. 22, 2014, 10:53 p.m.)
 
 
 Review request for hive and Ashutosh Chauhan.
 
 
 Bugs: HIVE-5929
 https://issues.apache.org/jira/browse/HIVE-5929
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Subtask for sql standard based auth, for performing the updates to metastore 
 from newly supported access  control statements .
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/Driver.java bd95161 
   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 92ed55b 
   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveUtils.java c65bf28 
   
 ql/src/java/org/apache/hadoop/hive/ql/parse/authorization/HiveAuthorizationTaskFactoryImpl.java
  c41cd0f 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/AuthorizationUtils.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/DefaultHiveAuthorizerFactory.java
  7470e9d 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAccessController.java
  8e4114f 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAuthorizationPluginException.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAuthorizationValidator.java
  63046f5 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAuthorizer.java
  c10a2ac 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAuthorizerImpl.java
  ca95bfc 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HivePrivilegeInfo.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAccessController.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAuthorizationValidator.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAuthorizerFactory.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 08f7fec 
   ql/src/test/queries/clientpositive/authorization_1_sql_std.q PRE-CREATION 
   ql/src/test/queries/clientpositive/authorization_role_grant1.q 1a375a5 
   ql/src/test/results/clientpositive/authorization_1_sql_std.q.out 
 PRE-CREATION 
   ql/src/test/results/clientpositive/authorization_role_grant1.q.out 496687c 
 
 Diff: https://reviews.apache.org/r/17163/diff/
 
 
 Testing
 ---
 
 .q tests included.
 
 
 Thanks,
 
 Thejas Nair
 




[jira] [Commented] (HIVE-6261) Update metadata.q.out file for tez (after change to .q file)

2014-01-22 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13879356#comment-13879356
 ] 

Gunther Hagleitner commented on HIVE-6261:
--

Failures are unrelated. The hadoop-1 build isn't even touching the file changed 
in this patch.

 Update metadata.q.out file for tez (after change to .q file)
 

 Key: HIVE-6261
 URL: https://issues.apache.org/jira/browse/HIVE-6261
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-6261.1.patch






--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HIVE-6266) CTAS Properties Not Passed

2014-01-22 Thread Jesse Anderson (JIRA)
Jesse Anderson created HIVE-6266:


 Summary: CTAS Properties Not Passed
 Key: HIVE-6266
 URL: https://issues.apache.org/jira/browse/HIVE-6266
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.8.0
Reporter: Jesse Anderson


I am doing a CTAS and using a Custom SerDe property to change output format 
settings. Here is the query I am doing:

{code}
CREATE TABLE calldataformat
ROW FORMAT SERDE
  'com.loudacre.hiveserdebonus.solution.CallDetailSerDe'
WITH SERDEPROPERTIES
  ( "fixedwidth.regex" = "^(.{36})(.{17})(.{17})(.{10})(.{10})(.{10})$", 
"fixedwidth.dateformat" = "-DDD kk:mm:ss" )
LOCATION
  '/loudacre/calldataformat'
AS
SELECT call_id,
  call_begin,
  call_end,
  status,
  from_phone,
  to_phone 
FROM calldata
WHERE status <> 'SUCCESS';
{code}

The fixedwidth.regex and fixedwidth.dateformat properties are never passed in 
via the Properties object. I added some logging output to the initialize method 
to log every property that comes in. This is the logging output:
{noformat}
2014-01-22 14:53:35,110 INFO CallDetailSerDe: Key:name 
Value:default.calldataformat
2014-01-22 14:53:35,110 INFO CallDetailSerDe: Key:columns 
Value:_col0,_col1,_col2,_col3,_col4,_col5
2014-01-22 14:53:35,110 INFO CallDetailSerDe: Key:serialization.format Value:1
2014-01-22 14:53:35,110 INFO CallDetailSerDe: Key:columns.types 
Value:string:timestamp:timestamp:string:string:string
{noformat}

The workaround is to do a 2-step process instead of a CTAS: create the table 
first and then do an INSERT INTO. This way, the properties are passed in and 
all of the formatting is correct.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 17162: HIVE-6157 Fetching column stats slow

2014-01-22 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17162/
---

(Updated Jan. 23, 2014, 12:03 a.m.)


Review request for hive and Gunther Hagleitner.


Changes
---

Fixed the tests and made some minor changes. JIRA will be reindexing for an hour+; 
I will submit the patch there later.


Repository: hive-git


Description
---

see jira


Diffs (updated)
-

  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
 1b01238 
  metastore/if/hive_metastore.thrift e4e816d 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
58f9957 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 
ed05790 
  metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 
4288781 
  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
16f43e9 
  metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 794fadd 
  metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java f5ea2ef 
  metastore/src/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java 
PRE-CREATION 
  
metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
 003dc9c 
  
metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
 6dd0852 
  metastore/src/test/org/apache/hadoop/hive/metastore/VerifyingObjectStore.java 
c683fc9 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java d32deea 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/StatsOptimizer.java 608bef2 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java b815ea2 
  ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java 384b49e 
  ql/src/test/results/clientpositive/metadataonly1.q.out 3500fd2 

Diff: https://reviews.apache.org/r/17162/diff/


Testing
---


Thanks,

Sergey Shelukhin



Review Request 17217: HIVE-???? MSCK can be slow when adding partitions

2014-01-22 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17217/
---

Review request for hive, Ashutosh Chauhan and Gunther Hagleitner.


Repository: hive-git


Description
---

JIRA is reindexing so I will file the JIRA later.

Trivial patch: use the bulk method.
If the bulk method fails, try one by one (via the existing code); it's msck 
after all, so we expect the worst.
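A minimal sketch of the bulk-with-fallback idea (the class, method, and variable names are illustrative, not the actual DDLTask change; the IMetaStoreClient calls are the existing metastore client API):

{code}
import java.util.List;

import org.apache.hadoop.hive.metastore.IMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.Partition;

// Sketch only: one bulk metastore call, falling back to the existing
// one-by-one path if it fails.
public final class MsckPartitionAdder {
  static void addPartitions(IMetaStoreClient client, List<Partition> missing)
      throws Exception {
    try {
      client.add_partitions(missing);        // single bulk call
    } catch (Exception bulkFailure) {
      for (Partition p : missing) {          // existing per-partition path
        try {
          client.add_partition(p);
        } catch (Exception e) {
          // msck reports individual failures and keeps going
        }
      }
    }
  }
}
{code}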


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 92ed55b 

Diff: https://reviews.apache.org/r/17217/diff/


Testing
---


Thanks,

Sergey Shelukhin



[jira] [Commented] (HIVE-6262) Remove unnecessary copies of schema + table desc from serialized plan

2014-01-22 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13879369#comment-13879369
 ] 

Gunther Hagleitner commented on HIVE-6262:
--

Tests have run successfully: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/994/testReport/ (the 
7 failures are unrelated). Unfortunately, JIRA was down when the tests completed 
(so no auto-update).

 Remove unnecessary copies of schema + table desc from serialized plan
 -

 Key: HIVE-6262
 URL: https://issues.apache.org/jira/browse/HIVE-6262
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-6262.1.patch


 Currently for a partitioned table the following are true:
 - for each partitiondesc we send a copy of the corresponding tabledesc
 - for each partitiondesc we send two copies of the schema (in different 
 formats).
 Obviously we need to send different schemas if they are required by schema 
 evolution, but in our case we'll always end up with multiple copies.
 The effect can be dramatic. The reductions by removing those on partitioned 
 tables easily be can be 8-10x in size. Plans themselves can be 10s to 100s of 
 mb (even with kryo). The size difference also plays out in every task on the 
 cluster we run.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6263) Avoid sending input files multiple times on Tez

2014-01-22 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-6263:
-

Status: Open  (was: Patch Available)

 Avoid sending input files multiple times on Tez
 ---

 Key: HIVE-6263
 URL: https://issues.apache.org/jira/browse/HIVE-6263
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-6263.1.patch


 Input paths can be reconstructed from the plan. No need to send them in the 
 job conf as well.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6263) Avoid sending input files multiple times on Tez

2014-01-22 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-6263:
-

Attachment: HIVE-6263.2.patch

Re-uploading to trigger precommit.

 Avoid sending input files multiple times on Tez
 ---

 Key: HIVE-6263
 URL: https://issues.apache.org/jira/browse/HIVE-6263
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-6263.1.patch, HIVE-6263.2.patch


 Input paths can be reconstructed from the plan. No need to send them in the 
 job conf as well.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HIVE-6267) Explain explain

2014-01-22 Thread Gunther Hagleitner (JIRA)
Gunther Hagleitner created HIVE-6267:


 Summary: Explain explain
 Key: HIVE-6267
 URL: https://issues.apache.org/jira/browse/HIVE-6267
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-6263.2.patch

I've gotten feedback over time saying that it's very difficult to grok our 
explain command. There's supposedly a lot of information that mainly matters to 
developers or the testing framework. Comparing it to other major DBs it does 
seem like we're packing way more into explain than other folks.

I've gone through the explain output, checking what could be done to improve 
readability. Here's a list of things I've found:

- AST (unreadable in its Lisp syntax, not really required for end users)
- Vectorization (enough to display once per task and only when true)
- Expressions representation is very lengthy, could be much more compact
- "if not exists" on DDL (enough to display only when true, or maybe not at all)
- bucketing info (enough if displayed only if table is actually bucketed)
- external flag (show only if external)
- GlobalTableId (don't need in plain explain, maybe in extended)
- Position of big table (already clear from plan)
- Stats always (Most DBs mostly only show stats in explain, that gives a sense 
of what the planner thinks will happen)
- skew join (only if true should be enough)
- limit doesn't show the actual limit
- Alias -> Map Operator tree - alias is duplicated in TableScan operator
- tag is only useful at runtime (move to explain extended)
- Some names are camel case or abbreviated, clearer if full name
- Tez is missing vertex map (aka edges)
- explain formatted (json) is broken right now (swallows some information)

Since changing explain results in many golden file updates, I'd like to take a 
stab at all of these at once.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6267) Explain explain

2014-01-22 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-6267:
-

Attachment: HIVE-6263.2.patch

Here's a draft. Still have to run and evaluate golden files.

 Explain explain
 ---

 Key: HIVE-6267
 URL: https://issues.apache.org/jira/browse/HIVE-6267
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-6263.2.patch


 I've gotten feedback over time saying that it's very difficult to grok our 
 explain command. There's supposedly a lot of information that mainly matters 
 to developers or the testing framework. Comparing it to other major DBs it 
 does seem like we're packing way more into explain than other folks.
 I've gone through the explain output, checking what could be done to improve 
 readability. Here's a list of things I've found:
 - AST (unreadable in its Lisp syntax, not really required for end users)
 - Vectorization (enough to display once per task and only when true)
 - Expressions representation is very lengthy, could be much more compact
 - if not exists on DDL (enough to display only on true, or maybe not at all)
 - bucketing info (enough if displayed only if table is actually bucketed)
 - external flag (show only if external)
 - GlobalTableId (don't need in plain explain, maybe in extended)
 - Position of big table (already clear from plan)
 - Stats always (Most DBs mostly only show stats in explain, that gives a 
 sense of what the planner thinks will happen)
 - skew join (only if true should be enough)
 - limit doesn't show the actual limit
 - Alias -> Map Operator tree - alias is duplicated in TableScan operator
 - tag is only useful at runtime (move to explain extended)
 - Some names are camel case or abbreviated, clearer if full name
 - Tez is missing vertex map (aka edges)
 - explain formatted (json) is broken right now (swallows some information)
 Since changing explain results in many golden file updates, I'd like to take 
 a stab at all of these at once.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6267) Explain explain

2014-01-22 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-6267:
-

Attachment: (was: HIVE-6263.2.patch)

 Explain explain
 ---

 Key: HIVE-6267
 URL: https://issues.apache.org/jira/browse/HIVE-6267
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-6267.1.partial


 I've gotten feedback over time saying that it's very difficult to grok our 
 explain command. There's supposedly a lot of information that mainly matters 
 to developers or the testing framework. Comparing it to other major DBs it 
 does seem like we're packing way more into explain than other folks.
 I've gone through the explain output, checking what could be done to improve 
 readability. Here's a list of things I've found:
 - AST (unreadable in its Lisp syntax, not really required for end users)
 - Vectorization (enough to display once per task and only when true)
 - Expressions representation is very lengthy, could be much more compact
 - if not exists on DDL (enough to display only on true, or maybe not at all)
 - bucketing info (enough if displayed only if table is actually bucketed)
 - external flag (show only if external)
 - GlobalTableId (don't need in plain explain, maybe in extended)
 - Position of big table (already clear from plan)
 - Stats always (most DBs only show stats in explain, since that gives a 
 sense of what the planner thinks will happen)
 - skew join (only if true should be enough)
 - limit doesn't show the actual limit
 - Alias -> Map Operator tree - alias is duplicated in the TableScan operator
 - tag is only useful at runtime (move to explain extended)
 - Some names are camel case or abbreviated; full names would be clearer
 - Tez is missing the vertex map (aka edges)
 - explain formatted (json) is broken right now (swallows some information)
 Since changing explain results in many golden file updates, I'd like to take 
 a stab at all of these at once.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6267) Explain explain

2014-01-22 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-6267:
-

Attachment: HIVE-6267.1.partial

Oops. Wrong file. 6267.1 is the right one.

 Explain explain
 ---

 Key: HIVE-6267
 URL: https://issues.apache.org/jira/browse/HIVE-6267
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-6267.1.partial


 I've gotten feedback over time saying that it's very difficult to grok our 
 explain command. There's supposedly a lot of information that mainly matters 
 to developers or the testing framework. Comparing it to other major DBs it 
 does seem like we're packing way more into explain than other folks.
 I've gone through the explain output, checking what could be done to improve 
 readability. Here's a list of things I've found:
 - AST (unreadable in its Lisp syntax, not really required for end users)
 - Vectorization (enough to display once per task and only when true)
 - Expressions representation is very lengthy, could be much more compact
 - if not exists on DDL (enough to display only on true, or maybe not at all)
 - bucketing info (enough if displayed only if table is actually bucketed)
 - external flag (show only if external)
 - GlobalTableId (don't need in plain explain, maybe in extended)
 - Position of big table (already clear from plan)
 - Stats always (most DBs only show stats in explain, since that gives a 
 sense of what the planner thinks will happen)
 - skew join (only if true should be enough)
 - limit doesn't show the actual limit
 - Alias -> Map Operator tree - alias is duplicated in the TableScan operator
 - tag is only useful at runtime (move to explain extended)
 - Some names are camel case or abbreviated; full names would be clearer
 - Tez is missing the vertex map (aka edges)
 - explain formatted (json) is broken right now (swallows some information)
 Since changing explain results in many golden file updates, I'd like to take 
 a stab at all of these at once.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-5929) SQL std auth - Access control statement updates

2014-01-22 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-5929:


Assignee: Thejas M Nair
  Status: Open  (was: Patch Available)

 SQL std auth - Access control statement updates
 ---

 Key: HIVE-5929
 URL: https://issues.apache.org/jira/browse/HIVE-5929
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-5929.1.patch, HIVE-5929.2.patch, HIVE-5929.3.patch, 
 HIVE-5929.4.patch

   Original Estimate: 48h
  Remaining Estimate: 48h

 Subtask for SQL standard based auth: performing the updates to the metastore 
 from newly supported access control statements.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-5929) SQL std auth - Access control statement updates

2014-01-22 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-5929:


Status: Patch Available  (was: Open)

 SQL std auth - Access control statement updates
 ---

 Key: HIVE-5929
 URL: https://issues.apache.org/jira/browse/HIVE-5929
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-5929.1.patch, HIVE-5929.2.patch, HIVE-5929.3.patch, 
 HIVE-5929.4.patch

   Original Estimate: 48h
  Remaining Estimate: 48h

 Subtask for SQL standard based auth: performing the updates to the metastore 
 from newly supported access control statements.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6263) Avoid sending input files multiple times on Tez

2014-01-22 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-6263:
-

Status: Patch Available  (was: Open)

 Avoid sending input files multiple times on Tez
 ---

 Key: HIVE-6263
 URL: https://issues.apache.org/jira/browse/HIVE-6263
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-6263.1.patch, HIVE-6263.2.patch


 Input paths can be reconstructed from the plan. No need to send them in the 
 job conf as well.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-5446) Hive can CREATE an external table but not SELECT from it when file path have spaces

2014-01-22 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13879391#comment-13879391
 ] 

Jason Dere commented on HIVE-5446:
--

Hey, is the commit for this fix missing the file 
data/files/ext_test_space/folder+with space/data.txt? I don't see it in git. I 
would have thought that any of the pre-commit tests run after this commit would 
have failed on this test, since the required data file is missing.

 Hive can CREATE an external table but not SELECT from it when file path have 
 spaces
 ---

 Key: HIVE-5446
 URL: https://issues.apache.org/jira/browse/HIVE-5446
 Project: Hive
  Issue Type: Bug
Reporter: Shuaishuai Nie
Assignee: Shuaishuai Nie
 Fix For: 0.13.0

 Attachments: HIVE-5446.1.patch, HIVE-5446.2.patch, HIVE-5446.3.patch, 
 HIVE-5446.4.patch, HIVE-5446.5.patch, HIVE-5446.7.patch


 Create external table table1 (age int, 
 gender string, totBil float, 
 dirBill float, alkphos int,
 sgpt int, sgot int, totProt float, 
 aLB float, aG float, sel int) 
 ROW FORMAT DELIMITED
 FIELDS TERMINATED BY ','
 STORED AS TEXTFILE
 LOCATION 'hdfs://namenodehost:9000/hive newtable';
 select * from table1;
 returns nothing even though there is a file in the target folder
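 For illustration, here is a self-contained Java snippet showing one common way a space in a location string breaks URI handling. This is a plausible failure mode for this class of bug, not necessarily the exact code path inside Hive.

{noformat}
import java.net.URI;
import java.net.URISyntaxException;

public class SpaceInPathDemo {
  public static void main(String[] args) throws URISyntaxException {
    String location = "hdfs://namenodehost:9000/hive newtable";

    // Building the URI from its parts quotes the space, so this succeeds:
    URI ok = new URI("hdfs", "namenodehost:9000", "/hive newtable", null, null);
    System.out.println(ok); // hdfs://namenodehost:9000/hive%20newtable

    // Re-parsing the raw string with the single-argument constructor fails,
    // which is the kind of round trip that can make such a table unreadable:
    try {
      new URI(location);
    } catch (URISyntaxException e) {
      System.out.println("Rejected: " + e.getMessage());
    }
  }
}
{noformat}

 Encoding the space before the string is re-parsed (or avoiding single-argument URI parsing) avoids the failure.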



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-5446) Hive can CREATE an external table but not SELECT from it when file path have spaces

2014-01-22 Thread Shuaishuai Nie (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13879396#comment-13879396
 ] 

Shuaishuai Nie commented on HIVE-5446:
--

Hi [~xuefuz], it seems the test data file is missing. Can you help commit it? Thanks.

 Hive can CREATE an external table but not SELECT from it when file path have 
 spaces
 ---

 Key: HIVE-5446
 URL: https://issues.apache.org/jira/browse/HIVE-5446
 Project: Hive
  Issue Type: Bug
Reporter: Shuaishuai Nie
Assignee: Shuaishuai Nie
 Fix For: 0.13.0

 Attachments: HIVE-5446.1.patch, HIVE-5446.2.patch, HIVE-5446.3.patch, 
 HIVE-5446.4.patch, HIVE-5446.5.patch, HIVE-5446.7.patch


 Create external table table1 (age int, 
 gender string, totBil float, 
 dirBill float, alkphos int,
 sgpt int, sgot int, totProt float, 
 aLB float, aG float, sel int) 
 ROW FORMAT DELIMITED
 FIELDS TERMINATED BY ','
 STORED AS TEXTFILE
 LOCATION 'hdfs://namenodehost:9000/hive newtable';
 select * from table1;
 returns nothing even though there is a file in the target folder



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 17217: HIVE-???? MSCK can be slow when adding partitions

2014-01-22 Thread Mohammad Islam

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17217/#review32592
---



ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java
https://reviews.apache.org/r/17217/#comment61495

Should we clear the repairOutput list before the loop? The reason: there is a 
very small possibility that 'repairOutput' is partially populated in the caller 
method before the exception is thrown.
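For reference, a minimal, self-contained sketch of the bulk-add-then-fallback pattern under review (described in the quoted request below), clearing the partially populated output before retrying. All names here are hypothetical and are not the actual DDLTask members.

{noformat}
import java.util.ArrayList;
import java.util.List;

public class MsckRepairSketch {

  // Placeholder for the single bulk metastore call that adds all missing
  // partitions at once.
  static void addPartitionsBulk(List<String> missing, List<String> repairOutput)
      throws Exception {
    for (String p : missing) {
      repairOutput.add("Repair: Added partition " + p);
    }
  }

  // Placeholder for the slower per-partition fallback (the pre-existing path).
  static void addPartitionSingle(String partition, List<String> repairOutput) {
    repairOutput.add("Repair: Added partition " + partition);
  }

  static List<String> repair(List<String> missingPartitions) {
    List<String> repairOutput = new ArrayList<>();
    try {
      addPartitionsBulk(missingPartitions, repairOutput);
    } catch (Exception bulkFailure) {
      // The bulk call may have partially filled repairOutput before failing,
      // so clear it before retrying one partition at a time.
      repairOutput.clear();
      for (String partition : missingPartitions) {
        addPartitionSingle(partition, repairOutput);
      }
    }
    return repairOutput;
  }
}
{noformat}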
 



ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java
https://reviews.apache.org/r/17217/#comment61496

Is it the same 'db' object that might be populated in the caller method?


- Mohammad Islam


On Jan. 23, 2014, 12:49 a.m., Sergey Shelukhin wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/17217/
 ---
 
 (Updated Jan. 23, 2014, 12:49 a.m.)
 
 
 Review request for hive, Ashutosh Chauhan and Gunther Hagleitner.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 JIRA is reindexing so I will file the JIRA later.
 
 Trivial patch. Use the bulk method.
 If the bulk method fails, try one by one (via the existing code); it's MSCK 
 after all, so we expect the worst.
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 92ed55b 
 
 Diff: https://reviews.apache.org/r/17217/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Sergey Shelukhin
 




[jira] [Updated] (HIVE-5929) SQL std auth - Access control statement updates

2014-01-22 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-5929:


Attachment: HIVE-5929.4.patch

Re-uploading the file for pre-commit tests.

 SQL std auth - Access control statement updates
 ---

 Key: HIVE-5929
 URL: https://issues.apache.org/jira/browse/HIVE-5929
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-5929.1.patch, HIVE-5929.2.patch, HIVE-5929.3.patch, 
 HIVE-5929.4.patch, HIVE-5929.4.patch

   Original Estimate: 48h
  Remaining Estimate: 48h

 Subtask for SQL standard based auth: performing the updates to the metastore 
 from newly supported access control statements.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-5929) SQL std auth - Access control statement updates

2014-01-22 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-5929:


Status: Open  (was: Patch Available)

 SQL std auth - Access control statement updates
 ---

 Key: HIVE-5929
 URL: https://issues.apache.org/jira/browse/HIVE-5929
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-5929.1.patch, HIVE-5929.2.patch, HIVE-5929.3.patch, 
 HIVE-5929.4.patch, HIVE-5929.4.patch

   Original Estimate: 48h
  Remaining Estimate: 48h

 Subtask for SQL standard based auth: performing the updates to the metastore 
 from newly supported access control statements.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-5929) SQL std auth - Access control statement updates

2014-01-22 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-5929:


Status: Patch Available  (was: Open)

 SQL std auth - Access control statement updates
 ---

 Key: HIVE-5929
 URL: https://issues.apache.org/jira/browse/HIVE-5929
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-5929.1.patch, HIVE-5929.2.patch, HIVE-5929.3.patch, 
 HIVE-5929.4.patch, HIVE-5929.4.patch

   Original Estimate: 48h
  Remaining Estimate: 48h

 Subtask for SQL standard based auth: performing the updates to the metastore 
 from newly supported access control statements.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6258) sql std auth - disallow cycles between roles

2014-01-22 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6258:


Attachment: HIVE-6258.1.patch

 sql std auth - disallow cycles between roles
 

 Key: HIVE-6258
 URL: https://issues.apache.org/jira/browse/HIVE-6258
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-6258.1.patch

   Original Estimate: 24h
  Remaining Estimate: 24h

 It should not be possible to have cycles in role relationships.
 If a grant role statement would end up adding such a cycle, it should result 
 in an error.
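 For illustration, a minimal sketch of such a cycle check: granting role X to role Y is rejected if Y is already reachable from X through existing grants. The class and method names are hypothetical, not the actual metastore implementation.

{noformat}
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class RoleCycleCheck {

  // grants maps each role to the set of roles already granted to it.
  // Granting `granted` to `grantee` would create a cycle exactly when
  // `grantee` is already reachable from `granted` via existing grants.
  static boolean wouldCreateCycle(Map<String, Set<String>> grants,
      String grantee, String granted) {
    Deque<String> toVisit = new ArrayDeque<>();
    Set<String> seen = new HashSet<>();
    toVisit.add(granted);
    while (!toVisit.isEmpty()) {
      String role = toVisit.poll();
      if (role.equals(grantee)) {
        return true; // the grant would close a cycle, so reject it with an error
      }
      if (seen.add(role)) {
        toVisit.addAll(grants.getOrDefault(role, Collections.emptySet()));
      }
    }
    return false;
  }
}
{noformat}

 For example, with grants = {B: {A}} (role A already granted to role B), wouldCreateCycle(grants, "A", "B") returns true, so GRANT role B TO role A would be refused.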



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6258) sql std auth - disallow cycles between roles

2014-01-22 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6258:


Attachment: (was: HIVE-6258.1.patch)

 sql std auth - disallow cycles between roles
 

 Key: HIVE-6258
 URL: https://issues.apache.org/jira/browse/HIVE-6258
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-6258.1.patch

   Original Estimate: 24h
  Remaining Estimate: 24h

 It should not be possible to have cycles in role relationships.
 If a grant role statement would end up adding such a cycle, it should result 
 in an error.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6258) sql std auth - disallow cycles between roles

2014-01-22 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6258:


Attachment: HIVE-6258.1.patch

 sql std auth - disallow cycles between roles
 

 Key: HIVE-6258
 URL: https://issues.apache.org/jira/browse/HIVE-6258
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-6258.1.patch

   Original Estimate: 24h
  Remaining Estimate: 24h

 It should not be possible to have cycles in role relationships.
 If a grant role statement would end up adding such a cycle, it should result 
 in an error.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6258) sql std auth - disallow cycles between roles

2014-01-22 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6258:


Attachment: HIVE-6258.1.patch

 sql std auth - disallow cycles between roles
 

 Key: HIVE-6258
 URL: https://issues.apache.org/jira/browse/HIVE-6258
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-6258.1.patch

   Original Estimate: 24h
  Remaining Estimate: 24h

 It should not be possible to have cycles in role relationships.
 If a grant role statement would end up adding such a cycle, it should result 
 in an error.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6258) sql std auth - disallow cycles between roles

2014-01-22 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6258:


Attachment: (was: HIVE-6258.1.patch)

 sql std auth - disallow cycles between roles
 

 Key: HIVE-6258
 URL: https://issues.apache.org/jira/browse/HIVE-6258
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-6258.1.patch

   Original Estimate: 24h
  Remaining Estimate: 24h

 It should not be possible to have cycles in role relationships.
 If a grant role statement would end up adding such a cycle, it should result 
 in an error.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6043) Document incompatible changes in Hive 0.12 and trunk

2014-01-22 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-6043:
-

Description: 
We need to document incompatible changes. For example

* HIVE-5372 changed the object inspector hierarchy, breaking most if not all 
custom serdes
* HIVE-1511/HIVE-5263 serializes ObjectInspectors with Kryo, so all custom 
serdes break (fixed by HIVE-5380)
* Hive 0.12 (HIVE-4825) separates MapredWork into MapWork and ReduceWork which 
is used by Serdes
* HIVE-5411 serializes expressions with Kryo, which are used by custom serdes
* HIVE-4827 removed the flag hive.optimize.mapjoin.mapreduce (This flag 
was introduced in Hive 0.11 by HIVE-3952).


  was:
We need to document incompatible changes. For example

* HIVE-5372 changed object inspector hierarchy breaking most if not all custom 
serdes
* HIVE-1511/HIVE-5263 serializes ObjectInspectors with Kryo so all custom 
serdes (fixed by HIVE-5380)
* Hive 0.12 separates MapredWork into MapWork and ReduceWork which is used by 
Serdes
* HIVE-5411 serializes expressions with Kryo which are used by custom serdes
* HIVE-4827 removed the flag of hive.optimize.mapjoin.mapreduce (This flag 
was introduced in Hive 0.11 by HIVE-3952).



 Document incompatible changes in Hive 0.12 and trunk
 

 Key: HIVE-6043
 URL: https://issues.apache.org/jira/browse/HIVE-6043
 Project: Hive
  Issue Type: Task
Reporter: Brock Noland
Priority: Blocker

 We need to document incompatible changes. For example
 * HIVE-5372 changed the object inspector hierarchy, breaking most if not all 
 custom serdes
 * HIVE-1511/HIVE-5263 serializes ObjectInspectors with Kryo, so all custom 
 serdes break (fixed by HIVE-5380)
 * Hive 0.12 (HIVE-4825) separates MapredWork into MapWork and ReduceWork 
 which is used by Serdes
 * HIVE-5411 serializes expressions with Kryo, which are used by custom serdes
 * HIVE-4827 removed the flag hive.optimize.mapjoin.mapreduce (This flag 
 was introduced in Hive 0.11 by HIVE-3952).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-5446) Hive can CREATE an external table but not SELECT from it when file path have spaces

2014-01-22 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13879446#comment-13879446
 ] 

Xuefu Zhang commented on HIVE-5446:
---

Guys,

Please help me understand. If the data file is missing, then I would expect 
the tests relying on it to fail. Let's fix this once and for all. 
[~shuainie], could you please check why the new tests didn't fail? If we can 
confirm that the missing data file is the only problem, I can commit it.

 Hive can CREATE an external table but not SELECT from it when file path have 
 spaces
 ---

 Key: HIVE-5446
 URL: https://issues.apache.org/jira/browse/HIVE-5446
 Project: Hive
  Issue Type: Bug
Reporter: Shuaishuai Nie
Assignee: Shuaishuai Nie
 Fix For: 0.13.0

 Attachments: HIVE-5446.1.patch, HIVE-5446.2.patch, HIVE-5446.3.patch, 
 HIVE-5446.4.patch, HIVE-5446.5.patch, HIVE-5446.7.patch


 Create external table table1 (age int, 
 gender string, totBil float, 
 dirBill float, alkphos int,
 sgpt int, sgot int, totProt float, 
 aLB float, aG float, sel int) 
 ROW FORMAT DELIMITED
 FIELDS TERMINATED BY ','
 STORED AS TEXTFILE
 LOCATION 'hdfs://namenodehost:9000/hive newtable';
 select * from table1;
 returns nothing even though there is a file in the target folder



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6229) Stats are missing sometimes (regression from HIVE-5936)

2014-01-22 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13879480#comment-13879480
 ] 

Navis commented on HIVE-6229:
-

hive.stats.key.prefix is a key in the JobConf instance used only for the partial 
scan command. I think the name of the config is a little misleading. (It's not 
related to the other two similarly named configurations.)

 Stats are missing sometimes (regression from HIVE-5936)
 ---

 Key: HIVE-6229
 URL: https://issues.apache.org/jira/browse/HIVE-6229
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Reporter: Navis
Assignee: Navis
 Fix For: 0.13.0

 Attachments: HIVE-6229.1.patch.txt, HIVE-6229.2.patch.txt


 If the prefix length is smaller than hive.stats.key.prefix.max.length but the 
 length of prefix + postfix is bigger than that, stats are missed.
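 A minimal sketch of the length condition described above; the names here are hypothetical, and the real logic (and fix) live in Hive's stats publisher and aggregator.

{noformat}
public class StatsKeyLengthSketch {

  // The regression: deciding whether to shorten the key by looking at the
  // prefix alone lets through keys whose prefix fits under the limit but
  // whose full prefix + postfix does not, so the key used when publishing
  // stats and the key used when aggregating them can disagree.
  static boolean exceedsLimit(String prefix, String postfix, int maxLength) {
    // Checking the combined length keeps both sides consistent.
    return prefix.length() + postfix.length() > maxLength;
  }
}
{noformat}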



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 17162: HIVE-6157 Fetching column stats slow

2014-01-22 Thread Gunther Hagleitner

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17162/#review32597
---



metastore/if/hive_metastore.thrift
https://reviews.apache.org/r/17162/#comment61503

Nit: some whitespace problems.



metastore/src/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java
https://reviews.apache.org/r/17162/#comment61510

Thanks for hiding it. I appreciate that :-P



ql/src/java/org/apache/hadoop/hive/ql/optimizer/StatsOptimizer.java
https://reviews.apache.org/r/17162/#comment61505

Are you planning on doing this in this patch? If not, I'd rather see a 
follow-up JIRA and the JIRA number here rather than a TODO.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/StatsOptimizer.java
https://reviews.apache.org/r/17162/#comment61506

Ditto. And lol.



ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java
https://reviews.apache.org/r/17162/#comment61509

Remove comment? Seems like a minor thing.



ql/src/test/results/clientpositive/metadataonly1.q.out
https://reviews.apache.org/r/17162/#comment61511

Why has this changed?


- Gunther Hagleitner


On Jan. 23, 2014, 12:03 a.m., Sergey Shelukhin wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/17162/
 ---
 
 (Updated Jan. 23, 2014, 12:03 a.m.)
 
 
 Review request for hive and Gunther Hagleitner.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 see jira
 
 
 Diffs
 -
 
   
 itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
  1b01238 
   metastore/if/hive_metastore.thrift e4e816d 
   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
 58f9957 
   
 metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 
 ed05790 
   metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 
 4288781 
   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
 16f43e9 
   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
 794fadd 
   metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java f5ea2ef 
   
 metastore/src/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java 
 PRE-CREATION 
   
 metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
  003dc9c 
   
 metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
  6dd0852 
   
 metastore/src/test/org/apache/hadoop/hive/metastore/VerifyingObjectStore.java 
 c683fc9 
   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java d32deea 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/StatsOptimizer.java 608bef2 
   ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java b815ea2 
   ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java 384b49e 
   ql/src/test/results/clientpositive/metadataonly1.q.out 3500fd2 
 
 Diff: https://reviews.apache.org/r/17162/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Sergey Shelukhin
 




Re: Review Request 17162: HIVE-6157 Fetching column stats slow

2014-01-22 Thread Gunther Hagleitner

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17162/#review32599
---


With this, can we set the default of hive.stats.fetch.column.stats to true?
Can you please run TestMiniTezCliDriver (-Phadoop-2)? There might be some 
changes with this. (The pre-commit run only uses hadoop-1.)

- Gunther Hagleitner


On Jan. 23, 2014, 12:03 a.m., Sergey Shelukhin wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/17162/
 ---
 
 (Updated Jan. 23, 2014, 12:03 a.m.)
 
 
 Review request for hive and Gunther Hagleitner.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 see jira
 
 
 Diffs
 -
 
   
 itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
  1b01238 
   metastore/if/hive_metastore.thrift e4e816d 
   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
 58f9957 
   
 metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 
 ed05790 
   metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 
 4288781 
   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
 16f43e9 
   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
 794fadd 
   metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java f5ea2ef 
   
 metastore/src/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java 
 PRE-CREATION 
   
 metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
  003dc9c 
   
 metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
  6dd0852 
   
 metastore/src/test/org/apache/hadoop/hive/metastore/VerifyingObjectStore.java 
 c683fc9 
   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java d32deea 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/StatsOptimizer.java 608bef2 
   ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java b815ea2 
   ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java 384b49e 
   ql/src/test/results/clientpositive/metadataonly1.q.out 3500fd2 
 
 Diff: https://reviews.apache.org/r/17162/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Sergey Shelukhin
 




[jira] [Commented] (HIVE-6157) Fetching column stats slower than the 101 during rush hour

2014-01-22 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13879484#comment-13879484
 ] 

Gunther Hagleitner commented on HIVE-6157:
--

Patch looks good. Some questions/comments on RB. What about the test failures? 
They seem relevant.
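For context, a hypothetical sketch contrasting the per-column, per-partition round trips described in the issue below with a single batched request. The interface and method names are invented for illustration and do not match the real metastore client API.

{noformat}
import java.util.List;
import java.util.Map;

// Hypothetical interface; the real metastore client API differs. This only
// contrasts per-column round trips with a single batched request.
interface ColumnStatsClient {
  Object fetchOne(String db, String table, String partition, String column);
  Map<String, List<Object>> fetchMany(String db, String table,
      List<String> partitions, List<String> columns);
}

public class ColumnStatsFetchSketch {

  // Before: one metastore call per column per partition
  // (4000 partitions x 24 columns in the setup described below).
  static void fetchPerColumn(ColumnStatsClient client, String db, String table,
      List<String> partitions, List<String> columns) {
    for (String partition : partitions) {
      for (String column : columns) {
        client.fetchOne(db, table, partition, column);
      }
    }
  }

  // After: a single bulk request, which is the direction the patch appears
  // to take.
  static Map<String, List<Object>> fetchBulk(ColumnStatsClient client, String db,
      String table, List<String> partitions, List<String> columns) {
    return client.fetchMany(db, table, partitions, columns);
  }
}
{noformat}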

 Fetching column stats slower than the 101 during rush hour
 --

 Key: HIVE-6157
 URL: https://issues.apache.org/jira/browse/HIVE-6157
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Gunther Hagleitner
Assignee: Sergey Shelukhin
 Attachments: HIVE-6157.01.patch, HIVE-6157.01.patch, 
 HIVE-6157.nogen.patch, HIVE-6157.prelim.patch


 hive.stats.fetch.column.stats controls whether the column stats for a table 
 are fetched during explain (in Tez: during query planning). On my setup (1 
 table 4000 partitions, 24 columns) the time spent in semantic analyze goes 
 from ~1 second to ~66 seconds when turning the flag on. 65 seconds spent 
 fetching column stats...
 The reason is probably that the APIs force you to make separate metastore 
 calls for each column in each partition. That's probably the first thing that 
 has to change. The question is if in addition to that we need to cache this 
 in the client or store the stats as a single blob in the database to further 
 cut down on the time. However, the way it stands right now column stats seem 
 unusable.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 17162: HIVE-6157 Fetching column stats slow

2014-01-22 Thread Gunther Hagleitner

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17162/#review32600
---


With this, can we set the default of hive.stats.fetch.column.stats to true?
Can you please run TestMiniTezCliDriver (-Phadoop-2)? There might be some 
changes with this. (The pre-commit run only uses hadoop-1.)

- Gunther Hagleitner


On Jan. 23, 2014, 12:03 a.m., Sergey Shelukhin wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/17162/
 ---
 
 (Updated Jan. 23, 2014, 12:03 a.m.)
 
 
 Review request for hive and Gunther Hagleitner.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 see jira
 
 
 Diffs
 -
 
   
 itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
  1b01238 
   metastore/if/hive_metastore.thrift e4e816d 
   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
 58f9957 
   
 metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 
 ed05790 
   metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 
 4288781 
   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
 16f43e9 
   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
 794fadd 
   metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java f5ea2ef 
   
 metastore/src/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java 
 PRE-CREATION 
   
 metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
  003dc9c 
   
 metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
  6dd0852 
   
 metastore/src/test/org/apache/hadoop/hive/metastore/VerifyingObjectStore.java 
 c683fc9 
   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java d32deea 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/StatsOptimizer.java 608bef2 
   ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java b815ea2 
   ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java 384b49e 
   ql/src/test/results/clientpositive/metadataonly1.q.out 3500fd2 
 
 Diff: https://reviews.apache.org/r/17162/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Sergey Shelukhin