[jira] [Updated] (HIVE-6468) HS2 out of memory error when curl sends a get request

2014-06-26 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-6468:


Attachment: HIVE-6468.2.patch.txt

Good to know that someone is interested in this. Rebased to trunk.

 HS2 out of memory error when curl sends a get request
 -

 Key: HIVE-6468
 URL: https://issues.apache.org/jira/browse/HIVE-6468
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
 Environment: Centos 6.3, hive 12, hadoop-2.2
Reporter: Abin Shahab
Assignee: Navis
 Attachments: HIVE-6468.1.patch.txt, HIVE-6468.2.patch.txt


 We see an out of memory error when we run simple beeline calls.
 (The hive.server2.transport.mode is binary)
 curl localhost:1
 Exception in thread "pool-2-thread-8" java.lang.OutOfMemoryError: Java heap 
 space
   at 
 org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:181)
   at 
 org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
   at 
 org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
   at 
 org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
   at 
 org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
   at 
 org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:189)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:744)
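
A plausible mechanism for the crash (my reading of the stack trace, not taken from the patch): a SASL-framed Thrift transport begins by reading a length prefix off the wire, so the ASCII bytes of an HTTP request line get misread as an enormous frame length and the server tries to allocate a matching buffer. A minimal sketch; the real SASL message layout also carries a status byte, so the exact value differs:

```python
import struct

def sasl_frame_length(first_four_bytes: bytes) -> int:
    """Interpret four wire bytes as a big-endian frame length, the way a
    length-prefixed (SASL-framed) Thrift transport would."""
    (length,) = struct.unpack(">i", first_four_bytes)
    return length

# The first bytes an HTTP client such as curl sends:
print(sasl_frame_length(b"GET "))  # 1195725856, i.e. a ~1.1 GB buffer request
```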



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Comment Edited] (HIVE-7211) Throws exception if the name of conf var starts with hive. does not exists in HiveConf

2014-06-26 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043127#comment-14043127
 ] 

Navis edited comment on HIVE-7211 at 6/26/14 6:04 AM:
--

This adds several configuration parameters to HiveConf.java: 

* hive.test.dummystats.aggregator (internal)
* hive.test.dummystats.publisher (internal)
* hive.io.rcfile.record.interval
* hive.io.rcfile.column.number.conf
* hive.io.rcfile.tolerate.corruptions
* hive.io.rcfile.record.buffer.size
* hive.hbase.generatehfiles
* hive.index.compact.file (internal)
* hive.index.blockfilter.file (internal)

Except for the internal parameters, they need definitions in 
hive-default.xml.template.  Then they should be documented in the wiki 
(https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties).
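
The behavior this ticket introduces can be sketched like so (a hypothetical Python analogue, not HiveConf's actual code): any key with the `hive.` prefix that is not a known parameter is rejected immediately, so typos surface as errors instead of being silently ignored:

```python
# A tiny stand-in for HiveConf's parameter registry (names from this ticket).
KNOWN_PARAMS = {
    "hive.io.rcfile.record.interval",
    "hive.io.rcfile.tolerate.corruptions",
    "hive.hbase.generatehfiles",
}

def set_conf(conf: dict, key: str, value: str) -> None:
    # Unknown "hive."-prefixed keys are almost certainly typos: fail fast.
    if key.startswith("hive.") and key not in KNOWN_PARAMS:
        raise ValueError(f"hive configuration property does not exist: {key}")
    conf[key] = value

conf = {}
set_conf(conf, "hive.hbase.generatehfiles", "true")  # known key: accepted
set_conf(conf, "mapreduce.job.reduces", "4")         # non-hive key: passed through
```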


was (Author: le...@hortonworks.com):
This adds several configuration parameters to HiveConf.java: 

* hive.test.dummystats.aggregator
* hive.test.dummystats.publisher
* hive.io.rcfile.record.interval
* hive.io.rcfile.column.number.conf
* hive.io.rcfile.tolerate.corruptions
* hive.io.rcfile.record.buffer.size
* hive.hbase.generatehfiles
* hive.index.compact.file (internal)
* hive.index.blockfilter.file (internal)

Except for the internal parameters, they need definitions in 
hive-default.xml.template.  Then they should be documented in the wiki 
(https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties).

 Throws exception if the name of conf var starts with hive. does not exists 
 in HiveConf
 

 Key: HIVE-7211
 URL: https://issues.apache.org/jira/browse/HIVE-7211
 Project: Hive
  Issue Type: Improvement
  Components: Configuration
Reporter: Navis
Assignee: Navis
Priority: Trivial
  Labels: TODOC14
 Fix For: 0.14.0

 Attachments: HIVE-7211.1.patch.txt, HIVE-7211.2.patch.txt, 
 HIVE-7211.3.patch.txt, HIVE-7211.4.patch.txt


 Some typos in configurations are very hard to find.





[jira] [Commented] (HIVE-6637) UDF in_file() doesn't take CHAR or VARCHAR as input

2014-06-26 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044376#comment-14044376
 ] 

Szehon Ho commented on HIVE-6637:
-

Udf_infile test is failing, which looks related.  Can you make sure it passes 
before commit?

 UDF in_file() doesn't take CHAR or VARCHAR as input
 ---

 Key: HIVE-6637
 URL: https://issues.apache.org/jira/browse/HIVE-6637
 Project: Hive
  Issue Type: Bug
  Components: Types, UDF
Affects Versions: 0.14.0
Reporter: Xuefu Zhang
Assignee: Ashish Kumar Singh
 Attachments: HIVE-6637.1.patch, HIVE-6637.2.patch


 {code}
 hive> desc alter_varchar_1;
 key   string  None
 value varchar(3)  None
 key2  int None
 value2  varchar(10) None
 hive> select in_file(value, value2) from alter_varchar_1;
 FAILED: SemanticException [Error 10016]: Line 1:15 Argument type mismatch 
 'value': The 1st argument of function IN_FILE must be a string but 
 org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableHiveVarcharObjectInspector@10f1f34a
  was given.
 {code}
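
The standard fix for this family of bugs is to accept the whole string group (STRING, CHAR, VARCHAR) rather than testing for exactly STRING; a Python analogue of that argument check (the type names are illustrative, not Hive's ObjectInspector API):

```python
STRING_GROUP = {"STRING", "CHAR", "VARCHAR"}

def check_in_file_args(arg_types):
    """in_file(value, filename): both arguments must be string-family types."""
    for pos, t in enumerate(arg_types, start=1):
        if t.upper() not in STRING_GROUP:
            raise TypeError(
                f"Argument {pos} of in_file must be a string type, got {t}")

check_in_file_args(["varchar", "string"])  # accepted once CHAR/VARCHAR are allowed
```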





[jira] [Updated] (HIVE-2379) Hive/HBase integration could be improved

2014-06-26 Thread Gautam Gopalakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gautam Gopalakrishnan updated HIVE-2379:


Description: 
  For now any Hive/HBase queries would require the following jars to be 
explicitly added via hive's add jar command:

add jar /usr/lib/hive/lib/hbase-0.90.1-cdh3u0.jar;
add jar /usr/lib/hive/lib/hive-hbase-handler-0.7.0-cdh3u0.jar;
add jar /usr/lib/hive/lib/zookeeper-3.3.1.jar;
add jar /usr/lib/hive/lib/guava-r06.jar;

The longer-term solution, perhaps, is to have the code call HBase's 
TableMapReduceUtil.addDependencyJars(job, HBaseStorageHandler.class) at submit 
time to ship the jars in the DistributedCache.



 Hive/HBase integration could be improved
 

 Key: HIVE-2379
 URL: https://issues.apache.org/jira/browse/HIVE-2379
 Project: Hive
  Issue Type: Bug
  Components: CLI, Clients, HBase Handler
Affects Versions: 0.7.1, 0.8.0, 0.9.0
Reporter: Roman Shaposhnik
Assignee: Navis
Priority: Critical
 Fix For: 0.12.0

 Attachments: HIVE-2379-0.11.patch.txt, HIVE-2379.D7347.1.patch, 
 HIVE-2379.D7347.2.patch, HIVE-2379.D7347.3.patch


   For now any Hive/HBase queries would require the following jars to be 
 explicitly added via hive's add jar command:
 add jar /usr/lib/hive/lib/hbase-0.90.1-cdh3u0.jar;
 add jar /usr/lib/hive/lib/hive-hbase-handler-0.7.0-cdh3u0.jar;
 add jar /usr/lib/hive/lib/zookeeper-3.3.1.jar;
 add jar /usr/lib/hive/lib/guava-r06.jar;
 The longer-term solution, perhaps, is to have the code call HBase's 
 TableMapReduceUtil.addDependencyJars(job, HBaseStorageHandler.class) at 
 submit time to ship the jars in the DistributedCache.
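
The longer-term fix described above boils down to: for each handler class, find the jar that contains it and ship the de-duplicated set with the job, instead of hand-maintained `add jar` lines. A rough Python analogue of the resolution step (the class-to-jar mapping is simulated here; the real code path would be HBase's `TableMapReduceUtil`):

```python
def jars_for_classes(class_to_jar: dict, classes) -> list:
    """Collect the distinct jars containing the given classes, in first-seen
    order -- the set that would go into the DistributedCache."""
    seen, jars = set(), []
    for cls in classes:
        jar = class_to_jar[cls]  # real code would inspect the classloader
        if jar not in seen:
            seen.add(jar)
            jars.append(jar)
    return jars

# Simulated lookup, mirroring the four hand-added jars above.
LOCATIONS = {
    "HBaseStorageHandler": "hive-hbase-handler-0.7.0-cdh3u0.jar",
    "HTable": "hbase-0.90.1-cdh3u0.jar",
    "ZooKeeper": "zookeeper-3.3.1.jar",
    "Joiner": "guava-r06.jar",
}
print(jars_for_classes(LOCATIONS, ["HBaseStorageHandler", "HTable",
                                   "ZooKeeper", "Joiner"]))
```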





[jira] [Commented] (HIVE-6394) Implement Timestmap in ParquetSerde

2014-06-26 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044381#comment-14044381
 ] 

Szehon Ho commented on HIVE-6394:
-

Ah got it, thanks.  Looks good.  Just one (unrelated) note: since HIVE-6375 is 
committed in 0.13, should we qualify the CTAS limitation?

 Implement Timestmap in ParquetSerde
 ---

 Key: HIVE-6394
 URL: https://issues.apache.org/jira/browse/HIVE-6394
 Project: Hive
  Issue Type: Sub-task
  Components: Serializers/Deserializers
Reporter: Jarek Jarcec Cecho
Assignee: Szehon Ho
  Labels: Parquet, TODOC14
 Fix For: 0.14.0

 Attachments: HIVE-6394.2.patch, HIVE-6394.3.patch, HIVE-6394.4.patch, 
 HIVE-6394.5.patch, HIVE-6394.6.patch, HIVE-6394.6.patch, HIVE-6394.7.patch, 
 HIVE-6394.patch


 This JIRA is to implement timestamp support in Parquet SerDe.
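
For context on what "timestamp support" means here (my understanding of the Parquet convention Hive adopted, worth checking against the patch): a timestamp is packed into an INT96 holding nanoseconds-of-day plus a Julian day number. A sketch of that split:

```python
from datetime import datetime, timezone

JULIAN_DAY_OF_EPOCH = 2440588  # Julian day number of 1970-01-01

def to_parquet_int96(ts: datetime):
    """Split a UTC timestamp into (julian_day, nanos_of_day), the two fields
    packed into Parquet's 12-byte INT96 timestamp encoding."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    delta = ts - epoch
    nanos_of_day = delta.seconds * 1_000_000_000 + delta.microseconds * 1_000
    return JULIAN_DAY_OF_EPOCH + delta.days, nanos_of_day

print(to_parquet_int96(datetime(1970, 1, 2, 0, 0, 1, tzinfo=timezone.utc)))
# (2440589, 1000000000): one day after the epoch, one second into the day
```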





[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)

2014-06-26 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044384#comment-14044384
 ] 

Szehon Ho commented on HIVE-7220:
-

Forgot to rebase.  Thank you [~hagleitn] for that.

 Empty dir in external table causes issue (root_dir_external_table.q failure)
 

 Key: HIVE-7220
 URL: https://issues.apache.org/jira/browse/HIVE-7220
 Project: Hive
  Issue Type: Bug
Reporter: Szehon Ho
Assignee: Szehon Ho
 Attachments: HIVE-7220.2.patch, HIVE-7220.3.patch, HIVE-7220.4.patch, 
 HIVE-7220.patch


 While looking at root_dir_external_table.q failure, which is doing a query on 
 an external table located at root ('/'), I noticed that the latest Hadoop2 
 CombineFileInputFormat returns splits representing empty directories (like 
 '/Users'), which leads to failure in Hive's CombineFileRecordReader as it 
 tries to open the directory for processing.
 Tried with an external table in a normal HDFS directory, and it also returns 
 the same error.  Looks like a real bug.
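
The failure mode can be illustrated with a reader-side filter (a sketch for illustration only; the actual fix belongs in split generation, not in Hive's record reader): a reader that tries to open a bare directory as a file fails, so directory paths must never reach it.

```python
import os
import tempfile

def openable_paths(paths):
    """Drop bare directories from a candidate split list: a record reader
    that tries to open a directory (like '/Users') as a file will fail."""
    return [p for p in paths if not os.path.isdir(p)]

d = tempfile.mkdtemp()
f = os.path.join(d, "part-00000")
open(f, "w").close()
print(openable_paths([d, f]))  # only the real file survives
```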





[jira] [Updated] (HIVE-7211) Throws exception if the name of conf var starts with hive. does not exists in HiveConf

2014-06-26 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7211:


Labels:   (was: TODOC14)

 Throws exception if the name of conf var starts with hive. does not exists 
 in HiveConf
 

 Key: HIVE-7211
 URL: https://issues.apache.org/jira/browse/HIVE-7211
 Project: Hive
  Issue Type: Improvement
  Components: Configuration
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Fix For: 0.14.0

 Attachments: HIVE-7211.1.patch.txt, HIVE-7211.2.patch.txt, 
 HIVE-7211.3.patch.txt, HIVE-7211.4.patch.txt


 Some typos in configurations are very hard to find.





[jira] [Commented] (HIVE-7127) Handover more details on exception in hiveserver2

2014-06-26 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044386#comment-14044386
 ] 

Szehon Ho commented on HIVE-7127:
-

Thanks, +1 pending test

 Handover more details on exception in hiveserver2
 -

 Key: HIVE-7127
 URL: https://issues.apache.org/jira/browse/HIVE-7127
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-7127.1.patch.txt, HIVE-7127.2.patch.txt, 
 HIVE-7127.4.patch.txt, HIVE-7127.5.patch.txt


 NO PRECOMMIT TESTS
 Currently, JDBC hands over only the exception message and error code, which 
 is not very helpful for debugging.
 {noformat}
 org.apache.hive.service.cli.HiveSQLException: Error while compiling 
 statement: FAILED: ParseException line 1:0 cannot recognize input near 
 'createa' 'asd' 'EOF'
   at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:121)
   at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:109)
   at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:231)
   at org.apache.hive.beeline.Commands.execute(Commands.java:736)
   at org.apache.hive.beeline.Commands.sql(Commands.java:657)
   at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:889)
   at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:744)
   at 
 org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:459)
   at org.apache.hive.beeline.BeeLine.main(BeeLine.java:442)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
 {noformat}
 With this patch, the JDBC client can get more details from HiveServer2. 
 {noformat}
 Caused by: org.apache.hive.service.cli.HiveSQLException: Error while 
 compiling statement: FAILED: ParseException line 1:0 cannot recognize input 
 near 'createa' 'asd' 'EOF'
   at org.apache.hive.service.cli.operation.SQLOperation.prepare(Unknown 
 Source)
   at org.apache.hive.service.cli.operation.SQLOperation.run(Unknown 
 Source)
   at 
 org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(Unknown
  Source)
   at 
 org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(Unknown
  Source)
   at org.apache.hive.service.cli.CLIService.executeStatementAsync(Unknown 
 Source)
   at 
 org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(Unknown 
 Source)
   at 
 org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(Unknown
  Source)
   at 
 org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(Unknown
  Source)
   at org.apache.thrift.ProcessFunction.process(Unknown Source)
   at org.apache.thrift.TBaseProcessor.process(Unknown Source)
   at org.apache.hive.service.auth.TSetIpAddressProcessor.process(Unknown 
 Source)
   at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(Unknown 
 Source)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
   at java.lang.Thread.run(Unknown Source)
 {noformat}
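
As far as the description goes, the patch follows the common pattern of serializing the server-side traceback into the response and rebuilding a chained exception on the client; sketched generically in Python (not the actual Thrift plumbing):

```python
import traceback

def capture(exc: BaseException) -> list:
    """Server side: flatten an exception, stack and all, into plain strings
    that can travel inside a Thrift response."""
    return traceback.format_exception(type(exc), exc, exc.__traceback__)

class RemoteError(Exception):
    """Client side: carries the text of the server-side failure so the
    original frames appear in client logs."""
    def __init__(self, remote_trace):
        super().__init__("".join(remote_trace).rstrip())

try:
    raise ValueError("cannot recognize input near 'createa'")
except ValueError as e:
    wire = capture(e)  # what would be handed over by the server

client_side = RemoteError(wire)
print("ValueError" in str(client_side))  # True: remote frames preserved as text
```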





[jira] [Commented] (HIVE-7232) VectorReduceSink is emitting incorrect JOIN keys

2014-06-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044397#comment-14044397
 ] 

Hive QA commented on HIVE-7232:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12652546/HIVE-7232.2.patch.txt

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5654 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/594/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/594/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-594/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12652546

 VectorReduceSink is emitting incorrect JOIN keys
 

 Key: HIVE-7232
 URL: https://issues.apache.org/jira/browse/HIVE-7232
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.14.0
Reporter: Gopal V
Assignee: Gopal V
 Attachments: HIVE-7232-extra-logging.patch, HIVE-7232.1.patch.txt, 
 HIVE-7232.2.patch.txt, q5.explain.txt, q5.sql


 After HIVE-7121, tpc-h query5 has been returning incorrect results.
 Thanks to [~navis], it has been tracked down to the auto-parallel settings, 
 which were initialized for ReduceSinkOperator but not for 
 VectorReduceSinkOperator. The vectorized version inherits from it, but doesn't 
 call super.initializeOp() or set up the variable correctly from ReduceSinkDesc.
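
 The bug described above (a hypothetical Python analogue, not the actual Hive operator code) is the classic override-without-super pattern: the subclass never copies the auto-parallel setting out of the descriptor:

```python
class ReduceSink:
    def initialize_op(self, desc: dict) -> None:
        # The base operator picks up auto-parallelism from its descriptor.
        self.auto_parallel = desc.get("auto_parallel", False)

class VectorReduceSink(ReduceSink):
    def initialize_op(self, desc: dict) -> None:
        # BUG: missing super().initialize_op(desc), so auto_parallel
        # is never set and downstream key partitioning goes wrong.
        self.vectorized = True

sink = VectorReduceSink()
sink.initialize_op({"auto_parallel": True})
print(hasattr(sink, "auto_parallel"))  # False: the setting was silently lost
```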
 The query is tpc-h query5, with extra NULL checks just to be sure.
 {code}
 SELECT n_name,
sum(l_extendedprice * (1 - l_discount)) AS revenue
 FROM customer,
  orders,
  lineitem,
  supplier,
  nation,
  region
 WHERE c_custkey = o_custkey
   AND l_orderkey = o_orderkey
   AND l_suppkey = s_suppkey
   AND c_nationkey = s_nationkey
   AND s_nationkey = n_nationkey
   AND n_regionkey = r_regionkey
   AND r_name = 'ASIA'
   AND o_orderdate >= '1994-01-01'
   AND o_orderdate < '1995-01-01'
   and l_orderkey is not null
   and c_custkey is not null
   and l_suppkey is not null
   and c_nationkey is not null
   and s_nationkey is not null
   and n_regionkey is not null
 GROUP BY n_name
 ORDER BY revenue DESC;
 {code}
 The reducer which has the issue has the following plan
 {code}
 Reducer 3
 Reduce Operator Tree:
   Join Operator
 condition map:
  Inner Join 0 to 1
 condition expressions:
   0 {KEY.reducesinkkey0} {VALUE._col2}
   1 {VALUE._col0} {KEY.reducesinkkey0} {VALUE._col3}
 outputColumnNames: _col0, _col3, _col10, _col11, _col14
 Statistics: Num rows: 18344 Data size: 95229140992 Basic 
 stats: COMPLETE Column stats: NONE
 Reduce Output Operator
   key expressions: _col10 (type: int)
   sort order: +
   Map-reduce partition columns: _col10 (type: int)
   Statistics: Num rows: 18344 Data size: 95229140992 
 Basic stats: COMPLETE Column stats: NONE
   value expressions: _col0 (type: int), _col3 (type: int), 
 _col11 (type: int), _col14 (type: string)
 {code}





[jira] [Commented] (HIVE-6394) Implement Timestmap in ParquetSerde

2014-06-26 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044402#comment-14044402
 ] 

Lefty Leverenz commented on HIVE-6394:
--

Yes, good catch.  But apparently HIVE-6375 doesn't provide column rename 
support for Parquet -- is there another JIRA ticket for that?  (I'll edit the 
wiki and continue this discussion in HIVE-6375 comments.)

 Implement Timestmap in ParquetSerde
 ---

 Key: HIVE-6394
 URL: https://issues.apache.org/jira/browse/HIVE-6394
 Project: Hive
  Issue Type: Sub-task
  Components: Serializers/Deserializers
Reporter: Jarek Jarcec Cecho
Assignee: Szehon Ho
  Labels: Parquet, TODOC14
 Fix For: 0.14.0

 Attachments: HIVE-6394.2.patch, HIVE-6394.3.patch, HIVE-6394.4.patch, 
 HIVE-6394.5.patch, HIVE-6394.6.patch, HIVE-6394.6.patch, HIVE-6394.7.patch, 
 HIVE-6394.patch


 This JIRA is to implement timestamp support in Parquet SerDe.





[jira] [Commented] (HIVE-6375) Fix CTAS for parquet

2014-06-26 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044407#comment-14044407
 ] 

Lefty Leverenz commented on HIVE-6375:
--

Support for CTAS is documented in the wiki:

* [Language Manual -- Parquet -- Limitations | 
https://cwiki.apache.org/confluence/display/Hive/Parquet#Parquet-Limitations]

If there's another JIRA ticket for column rename, please link it to this 
ticket.  For now, the wiki continues to cite this ticket for column rename.

 Fix CTAS for parquet
 

 Key: HIVE-6375
 URL: https://issues.apache.org/jira/browse/HIVE-6375
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Brock Noland
Assignee: Szehon Ho
Priority: Critical
  Labels: Parquet
 Fix For: 0.13.0

 Attachments: HIVE-6375.2.patch, HIVE-6375.3.patch, HIVE-6375.4.patch, 
 HIVE-6375.patch


 More details here:
 https://github.com/Parquet/parquet-mr/issues/272





[jira] [Updated] (HIVE-6394) Implement Timestmap in ParquetSerde

2014-06-26 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-6394:
-

Labels: Parquet  (was: Parquet TODOC14)

 Implement Timestmap in ParquetSerde
 ---

 Key: HIVE-6394
 URL: https://issues.apache.org/jira/browse/HIVE-6394
 Project: Hive
  Issue Type: Sub-task
  Components: Serializers/Deserializers
Reporter: Jarek Jarcec Cecho
Assignee: Szehon Ho
  Labels: Parquet
 Fix For: 0.14.0

 Attachments: HIVE-6394.2.patch, HIVE-6394.3.patch, HIVE-6394.4.patch, 
 HIVE-6394.5.patch, HIVE-6394.6.patch, HIVE-6394.6.patch, HIVE-6394.7.patch, 
 HIVE-6394.patch


 This JIRA is to implement timestamp support in Parquet SerDe.





[jira] [Updated] (HIVE-5976) Decouple input formats from STORED as keywords

2014-06-26 Thread David Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Chen updated HIVE-5976:
-

Attachment: HIVE-5976.2.patch

I took a shot at rebasing Brock's patch on the current trunk. Uploading patch 
for pre-commit testing.

 Decouple input formats from STORED as keywords
 --

 Key: HIVE-5976
 URL: https://issues.apache.org/jira/browse/HIVE-5976
 Project: Hive
  Issue Type: Task
Reporter: Brock Noland
Assignee: Brock Noland
 Attachments: HIVE-5976.2.patch, HIVE-5976.patch, HIVE-5976.patch, 
 HIVE-5976.patch, HIVE-5976.patch


 As noted in HIVE-5783, we hard code the input formats mapped to keywords. 
 It'd be nice if there was a registration system so we didn't need to do that.
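
 One shape such a registration system could take (purely illustrative, not the committed design): a registry keyed by the STORED AS keyword, replacing the hard-coded keyword-to-format mapping:

```python
STORAGE_FORMATS = {}

def register_format(keyword: str, input_format: str, output_format: str) -> None:
    """Register the input/output format classes behind a STORED AS keyword."""
    STORAGE_FORMATS[keyword.upper()] = (input_format, output_format)

def resolve_format(keyword: str):
    try:
        return STORAGE_FORMATS[keyword.upper()]
    except KeyError:
        raise ValueError(f"Unrecognized file format in STORED AS clause: {keyword}")

register_format("TEXTFILE",
                "org.apache.hadoop.mapred.TextInputFormat",
                "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat")
register_format("RCFILE",
                "org.apache.hadoop.hive.ql.io.RCFileInputFormat",
                "org.apache.hadoop.hive.ql.io.RCFileOutputFormat")

print(resolve_format("rcfile")[0])  # keyword lookup is case-insensitive
```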





[jira] [Updated] (HIVE-5976) Decouple input formats from STORED as keywords

2014-06-26 Thread David Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Chen updated HIVE-5976:
-

Status: Patch Available  (was: Open)

 Decouple input formats from STORED as keywords
 --

 Key: HIVE-5976
 URL: https://issues.apache.org/jira/browse/HIVE-5976
 Project: Hive
  Issue Type: Task
Reporter: Brock Noland
Assignee: Brock Noland
 Attachments: HIVE-5976.2.patch, HIVE-5976.patch, HIVE-5976.patch, 
 HIVE-5976.patch, HIVE-5976.patch


 As noted in HIVE-5783, we hard code the input formats mapped to keywords. 
 It'd be nice if there was a registration system so we didn't need to do that.





[jira] [Commented] (HIVE-1662) Add file pruning into Hive.

2014-06-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044435#comment-14044435
 ] 

Hive QA commented on HIVE-1662:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12652549/HIVE-1662.16.patch.txt

{color:red}ERROR:{color} -1 due to 47 failed/errored test(s), 5670 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join18
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join18_multi_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join22
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join25
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join27
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join30
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join31
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_without_localtask
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cross_product_check_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explain_rearrange
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input39
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join28
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join29
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join31
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join32_lessSize
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join33
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join35
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_star
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_hook
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_mapjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_subquery
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_subquery2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multiMapJoin1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multiMapJoin2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multi_join_union
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nonblock_op_deduplicate
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nullgroup5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_union_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_reduce_deduplicate_exclude_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subq_where_serialization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in_having
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_context
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_nested_mapjoin
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_index_compact_entry_limit
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_index_compact_size_limit
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/595/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/595/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-595/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 47 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12652549

 Add file pruning into Hive.
 ---

 Key: HIVE-1662
 URL: https://issues.apache.org/jira/browse/HIVE-1662
 Project: Hive
  Issue Type: New Feature
Reporter: He Yongqiang
Assignee: Navis
 Attachments: HIVE-1662.10.patch.txt, HIVE-1662.11.patch.txt, 
 HIVE-1662.12.patch.txt, HIVE-1662.13.patch.txt, 

[jira] [Commented] (HIVE-7024) Escape control characters for explain result

2014-06-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044482#comment-14044482
 ] 

Hive QA commented on HIVE-7024:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12652550/HIVE-7024.3.patch.txt

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5669 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/596/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/596/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-596/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12652550

 Escape control characters for explain result
 

 Key: HIVE-7024
 URL: https://issues.apache.org/jira/browse/HIVE-7024
 Project: Hive
  Issue Type: Bug
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-7024.1.patch.txt, HIVE-7024.2.patch.txt, 
 HIVE-7024.3.patch.txt


 Comments for columns are now delimited by 0x00, which is binary and makes 
 git refuse to produce a proper diff file.
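
 The fix amounts to escaping non-printable bytes before they reach the explain output; a minimal sketch of such an escape (the exact escape sequence the patch emits may differ):

```python
def escape_control_chars(s: str) -> str:
    """Replace control characters (e.g. the 0x00 used to delimit column
    comments) with visible \\xNN escapes, keeping tabs and newlines, so
    explain output stays plain text and diffs cleanly."""
    return "".join(
        c if c.isprintable() or c in "\t\n" else f"\\x{ord(c):02x}"
        for c in s)

print(escape_control_chars("col1\x00a comment"))  # prints: col1\x00a comment
```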





[jira] [Updated] (HIVE-7296) big data approximate processing at a very low cost based on hive sql

2014-06-26 Thread wangmeng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangmeng updated HIVE-7296:
---

Description: 
For big data analysis, we often need the following kinds of queries and 
statistics:

1. Cardinality estimation: count the number of distinct elements in a 
collection, such as unique visitors (UV).

Hive query: select count(distinct id) from TestTable;

2. Frequency estimation: estimate how many times an element occurs, such as 
the site visits of a single user.

Hive query: select count(1) from TestTable where name = 'wangmeng';

3. Heavy hitters (top-k elements): for example, the top 100 shops.

Hive query: select count(1), name from TestTable group by name; (needs a UDF)

4. Range query: for example, find the number of users aged between 20 and 30.

Hive query: select count(1) from TestTable where age > 20 and age < 30;

5. Membership query: for example, is a given user name already registered?

With Hive's current execution model, such queries cost a large amount of 
memory and take a long time. However, in many cases we do not need exact 
results, and a small error can be tolerated. In such cases, approximate 
processing can greatly improve time and space efficiency.

Based on some theoretical analysis materials, I would very much like to work 
on these new features.

I am familiar with Hive and Hadoop, and I have implemented an efficient 
storage format based on Hive 
(https://github.com/sjtufighter/Data---Storage--).

So, is there anything I can do?  Many thanks.
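
Of the five query classes above, the membership query (5) has the most self-contained approximate structure: a Bloom filter answers "is this name registered?" in constant space with a tunable false-positive rate and no false negatives. A toy Python version (sizes and hash scheme are arbitrary choices for illustration):

```python
import hashlib

class BloomFilter:
    def __init__(self, m_bits: int = 1 << 16, k: int = 4):
        self.m, self.k = m_bits, k
        self.bits = bytearray(m_bits // 8)

    def _positions(self, item: str):
        # Derive k bit positions from salted MD5 digests of the item.
        for salt in range(self.k):
            digest = hashlib.md5(f"{salt}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item: str) -> bool:
        # May report false positives, never false negatives.
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))

registered = BloomFilter()
registered.add("wangmeng")
print("wangmeng" in registered)  # True
```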





 big data approximate processing  at a very  low cost  based on hive sql 
 

 Key: HIVE-7296
 URL: https://issues.apache.org/jira/browse/HIVE-7296
 Project: Hive
  Issue Type: New Feature
Reporter: wangmeng

 For big data analysis, we often need the following queries and statistics:
 1. Cardinality Estimation: count the number of distinct elements in a
 collection, such as Unique Visitors (UV).
 Now we can use a Hive query:
 select distinct(id) from TestTable;
 2. Frequency Estimation: estimate how many times an element is repeated,
 such as the number of site visits by a user.
 Hive query: select count(1) from TestTable where name='wangmeng'
 3. Heavy Hitters (top-k elements): for example, the top-100 shops.
 Hive query: select count(1), name from TestTable group by name; needs a UDF
 for top-k.
 4. Range Query: for example, find the number of users whose age is between
 20 and 30.
 Hive query: select count(1) from TestTable where age > 20 and age < 30
 5. Membership Query: for example, is a given user name already registered?
 Given Hive's implementation, such queries cost a large amount of memory and
 take a long time.
 However, in many cases we do not need exact results, and a small error can
 be tolerated. In such cases, approximate processing can greatly improve time
 and space efficiency.
 Now, based on some theoretical analysis materials, I would very much like to
 work on these new features.
 I am familiar with Hive and Hadoop, and I have implemented an efficient
 storage format based on Hive
 (https://github.com/sjtufighter/Data---Storage--).
 So, is there anything I can do? Many thanks.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7296) big data approximate processing at a very low cost based on hive sql

2014-06-26 Thread wangmeng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangmeng updated HIVE-7296:
---

Description: 
For big data analysis, we often need the following queries and statistics:

1. Cardinality Estimation: count the number of distinct elements in a
collection, such as Unique Visitors (UV).

Now we can use a Hive query:
select distinct(id) from TestTable;

2. Frequency Estimation: estimate how many times an element is repeated, such
as the number of site visits by a user.

Hive query: select count(1) from TestTable where name='wangmeng'

3. Heavy Hitters (top-k elements): for example, the top-100 shops.

Hive query: select count(1), name from TestTable group by name; needs a UDF
for top-k.

4. Range Query: for example, find the number of users whose age is between 20
and 30.

Hive query: select count(1) from TestTable where age > 20 and age < 30

5. Membership Query: for example, is a given user name already registered?

Given Hive's implementation, such queries cost a large amount of memory and
take a long time.

However, in many cases we do not need exact results, and a small error can be
tolerated. In such cases, approximate processing can greatly improve time and
space efficiency.

Now, based on some theoretical analysis materials, I would very much like to
work on these new features if possible.

I am familiar with Hive and Hadoop, and I have implemented an efficient
storage format based on Hive
(https://github.com/sjtufighter/Data---Storage--).

So, is there anything I can do? Many thanks.
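For concreteness, the frequency-estimation case (item 2) is commonly handled
with a count-min sketch. A minimal Python sketch of the idea, independent of
Hive (the class and table sizes here are illustrative assumptions):

```python
import hashlib

class CountMinSketch:
    """Count-min sketch: sub-linear-memory frequency estimates that never under-count."""

    def __init__(self, width=256, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, item):
        # One deterministic hash position per row, salted by the row number.
        digest = hashlib.md5(f"{row}:{item}".encode()).hexdigest()
        return int(digest, 16) % self.width

    def add(self, item):
        for row in range(self.depth):
            self.table[row][self._index(row, item)] += 1

    def estimate(self, item):
        # Collisions only inflate counters, so the minimum over rows is an
        # upper bound that is close to the true count with high probability.
        return min(self.table[row][self._index(row, item)]
                   for row in range(self.depth))

cms = CountMinSketch()
for _ in range(42):
    cms.add("wangmeng")
print(cms.estimate("wangmeng"))  # at least 42; exactly 42 unless hashes collide
```

The memory cost is fixed (width x depth counters) regardless of how many
distinct elements stream through, which is the trade-off the proposal above is
after.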


  was:
For big data analysis, we often need the following queries and statistics:

1. Cardinality Estimation: count the number of distinct elements in a
collection, such as Unique Visitors (UV).

Now we can use a Hive query:
select distinct(id) from TestTable;

2. Frequency Estimation: estimate how many times an element is repeated, such
as the number of site visits by a user.

Hive query: select count(1) from TestTable where name='wangmeng'

3. Heavy Hitters (top-k elements): for example, the top-100 shops.

Hive query: select count(1), name from TestTable group by name; needs a UDF
for top-k.

4. Range Query: for example, find the number of users whose age is between 20
and 30.

Hive query: select count(1) from TestTable where age > 20 and age < 30

5. Membership Query: for example, is a given user name already registered?

Given Hive's implementation, such queries cost a large amount of memory and
take a long time.

However, in many cases we do not need exact results, and a small error can be
tolerated. In such cases, approximate processing can greatly improve time and
space efficiency.

Now, based on some theoretical analysis materials, I would very much like to
work on these new features.

I am familiar with Hive and Hadoop, and I have implemented an efficient
storage format based on Hive
(https://github.com/sjtufighter/Data---Storage--).

So, is there anything I can do? Many thanks.



 big data approximate processing  at a very  low cost  based on hive sql 
 

 Key: HIVE-7296
 URL: https://issues.apache.org/jira/browse/HIVE-7296
 Project: Hive
  Issue Type: New Feature
Reporter: wangmeng

 For big data analysis, we often need the following queries and statistics:
 1. Cardinality Estimation: count the number of distinct elements in a
 collection, such as Unique Visitors (UV).
 Now we can use a Hive query:
 select distinct(id) from TestTable;
 2. Frequency Estimation: estimate how many times an element is repeated,
 such as the number of site visits by a user.
 Hive query: select count(1) from TestTable where name='wangmeng'
 3. Heavy Hitters (top-k elements): for example, the top-100 shops.
 Hive query: select count(1), name from TestTable group by name; needs a UDF
 for top-k.
 4. Range Query: for example, find the number of users whose age is between
 20 and 30.
 Hive query: select count(1) from TestTable where age > 20 and age < 30
 5. Membership Query: for example, is a given user name already registered?
 Given Hive's implementation, such queries cost a large amount of memory and
 take a long time.
 However, in many cases we do not need exact results, and a small error can
 be tolerated. In such cases, approximate processing can greatly improve time
 and space efficiency.
 Now, based on some theoretical analysis materials, I would very much like to
 work on these new features if possible.
 I am familiar with Hive and Hadoop, and I have implemented an efficient
 storage format based on Hive
 (https://github.com/sjtufighter/Data---Storage--).
 So, is there anything I can do? Many thanks.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-7299) Enable metadata only optimization on Tez

2014-06-26 Thread Gunther Hagleitner (JIRA)
Gunther Hagleitner created HIVE-7299:


 Summary: Enable metadata only optimization on Tez
 Key: HIVE-7299
 URL: https://issues.apache.org/jira/browse/HIVE-7299
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner


Enables the metadata only optimization (the one using OneNullRowInputFormat,
not the query-result-from-stats optimization)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Review Request 23006: Escape control characters for explain result

2014-06-26 Thread Navis Ryu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23006/
---

Review request for hive.


Bugs: HIVE-7024
https://issues.apache.org/jira/browse/HIVE-7024


Repository: hive-git


Description
---

Comments for columns are now delimited by 0x00, which is a binary character
and makes git refuse to produce a proper diff file.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/ExplainTask.java 92545d8 
  ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java 1149bda 
  ql/src/test/results/clientpositive/alter_partition_coltype.q.out e86cc06 
  ql/src/test/results/clientpositive/annotate_stats_filter.q.out c7d58f6 
  ql/src/test/results/clientpositive/annotate_stats_groupby.q.out 6f72964 
  ql/src/test/results/clientpositive/annotate_stats_join.q.out cc816c8 
  ql/src/test/results/clientpositive/annotate_stats_part.q.out a0b4602 
  ql/src/test/results/clientpositive/annotate_stats_select.q.out 97e9473 
  ql/src/test/results/clientpositive/annotate_stats_table.q.out bb2d18c 
  ql/src/test/results/clientpositive/annotate_stats_union.q.out 6d179b6 
  ql/src/test/results/clientpositive/auto_join_reordering_values.q.out 3f4f902 
  ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out 72640df 
  ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out c660cd0 
  ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out 4abda32 
  ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out 52a3194 
  ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out d807791 
  ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out 35e0a30 
  ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out af3d9d6 
  ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out 05ef5d8 
  ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out e423d14 
  ql/src/test/results/clientpositive/binary_output_format.q.out 294aabb 
  ql/src/test/results/clientpositive/bucket1.q.out f3eb15c 
  ql/src/test/results/clientpositive/bucket2.q.out 9a22160 
  ql/src/test/results/clientpositive/bucket3.q.out 8fa9c7b 
  ql/src/test/results/clientpositive/bucket4.q.out 032272b 
  ql/src/test/results/clientpositive/bucket5.q.out d19fbe5 
  ql/src/test/results/clientpositive/bucket_map_join_1.q.out 8674a6c 
  ql/src/test/results/clientpositive/bucket_map_join_2.q.out 8a5984d 
  ql/src/test/results/clientpositive/bucketcontext_1.q.out 1513515 
  ql/src/test/results/clientpositive/bucketcontext_2.q.out d18a9be 
  ql/src/test/results/clientpositive/bucketcontext_3.q.out e12c155 
  ql/src/test/results/clientpositive/bucketcontext_4.q.out 77b4882 
  ql/src/test/results/clientpositive/bucketcontext_5.q.out fa1cfc5 
  ql/src/test/results/clientpositive/bucketcontext_6.q.out aac66f8 
  ql/src/test/results/clientpositive/bucketcontext_7.q.out 78c4f94 
  ql/src/test/results/clientpositive/bucketcontext_8.q.out ad7fec9 
  ql/src/test/results/clientpositive/bucketmapjoin1.q.out 10f1af4 
  ql/src/test/results/clientpositive/bucketmapjoin10.q.out 88ecf40 
  ql/src/test/results/clientpositive/bucketmapjoin11.q.out 4ee1fa0 
  ql/src/test/results/clientpositive/bucketmapjoin12.q.out 9253f4a 
  ql/src/test/results/clientpositive/bucketmapjoin13.q.out b380fab 
  ql/src/test/results/clientpositive/bucketmapjoin2.q.out 297412f 
  ql/src/test/results/clientpositive/bucketmapjoin3.q.out 7f307a0 
  ql/src/test/results/clientpositive/bucketmapjoin4.q.out f0f9aee 
  ql/src/test/results/clientpositive/bucketmapjoin5.q.out 79e1c3d 
  ql/src/test/results/clientpositive/bucketmapjoin7.q.out 76baf50 
  ql/src/test/results/clientpositive/bucketmapjoin8.q.out 94fdbde 
  ql/src/test/results/clientpositive/bucketmapjoin9.q.out c9f4c17 
  ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out 751e32f 
  ql/src/test/results/clientpositive/bucketmapjoin_negative2.q.out 3eb70d1 
  ql/src/test/results/clientpositive/bucketmapjoin_negative3.q.out 34abe4f 
  ql/src/test/results/clientpositive/columnstats_partlvl.q.out 6128770 
  ql/src/test/results/clientpositive/columnstats_tbllvl.q.out 35af846 
  ql/src/test/results/clientpositive/ctas.q.out 0040f3c 
  ql/src/test/results/clientpositive/disable_merge_for_bucketing.q.out 4f9bb94 
  ql/src/test/results/clientpositive/display_colstats_tbllvl.q.out 8b7afae 
  ql/src/test/results/clientpositive/filter_join_breaktask.q.out b379f86 
  ql/src/test/results/clientpositive/groupby_map_ppr.q.out c7ca521 
  ql/src/test/results/clientpositive/groupby_map_ppr_multi_distinct.q.out 
00e2b6d 
  ql/src/test/results/clientpositive/groupby_ppr.q.out 57e886d 
  ql/src/test/results/clientpositive/groupby_ppr_multi_distinct.q.out f8073ff 
  ql/src/test/results/clientpositive/groupby_sort_1_23.q.out 38a0678 
  ql/src/test/results/clientpositive/groupby_sort_6.q.out ca5ad8f 
  ql/src/test/results/clientpositive/groupby_sort_skew_1_23.q.out ac54e7d 
  

[jira] [Updated] (HIVE-7299) Enable metadata only optimization on Tez

2014-06-26 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-7299:
-

Status: Patch Available  (was: Open)

 Enable metadata only optimization on Tez
 

 Key: HIVE-7299
 URL: https://issues.apache.org/jira/browse/HIVE-7299
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-7299.1.patch


 Enables the metadata only optimization (the one using OneNullRowInputFormat,
 not the query-result-from-stats optimization)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7299) Enable metadata only optimization on Tez

2014-06-26 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-7299:
-

Attachment: HIVE-7299.1.patch

 Enable metadata only optimization on Tez
 

 Key: HIVE-7299
 URL: https://issues.apache.org/jira/browse/HIVE-7299
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-7299.1.patch


 Enables the metadata only optimization (the one using OneNullRowInputFormat,
 not the query-result-from-stats optimization)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-1662) Add file pruning into Hive.

2014-06-26 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-1662:


Attachment: HIVE-1662.17.patch.txt

Hopefully this is the final patch.

 Add file pruning into Hive.
 ---

 Key: HIVE-1662
 URL: https://issues.apache.org/jira/browse/HIVE-1662
 Project: Hive
  Issue Type: New Feature
Reporter: He Yongqiang
Assignee: Navis
 Attachments: HIVE-1662.10.patch.txt, HIVE-1662.11.patch.txt, 
 HIVE-1662.12.patch.txt, HIVE-1662.13.patch.txt, HIVE-1662.14.patch.txt, 
 HIVE-1662.15.patch.txt, HIVE-1662.16.patch.txt, HIVE-1662.17.patch.txt, 
 HIVE-1662.8.patch.txt, HIVE-1662.9.patch.txt, HIVE-1662.D8391.1.patch, 
 HIVE-1662.D8391.2.patch, HIVE-1662.D8391.3.patch, HIVE-1662.D8391.4.patch, 
 HIVE-1662.D8391.5.patch, HIVE-1662.D8391.6.patch, HIVE-1662.D8391.7.patch


 Hive now supports a filename virtual column.
 If a file-name filter is present in a query, Hive should add only the files
 that pass the filter to the input paths.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)

2014-06-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044524#comment-14044524
 ] 

Hive QA commented on HIVE-7220:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12652552/HIVE-7220.4.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5669 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/597/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/597/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-597/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12652552

 Empty dir in external table causes issue (root_dir_external_table.q failure)
 

 Key: HIVE-7220
 URL: https://issues.apache.org/jira/browse/HIVE-7220
 Project: Hive
  Issue Type: Bug
Reporter: Szehon Ho
Assignee: Szehon Ho
 Attachments: HIVE-7220.2.patch, HIVE-7220.3.patch, HIVE-7220.4.patch, 
 HIVE-7220.patch


 While looking at the root_dir_external_table.q failure, which runs a query
 on an external table located at root ('/'), I noticed that the latest
 Hadoop 2 CombineFileInputFormat returns splits representing empty
 directories (like '/Users'), which leads to a failure in Hive's
 CombineFileRecordReader as it tries to open the directory for processing.
 Tried with an external table in a normal HDFS directory, and it returns the
 same error. Looks like a real bug.
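A hedged sketch of the kind of fix this description suggests (a hypothetical
helper, not the actual Hive patch): build splits only from regular files, so a
(possibly empty) directory entry returned by the listing can never reach the
record reader.

```python
import os
import tempfile

def list_input_files(root):
    """Collect regular files under root, skipping (possibly empty) directories.

    Hypothetical illustration: an input format should build splits only from
    file paths, so a directory entry like '/Users' never reaches the reader.
    """
    files = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            files.append(os.path.join(dirpath, name))
    return sorted(files)

# Demo: a root containing one data file and one empty subdirectory.
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "empty_dir"))
with open(os.path.join(root, "data.txt"), "w") as f:
    f.write("row1\n")

print(list_input_files(root))  # only the data file; empty_dir is skipped
```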



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7090) Support session-level temporary tables in Hive

2014-06-26 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-7090:
-

Attachment: HIVE-7090.5.patch

Patch v5. The show create table tests were failing due to a formatting change
(a one-space difference) in show create table; updated the diffs.
stats19.q was failing because the scratchdir changes added the session ID to
the path, which caused some pathnames to exceed the max prefix length used in
the tests. The test seems a bit fragile since it depends on the path length of
the build directory, but increasing the max prefix length before hashing
should get it to pass.

 Support session-level temporary tables in Hive
 --

 Key: HIVE-7090
 URL: https://issues.apache.org/jira/browse/HIVE-7090
 Project: Hive
  Issue Type: Bug
  Components: SQL
Reporter: Gunther Hagleitner
Assignee: Jason Dere
 Attachments: HIVE-7090.1.patch, HIVE-7090.2.patch, HIVE-7090.3.patch, 
 HIVE-7090.4.patch, HIVE-7090.5.patch


 It's common to see SQL scripts that create some temporary table as an
 intermediate result, run some additional queries against it, and then clean
 up at the end.
 We should support temporary tables properly, meaning we automatically manage
 the life cycle and make sure visibility is restricted to the creating
 connection/session. Without this, it's common to see leftover tables in the
 metastore or weird errors from clashing temp table names.
 Proposed syntax:
 CREATE TEMPORARY TABLE 
 CTAS, CTL, and INSERT INTO should all be supported as usual.
 Knowing that a user wants a temp table lets us further optimize access to
 it, e.g. temp tables should be kept in memory where possible, and
 compactions and merging of table files aren't required.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7298) desc database extended does not show properties of the database

2014-06-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044607#comment-14044607
 ] 

Hive QA commented on HIVE-7298:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12652558/HIVE-7298.1.patch.txt

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 5655 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_db_owner
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_owner_actions_db
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_database_location
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_database_properties
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hive.hcatalog.cli.TestSemanticAnalysis.testDescDB
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/599/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/599/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-599/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12652558

 desc database extended does not show properties of the database
 ---

 Key: HIVE-7298
 URL: https://issues.apache.org/jira/browse/HIVE-7298
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-7298.1.patch.txt


 HIVE-6386 added owner information to desc, but did not update its schema.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[GitHub] hive pull request: Fix lock/unlock pairing

2014-06-26 Thread pavel-sakun
GitHub user pavel-sakun opened a pull request:

https://github.com/apache/hive/pull/17

Fix lock/unlock pairing

Prevent IllegalMonitorStateException in case stmtHandle is null

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pavel-sakun/hive 
hive-statement-illegalmonitorstateexception

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/17.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17


commit 9468a23bfe76cd5be5c747998ec0c055750db2d3
Author: Pavel Sakun pavel_sa...@epam.com
Date:   2014-06-26T13:26:38Z

Fix lock/unlock pairing

Prevent IllegalMonitorStateException in case stmtHandle is null




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (HIVE-6468) HS2 out of memory error when curl sends a get request

2014-06-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044731#comment-14044731
 ] 

Hive QA commented on HIVE-6468:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12652560/HIVE-6468.2.patch.txt

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5654 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_stats_empty_partition
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/600/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/600/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-600/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12652560

 HS2 out of memory error when curl sends a get request
 -

 Key: HIVE-6468
 URL: https://issues.apache.org/jira/browse/HIVE-6468
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
 Environment: Centos 6.3, hive 12, hadoop-2.2
Reporter: Abin Shahab
Assignee: Navis
 Attachments: HIVE-6468.1.patch.txt, HIVE-6468.2.patch.txt


 We see an out of memory error when we run simple beeline calls.
 (The hive.server2.transport.mode is binary)
 curl localhost:1
 Exception in thread pool-2-thread-8 java.lang.OutOfMemoryError: Java heap 
 space
   at 
 org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:181)
   at 
 org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
   at 
 org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
   at 
 org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
   at 
 org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
   at 
 org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:189)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:744)
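The stack trace is consistent with the framed SASL transport reading the first
four bytes of the plain HTTP request as a big-endian frame length (this is an
interpretation of the trace, not text from the report). The arithmetic shows
why a single allocation can blow the heap:

```python
# The first four bytes an HTTP client sends are the start of the request line.
prefix = b"GET "

# A framed transport interprets them as a big-endian 32-bit payload length.
frame_length = int.from_bytes(prefix, "big")

print(frame_length)                       # 1195725856
print(f"{frame_length / 2**30:.2f} GiB")  # ~1.11 GiB buffer requested at once
```

So a single stray GET request can make the server try to allocate roughly a
gigabyte for one "frame", which plausibly explains the OutOfMemoryError in
receiveSaslMessage.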



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-7300) When creating database by specifying location, .db is not created

2014-06-26 Thread sourabh potnis (JIRA)
sourabh potnis created HIVE-7300:


 Summary: When creating database by specifying location, .db is not 
created
 Key: HIVE-7300
 URL: https://issues.apache.org/jira/browse/HIVE-7300
 Project: Hive
  Issue Type: Bug
Reporter: sourabh potnis


When I create a database without specifying a location:
e.g. create database test;
it is created in /apps/hive/warehouse/ as /apps/hive/warehouse/test.db

But when I create a database by specifying a location:
e.g. create database test_loc location '/addh0010/hive/addh0011/warehouse';
the database is created, but /addh0010/hive/addh0011/warehouse/test_loc does
not get created.

So if a user creates two tables with the same name in two different databases
at the same location, we cannot be sure where the table was created.

So when a database is created with a location, a .db directory with the
database name should be created at that location.




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7300) When creating database by specifying location, .db is not created

2014-06-26 Thread sourabh potnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sourabh potnis updated HIVE-7300:
-

Description: 
When I create a database without specifying a location:
e.g. create database test;
it is created in /apps/hive/warehouse/ as /apps/hive/warehouse/test.db

But when I create a database by specifying a location:
e.g. create database test_loc location '/addh0010/hive/addh0011/warehouse';
the database is created, but /addh0010/hive/addh0011/warehouse/test_loc.db
does not get created.

So if a user creates two tables with the same name in two different databases
at the same location, we cannot be sure where the table was created.

So when a database is created with a location, a .db directory with the
database name should be created at that location.


  was:
When I create a database without specifying a location:
e.g. create database test;
it is created in /apps/hive/warehouse/ as /apps/hive/warehouse/test.db

But when I create a database by specifying a location:
e.g. create database test_loc location '/addh0010/hive/addh0011/warehouse';
the database is created, but /addh0010/hive/addh0011/warehouse/test_loc does
not get created.

So if a user creates two tables with the same name in two different databases
at the same location, we cannot be sure where the table was created.

So when a database is created with a location, a .db directory with the
database name should be created at that location.



 When creating database by specifying location, .db is not created
 -

 Key: HIVE-7300
 URL: https://issues.apache.org/jira/browse/HIVE-7300
 Project: Hive
  Issue Type: Bug
Reporter: sourabh potnis

 When I create a database without specifying a location:
 e.g. create database test;
 it is created in /apps/hive/warehouse/ as /apps/hive/warehouse/test.db
 But when I create a database by specifying a location:
 e.g. create database test_loc location '/addh0010/hive/addh0011/warehouse';
 the database is created, but /addh0010/hive/addh0011/warehouse/test_loc.db
 does not get created.
 So if a user creates two tables with the same name in two different
 databases at the same location, we cannot be sure where the table was
 created.
 So when a database is created with a location, a .db directory with the
 database name should be created at that location.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7300) When creating database by specifying location, .db is not created

2014-06-26 Thread sourabh potnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sourabh potnis updated HIVE-7300:
-

Labels: .db database location  (was: )

 When creating database by specifying location, .db is not created
 -

 Key: HIVE-7300
 URL: https://issues.apache.org/jira/browse/HIVE-7300
 Project: Hive
  Issue Type: Bug
Reporter: sourabh potnis
  Labels: .db, database, location

 When I create a database without specifying a location:
 e.g. create database test;
 it is created in /apps/hive/warehouse/ as /apps/hive/warehouse/test.db
 But when I create a database by specifying a location:
 e.g. create database test_loc location '/addh0010/hive/addh0011/warehouse';
 the database is created, but /addh0010/hive/addh0011/warehouse/test_loc.db
 does not get created.
 So if a user creates two tables with the same name in two different
 databases at the same location, we cannot be sure where the table was
 created.
 So when a database is created with a location, a .db directory with the
 database name should be created at that location.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-5976) Decouple input formats from STORED as keywords

2014-06-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044793#comment-14044793
 ] 

Hive QA commented on HIVE-5976:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12652566/HIVE-5976.2.patch

{color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 5671 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_file_format
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ctas
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_storage_format_descriptor
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_ctas
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_fileformat_bad_class
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_genericFileFormat
org.apache.hive.hcatalog.pig.TestOrcHCatLoader.testReadDataPrimitiveTypes
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchAbort
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchAbortAndCommit
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Delimited
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/601/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/601/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-601/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 16 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12652566

 Decouple input formats from STORED as keywords
 --

 Key: HIVE-5976
 URL: https://issues.apache.org/jira/browse/HIVE-5976
 Project: Hive
  Issue Type: Task
Reporter: Brock Noland
Assignee: Brock Noland
 Attachments: HIVE-5976.2.patch, HIVE-5976.patch, HIVE-5976.patch, 
 HIVE-5976.patch, HIVE-5976.patch


 As noted in HIVE-5783, we hard code the input formats mapped to keywords. 
 It'd be nice if there was a registration system so we didn't need to do that.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-5976) Decouple input formats from STORED as keywords

2014-06-26 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044838#comment-14044838
 ] 

Brock Noland commented on HIVE-5976:


Thank you [~davidzchen]! Some of those failures are unrelated... I think the 
partition wise ones are related though.

 Decouple input formats from STORED as keywords
 --

 Key: HIVE-5976
 URL: https://issues.apache.org/jira/browse/HIVE-5976
 Project: Hive
  Issue Type: Task
Reporter: Brock Noland
Assignee: Brock Noland
 Attachments: HIVE-5976.2.patch, HIVE-5976.patch, HIVE-5976.patch, 
 HIVE-5976.patch, HIVE-5976.patch


 As noted in HIVE-5783, we hard code the input formats mapped to keywords. 
 It'd be nice if there was a registration system so we didn't need to do that.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6637) UDF in_file() doesn't take CHAR or VARCHAR as input

2014-06-26 Thread Ashish Kumar Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Kumar Singh updated HIVE-6637:
-

Attachment: HIVE-6637.3.patch

Add missing data file.

 UDF in_file() doesn't take CHAR or VARCHAR as input
 ---

 Key: HIVE-6637
 URL: https://issues.apache.org/jira/browse/HIVE-6637
 Project: Hive
  Issue Type: Bug
  Components: Types, UDF
Affects Versions: 0.14.0
Reporter: Xuefu Zhang
Assignee: Ashish Kumar Singh
 Attachments: HIVE-6637.1.patch, HIVE-6637.2.patch, HIVE-6637.3.patch


 {code}
 hive> desc alter_varchar_1;
 key      string       None
 value    varchar(3)   None
 key2     int          None
 value2   varchar(10)  None
 hive> select in_file(value, value2) from alter_varchar_1;
 FAILED: SemanticException [Error 10016]: Line 1:15 Argument type mismatch 
 'value': The 1st argument of function IN_FILE must be a string but 
 org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableHiveVarcharObjectInspector@10f1f34a
  was given.
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 22772: HIVE-6637: UDF in_file() doesn't take CHAR or VARCHAR as input

2014-06-26 Thread Ashish Singh

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22772/
---

(Updated June 26, 2014, 5:12 p.m.)


Review request for hive.


Changes
---

Add missing data file.


Bugs: HIVE-6637
https://issues.apache.org/jira/browse/HIVE-6637


Repository: hive-git


Description
---

HIVE-6637: UDF in_file() doesn't take CHAR or VARCHAR as input


Diffs (updated)
-

  data/files/in_file.dat PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFInFile.java 
ea52537d0b85191f0b633a29aa3f7ddb556c288d 
  ql/src/test/queries/clientpositive/udf_in_file.q 
9d9efe8e23d6e73429ee5cd2c8470359ba2b3498 
  ql/src/test/results/clientpositive/udf_in_file.q.out 
b63143760d80f3f6a8ba0a23c0d87e8bb86fce66 

Diff: https://reviews.apache.org/r/22772/diff/


Testing
---

Tested with qtest.


Thanks,

Ashish Singh



[jira] [Commented] (HIVE-6637) UDF in_file() doesn't take CHAR or VARCHAR as input

2014-06-26 Thread Ashish Kumar Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044880#comment-14044880
 ] 

Ashish Kumar Singh commented on HIVE-6637:
--

[~szehon] sorry about not catching this earlier. The patch was missing a data 
file. Updated patch and rb.

 UDF in_file() doesn't take CHAR or VARCHAR as input
 ---

 Key: HIVE-6637
 URL: https://issues.apache.org/jira/browse/HIVE-6637
 Project: Hive
  Issue Type: Bug
  Components: Types, UDF
Affects Versions: 0.14.0
Reporter: Xuefu Zhang
Assignee: Ashish Kumar Singh
 Attachments: HIVE-6637.1.patch, HIVE-6637.2.patch, HIVE-6637.3.patch


 {code}
 hive> desc alter_varchar_1;
 key      string       None
 value    varchar(3)   None
 key2     int          None
 value2   varchar(10)  None
 hive> select in_file(value, value2) from alter_varchar_1;
 FAILED: SemanticException [Error 10016]: Line 1:15 Argument type mismatch 
 'value': The 1st argument of function IN_FILE must be a string but 
 org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableHiveVarcharObjectInspector@10f1f34a
  was given.
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7298) desc database extended does not show properties of the database

2014-06-26 Thread Sumit Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044918#comment-14044918
 ] 

Sumit Kumar commented on HIVE-7298:
---

+1

 desc database extended does not show properties of the database
 ---

 Key: HIVE-7298
 URL: https://issues.apache.org/jira/browse/HIVE-7298
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-7298.1.patch.txt


 HIVE-6386 added owner information to desc, but did not update its schema.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7295) FileStatus.getOwner on Windows returns name of group the user belongs to, instead of user name expected, fails many authorization related unit tests

2014-06-26 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044947#comment-14044947
 ] 

Xiaobing Zhou commented on HIVE-7295:
-

Thanks [~cnauroth]. 

Since this is Windows specific, we can use this code snippet for work-around.

if (user.equals(stat.getOwner())) {
  if (dirPerms.getUserAction().implies(action)) {
    continue;
  }
}

if (ArrayUtils.contains(groups, stat.getGroup())) {
  if (dirPerms.getGroupAction().implies(action)) {
    continue;
  }
}

if (dirPerms.getOtherAction().implies(action)) {
  continue;
}

// Windows-specific add-on: on Windows, stat.getOwner() returns the
// "Administrators" group name for files created by admin users.
if (Shell.WINDOWS) {
  if (ArrayUtils.contains(groups, stat.getOwner())) {
    if (dirPerms.getUserAction().implies(action)) {
      continue;
    }
  }
}


With this change, all of the affected test cases pass, even when the login 
user is a member of the admin group.

   

 FileStatus.getOwner on Windows returns name of group the user belongs to, 
 instead of user name expected, fails many authorization related unit tests
 

 Key: HIVE-7295
 URL: https://issues.apache.org/jira/browse/HIVE-7295
 Project: Hive
  Issue Type: Bug
  Components: Authorization, HCatalog, Security, Windows
Affects Versions: 0.13.0
 Environment: Windows Server 2008 R2
Reporter: Xiaobing Zhou
Priority: Critical

 Unit test in TestHdfsAuthorizationProvider, e.g. 
 org.apache.hcatalog.security.TestHdfsAuthorizationProvider.testTableOps. 
 fails to run.
 Running org.apache.hcatalog.security.TestHdfsAuthorizationProvider
 Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 15.799 sec 
  FAILURE! - in org.apache.hcatalog.security.TestHdfsAuthorizationProvider
 testTableOps(org.apache.hcatalog.security.TestHdfsAuthorizationProvider)  
 Time elapsed: 15.546 sec   FAILURE!
 junit.framework.AssertionFailedError: FAILED: AuthorizationException 
 org.apache.hadoop.security.AccessControlException: action WRITE not permitted 
 on path pfile:/Users/xz
 hou/hworks/workspace/hwx-hive-ws/hive/hcatalog/core/target/warehouse for user 
 xzhou expected:<0> but was:<4>
 at junit.framework.Assert.fail(Assert.java:50)
 at junit.framework.Assert.failNotEquals(Assert.java:287)
 at junit.framework.Assert.assertEquals(Assert.java:67)
 at junit.framework.Assert.assertEquals(Assert.java:199)
 at 
 org.apache.hcatalog.security.TestHdfsAuthorizationProvider.exec(TestHdfsAuthorizationProvider.java:172)
 at 
 org.apache.hcatalog.security.TestHdfsAuthorizationProvider.testTableOps(TestHdfsAuthorizationProvider.java:307)
 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 22772: HIVE-6637: UDF in_file() doesn't take CHAR or VARCHAR as input

2014-06-26 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22772/#review46768
---



data/files/in_file.dat
https://reviews.apache.org/r/22772/#comment82335

Could we reuse existing file instead of creating new ones?


- Xuefu Zhang


On June 26, 2014, 5:12 p.m., Ashish Singh wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/22772/
 ---
 
 (Updated June 26, 2014, 5:12 p.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-6637
 https://issues.apache.org/jira/browse/HIVE-6637
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 HIVE-6637: UDF in_file() doesn't take CHAR or VARCHAR as input
 
 
 Diffs
 -
 
   data/files/in_file.dat PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFInFile.java 
 ea52537d0b85191f0b633a29aa3f7ddb556c288d 
   ql/src/test/queries/clientpositive/udf_in_file.q 
 9d9efe8e23d6e73429ee5cd2c8470359ba2b3498 
   ql/src/test/results/clientpositive/udf_in_file.q.out 
 b63143760d80f3f6a8ba0a23c0d87e8bb86fce66 
 
 Diff: https://reviews.apache.org/r/22772/diff/
 
 
 Testing
 ---
 
 Tested with qtest.
 
 
 Thanks,
 
 Ashish Singh
 




[jira] [Commented] (HIVE-1662) Add file pruning into Hive.

2014-06-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044963#comment-14044963
 ] 

Hive QA commented on HIVE-1662:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12652584/HIVE-1662.17.patch.txt

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 5655 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nullgroup5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_union_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_view
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_index_compact_entry_limit
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_index_compact_size_limit
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/602/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/602/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-602/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12652584

 Add file pruning into Hive.
 ---

 Key: HIVE-1662
 URL: https://issues.apache.org/jira/browse/HIVE-1662
 Project: Hive
  Issue Type: New Feature
Reporter: He Yongqiang
Assignee: Navis
 Attachments: HIVE-1662.10.patch.txt, HIVE-1662.11.patch.txt, 
 HIVE-1662.12.patch.txt, HIVE-1662.13.patch.txt, HIVE-1662.14.patch.txt, 
 HIVE-1662.15.patch.txt, HIVE-1662.16.patch.txt, HIVE-1662.17.patch.txt, 
 HIVE-1662.8.patch.txt, HIVE-1662.9.patch.txt, HIVE-1662.D8391.1.patch, 
 HIVE-1662.D8391.2.patch, HIVE-1662.D8391.3.patch, HIVE-1662.D8391.4.patch, 
 HIVE-1662.D8391.5.patch, HIVE-1662.D8391.6.patch, HIVE-1662.D8391.7.patch


 Hive now supports a filename virtual column.
 If a file name filter is present in a query, Hive should be able to add only 
 the files that pass the filter to the input paths.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7287) hive --rcfilecat command is broken on Windows

2014-06-26 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-7287:
-

Status: Patch Available  (was: Open)

+1

 hive --rcfilecat command is broken on Windows
 -

 Key: HIVE-7287
 URL: https://issues.apache.org/jira/browse/HIVE-7287
 Project: Hive
  Issue Type: Bug
  Components: CLI, Windows
Affects Versions: 0.13.0
 Environment: Windows
Reporter: Deepesh Khandelwal
Assignee: Deepesh Khandelwal
 Attachments: HIVE-7287.1.patch


 {noformat}
 c:\> hive --rcfilecat --file-sizes --column-sizes-pretty /tmp/all100krc
 Not a valid JAR: C:\org.apache.hadoop.hive.cli.RCFileCat
 {noformat}
 NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6375) Fix CTAS for parquet

2014-06-26 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044992#comment-14044992
 ] 

Szehon Ho commented on HIVE-6375:
-

[~leftylev] I believe HIVE-6938 adds support for parquet column-rename (with a 
flag).  I added a link.

 Fix CTAS for parquet
 

 Key: HIVE-6375
 URL: https://issues.apache.org/jira/browse/HIVE-6375
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Brock Noland
Assignee: Szehon Ho
Priority: Critical
  Labels: Parquet
 Fix For: 0.13.0

 Attachments: HIVE-6375.2.patch, HIVE-6375.3.patch, HIVE-6375.4.patch, 
 HIVE-6375.patch


 More details here:
 https://github.com/Parquet/parquet-mr/issues/272



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7270) SerDe Properties are not considered by show create table Command

2014-06-26 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044998#comment-14044998
 ] 

Jason Dere commented on HIVE-7270:
--

+1. Is the failure in authorization_ctas related to the patch?

 SerDe Properties are not considered by show create table Command
 

 Key: HIVE-7270
 URL: https://issues.apache.org/jira/browse/HIVE-7270
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.13.1
Reporter: Renil J
Assignee: Navis
Priority: Minor
 Attachments: HIVE-7270.1.patch.txt


 The Hive table DDL generated by the show create table target_table command 
 does not contain the SerDe properties of the target table, even though the 
 table has specific SerDe properties.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 23006: Escape control characters for explain result

2014-06-26 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23006/#review46773
---



ql/src/java/org/apache/hadoop/hive/ql/exec/ExplainTask.java
https://reviews.apache.org/r/23006/#comment82355

What difference does this change make?



ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java
https://reviews.apache.org/r/23006/#comment82357

For my understanding, why can't we simply replace 0x00 with a different 
character such as ' '? Why are we dealing with quotes and commas? Can you 
give an example of what's transformed into what?


- Xuefu Zhang


On June 26, 2014, 9:05 a.m., Navis Ryu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/23006/
 ---
 
 (Updated June 26, 2014, 9:05 a.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-7024
 https://issues.apache.org/jira/browse/HIVE-7024
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Comments for columns are now delimited by 0x00, which is binary and makes git 
 refuse to produce a proper diff file.
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/exec/ExplainTask.java 92545d8 
   ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java 1149bda 
   ql/src/test/results/clientpositive/alter_partition_coltype.q.out e86cc06 
   ql/src/test/results/clientpositive/annotate_stats_filter.q.out c7d58f6 
   ql/src/test/results/clientpositive/annotate_stats_groupby.q.out 6f72964 
   ql/src/test/results/clientpositive/annotate_stats_join.q.out cc816c8 
   ql/src/test/results/clientpositive/annotate_stats_part.q.out a0b4602 
   ql/src/test/results/clientpositive/annotate_stats_select.q.out 97e9473 
   ql/src/test/results/clientpositive/annotate_stats_table.q.out bb2d18c 
   ql/src/test/results/clientpositive/annotate_stats_union.q.out 6d179b6 
   ql/src/test/results/clientpositive/auto_join_reordering_values.q.out 
 3f4f902 
   ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out 72640df 
   ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out c660cd0 
   ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out 4abda32 
   ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out 52a3194 
   ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out d807791 
   ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out 35e0a30 
   ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out af3d9d6 
   ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out 05ef5d8 
   ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out e423d14 
   ql/src/test/results/clientpositive/binary_output_format.q.out 294aabb 
   ql/src/test/results/clientpositive/bucket1.q.out f3eb15c 
   ql/src/test/results/clientpositive/bucket2.q.out 9a22160 
   ql/src/test/results/clientpositive/bucket3.q.out 8fa9c7b 
   ql/src/test/results/clientpositive/bucket4.q.out 032272b 
   ql/src/test/results/clientpositive/bucket5.q.out d19fbe5 
   ql/src/test/results/clientpositive/bucket_map_join_1.q.out 8674a6c 
   ql/src/test/results/clientpositive/bucket_map_join_2.q.out 8a5984d 
   ql/src/test/results/clientpositive/bucketcontext_1.q.out 1513515 
   ql/src/test/results/clientpositive/bucketcontext_2.q.out d18a9be 
   ql/src/test/results/clientpositive/bucketcontext_3.q.out e12c155 
   ql/src/test/results/clientpositive/bucketcontext_4.q.out 77b4882 
   ql/src/test/results/clientpositive/bucketcontext_5.q.out fa1cfc5 
   ql/src/test/results/clientpositive/bucketcontext_6.q.out aac66f8 
   ql/src/test/results/clientpositive/bucketcontext_7.q.out 78c4f94 
   ql/src/test/results/clientpositive/bucketcontext_8.q.out ad7fec9 
   ql/src/test/results/clientpositive/bucketmapjoin1.q.out 10f1af4 
   ql/src/test/results/clientpositive/bucketmapjoin10.q.out 88ecf40 
   ql/src/test/results/clientpositive/bucketmapjoin11.q.out 4ee1fa0 
   ql/src/test/results/clientpositive/bucketmapjoin12.q.out 9253f4a 
   ql/src/test/results/clientpositive/bucketmapjoin13.q.out b380fab 
   ql/src/test/results/clientpositive/bucketmapjoin2.q.out 297412f 
   ql/src/test/results/clientpositive/bucketmapjoin3.q.out 7f307a0 
   ql/src/test/results/clientpositive/bucketmapjoin4.q.out f0f9aee 
   ql/src/test/results/clientpositive/bucketmapjoin5.q.out 79e1c3d 
   ql/src/test/results/clientpositive/bucketmapjoin7.q.out 76baf50 
   ql/src/test/results/clientpositive/bucketmapjoin8.q.out 94fdbde 
   ql/src/test/results/clientpositive/bucketmapjoin9.q.out c9f4c17 
   ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out 751e32f 
   ql/src/test/results/clientpositive/bucketmapjoin_negative2.q.out 3eb70d1 
   ql/src/test/results/clientpositive/bucketmapjoin_negative3.q.out 34abe4f 
   

[jira] [Commented] (HIVE-7295) FileStatus.getOwner on Windows returns name of group the user belongs to, instead of user name expected, fails many authorization related unit tests

2014-06-26 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045014#comment-14045014
 ] 

Chris Nauroth commented on HIVE-7295:
-

Hi, [~xiaobingo].  Overall, I recommend running tests as a non-admin user.  If 
you really prefer to put a workaround in the code, then I recommend limiting 
the scope of the special case.  This code won't be capable of telling the 
difference between a user named foo and a group named foo.  It's common for 
files to have greater permissions for the owner vs. the group, so if an 
attacker named bar somehow manages to sneak foo into his group memberships, 
then it could cause elevation of privileges.  (This is probably unlikely, but I 
wanted to point it out.)

A couple of suggestions:
# Only trigger the special case if running on Windows and the {{FileSystem}} 
represents a local file system.  This Administrators special case does not 
apply to other file systems (HDFS or S3 for example).
# Only allow it for Administrators, not any group.  This behavior of setting 
ownership of new files to Administrators is a special case for members of the 
Administrators group only.

BTW, there is also a Windows policy setting that can be changed so that it 
won't automatically set ownership of new files to Administrators.  This might 
be another option if you prefer to keep running tests as an admin.
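The two suggestions above can be sketched as a small predicate. This is an illustrative sketch only, not Hive's actual code: the class and method names are hypothetical, and the boolean flags stand in for runtime checks such as Shell.WINDOWS and an instanceof test against LocalFileSystem.

```java
// Hypothetical sketch of the narrowed Windows special case described above.
public final class WindowsOwnerCheck {

    private static final String ADMIN_GROUP = "Administrators";

    /**
     * Returns true only when the "Administrators owns the file" special case
     * should grant the owner's permission bits to the current user: we must be
     * on Windows, against a local file system, the file's owner must be
     * exactly the Administrators group, and the user must belong to it.
     */
    public static boolean ownerActionApplies(boolean isWindows,
                                             boolean isLocalFs,
                                             String statOwner,
                                             String[] userGroups) {
        if (!isWindows || !isLocalFs) {
            return false; // special case applies to the Windows local FS only
        }
        if (!ADMIN_GROUP.equals(statOwner)) {
            return false; // only when ownership defaulted to Administrators
        }
        for (String group : userGroups) {
            if (ADMIN_GROUP.equals(group)) {
                return true; // user really is a member of Administrators
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] groups = {"Administrators", "Users"};
        System.out.println(ownerActionApplies(true, true, "Administrators", groups));  // true
        System.out.println(ownerActionApplies(false, true, "Administrators", groups)); // false
    }
}
```

Keeping the check this narrow avoids the user-named-foo vs. group-named-foo ambiguity for every other owner value and for non-local file systems.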

 FileStatus.getOwner on Windows returns name of group the user belongs to, 
 instead of user name expected, fails many authorization related unit tests
 

 Key: HIVE-7295
 URL: https://issues.apache.org/jira/browse/HIVE-7295
 Project: Hive
  Issue Type: Bug
  Components: Authorization, HCatalog, Security, Windows
Affects Versions: 0.13.0
 Environment: Windows Server 2008 R2
Reporter: Xiaobing Zhou
Priority: Critical

 Unit test in TestHdfsAuthorizationProvider, e.g. 
 org.apache.hcatalog.security.TestHdfsAuthorizationProvider.testTableOps. 
 fails to run.
 Running org.apache.hcatalog.security.TestHdfsAuthorizationProvider
 Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 15.799 sec 
  FAILURE! - in org.apache.hcatalog.security.TestHdfsAuthorizationProvider
 testTableOps(org.apache.hcatalog.security.TestHdfsAuthorizationProvider)  
 Time elapsed: 15.546 sec   FAILURE!
 junit.framework.AssertionFailedError: FAILED: AuthorizationException 
 org.apache.hadoop.security.AccessControlException: action WRITE not permitted 
 on path pfile:/Users/xz
 hou/hworks/workspace/hwx-hive-ws/hive/hcatalog/core/target/warehouse for user 
 xzhou expected:<0> but was:<4>
 at junit.framework.Assert.fail(Assert.java:50)
 at junit.framework.Assert.failNotEquals(Assert.java:287)
 at junit.framework.Assert.assertEquals(Assert.java:67)
 at junit.framework.Assert.assertEquals(Assert.java:199)
 at 
 org.apache.hcatalog.security.TestHdfsAuthorizationProvider.exec(TestHdfsAuthorizationProvider.java:172)
 at 
 org.apache.hcatalog.security.TestHdfsAuthorizationProvider.testTableOps(TestHdfsAuthorizationProvider.java:307)
 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7299) Enable metadata only optimization on Tez

2014-06-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045029#comment-14045029
 ] 

Hive QA commented on HIVE-7299:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12652583/HIVE-7299.1.patch

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 5669 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_script_env_var1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_script_env_var2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union4
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union5
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union6
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union7
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union8
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hive.hcatalog.pig.TestOrcHCatLoader.testReadDataPrimitiveTypes
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/603/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/603/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-603/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12652583

 Enable metadata only optimization on Tez
 

 Key: HIVE-7299
 URL: https://issues.apache.org/jira/browse/HIVE-7299
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-7299.1.patch


 Enables the metadata-only optimization (the one using OneNullRowInputFormat, 
 not the query-result-from-stats optimization).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7296) big data approximate processing at a very low cost based on hive sql

2014-06-26 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045050#comment-14045050
 ] 

Xuefu Zhang commented on HIVE-7296:
---

Does this help you in any way?
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Sampling

 big data approximate processing  at a very  low cost  based on hive sql 
 

 Key: HIVE-7296
 URL: https://issues.apache.org/jira/browse/HIVE-7296
 Project: Hive
  Issue Type: New Feature
Reporter: wangmeng

 For big data analysis, we often need the following queries and statistics:
 1. Cardinality estimation: count the number of distinct elements in a 
 collection, such as Unique Visitors (UV).
 Today we can use the Hive query:
 select distinct(id) from TestTable;
 2. Frequency estimation: estimate how many times an element occurs, such as 
 the number of site visits by a user.
 Hive query: select count(1) from TestTable where name='wangmeng';
 3. Heavy hitters (top-k elements): for example, the top-100 shops.
 Hive query: select count(1), name from TestTable group by name; (needs a UDF)
 4. Range query: for example, find the number of users aged between 20 and 30.
 Hive query: select count(1) from TestTable where age > 20 and age < 30;
 5. Membership query: for example, is a user name already registered?
 Given Hive's implementation mechanism, such queries can cost a large amount 
 of memory and a long query time.
 However, in many cases we do not need very accurate results, and a small 
 error can be tolerated. In such cases, approximate processing can greatly 
 improve time and space efficiency.
 Now, based on some theoretical analysis materials, I would very much like to 
 work on these new features if possible.
 I am familiar with Hive and Hadoop, and I have implemented an efficient 
 storage format based on Hive 
 (https://github.com/sjtufighter/Data---Storage--).
 So, is there anything I can do? Many thanks.
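As one concrete illustration of the approximate-processing idea, the membership query (item 5) is the classic use case for a Bloom filter. The sketch below is a minimal stand-alone example written for this discussion, not Hive code: it never reports a false negative, and false positives are possible but rare for a suitably sized bit array.

```java
import java.util.BitSet;

// Minimal Bloom filter illustrating approximate membership queries.
public final class BloomFilter {

    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    public BloomFilter(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // Derive the i-th bit index from two base hashes (double hashing).
    private int index(String item, int i) {
        int h1 = item.hashCode();
        int h2 = Integer.rotateLeft(h1, 16) ^ 0x9e3779b9;
        return Math.floorMod(h1 + i * h2, numBits);
    }

    public void add(String item) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(item, i));
        }
    }

    /** True means "possibly present"; false means "definitely not present". */
    public boolean mightContain(String item) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(item, i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        BloomFilter registered = new BloomFilter(1 << 16, 4);
        registered.add("wangmeng");
        System.out.println(registered.mightContain("wangmeng")); // true: no false negatives
    }
}
```

The space cost is a fixed bit array regardless of how many names are stored, which is the trade-off being proposed: bounded memory in exchange for a small, tunable false-positive rate.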



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7097) The Support for REGEX Column Broken in HIVE 0.13

2014-06-26 Thread Sumit Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045065#comment-14045065
 ] 

Sumit Kumar commented on HIVE-7097:
---

[~sunrui] I hit this today and found following references useful:

# 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AlterColumn
# 
https://issues.apache.org/jira/secure/attachment/12618321/QuotedIdentifier.html

In short, the functionality is still there, but you need to set 
hive.support.quoted.identifiers to none to get the pre-0.13 behavior. I was 
able to run my query after:
{code:actionscript}
hive> set hive.support.quoted.identifiers=none;
{code}

My query was something like:
{code:actionscript}
hive> select `(col1|col2|col3)?+.+` from testTable1;
{code}


 The Support for REGEX Column Broken in HIVE 0.13
 

 Key: HIVE-7097
 URL: https://issues.apache.org/jira/browse/HIVE-7097
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Sun Rui

 The Support for REGEX Column is OK in HIVE 0.12, but is broken in HIVE 0.13.
 For example:
 {code:sql}
 select `key.*` from src limit 1;
 {code}
 will fail in HIVE 0.13 with the following error from SemanticAnalyzer:
 {noformat}
 FAILED: SemanticException [Error 10004]: Line 1:7 Invalid table alias or 
 column reference 'key.*': (possible column names are: key, value)
 {noformat}
 This issue is related to HIVE-6037. When set 
 hive.support.quoted.identifiers=none, the issue will be gone.
 I am not sure the configuration was intended to break regex column. But at 
 least the documentation needs to be updated: 
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select#LanguageManualSelect-REGEXColumnSpecification
 I would argue backward compatibility is more important.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6037) Synchronize HiveConf with hive-default.xml.template and support show conf

2014-06-26 Thread Sumit Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045070#comment-14045070
 ] 

Sumit Kumar commented on HIVE-6037:
---

[~leftylev] Here is the JIRA that decided to remove hive-default.xml and move 
all configuration definitions into HiveConf itself.

 Synchronize HiveConf with hive-default.xml.template and support show conf
 -

 Key: HIVE-6037
 URL: https://issues.apache.org/jira/browse/HIVE-6037
 Project: Hive
  Issue Type: Improvement
  Components: Configuration
Reporter: Navis
Assignee: Navis
Priority: Minor
 Fix For: 0.14.0

 Attachments: CHIVE-6037.3.patch.txt, HIVE-6037-0.13.0, 
 HIVE-6037.1.patch.txt, HIVE-6037.10.patch.txt, HIVE-6037.11.patch.txt, 
 HIVE-6037.12.patch.txt, HIVE-6037.14.patch.txt, HIVE-6037.15.patch.txt, 
 HIVE-6037.16.patch.txt, HIVE-6037.17.patch, HIVE-6037.2.patch.txt, 
 HIVE-6037.4.patch.txt, HIVE-6037.5.patch.txt, HIVE-6037.6.patch.txt, 
 HIVE-6037.7.patch.txt, HIVE-6037.8.patch.txt, HIVE-6037.9.patch.txt, 
 HIVE-6037.patch


 see HIVE-5879



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7097) The Support for REGEX Column Broken in HIVE 0.13

2014-06-26 Thread Sumit Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045084#comment-14045084
 ] 

Sumit Kumar commented on HIVE-7097:
---

Basically this doesn't seem to be an issue but it would help if we clarify this 
in [Select 
documentation|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select]
 as well . 

 The Support for REGEX Column Broken in HIVE 0.13
 

 Key: HIVE-7097
 URL: https://issues.apache.org/jira/browse/HIVE-7097
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Sun Rui

 The Support for REGEX Column is OK in HIVE 0.12, but is broken in HIVE 0.13.
 For example:
 {code:sql}
 select `key.*` from src limit 1;
 {code}
 will fail in HIVE 0.13 with the following error from SemanticAnalyzer:
 {noformat}
 FAILED: SemanticException [Error 10004]: Line 1:7 Invalid table alias or 
 column reference 'key.*': (possible column names are: key, value)
 {noformat}
 This issue is related to HIVE-6037. When set 
 hive.support.quoted.identifiers=none, the issue will be gone.
 I am not sure the configuration was intended to break regex column. But at 
 least the documentation needs to be updated: 
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select#LanguageManualSelect-REGEXColumnSpecification
 I would argue backward compatibility is more important.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6694) Beeline should provide a way to execute shell command as Hive CLI does

2014-06-26 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045116#comment-14045116
 ] 

Xuefu Zhang commented on HIVE-6694:
---

[~brocknoland] Just make sure that I understand correctly. Did you mean I need 
to check if e.getLocalizedMessage() is null? Would you like to see something 
like the following:
{code}
String msg = e.getLocalizedMessage();
if (msg == null)
  msg = unknown cause
beeLine.error(Failed to execute Shell command:  + msg);
{code}

 Beeline should provide a way to execute shell command as Hive CLI does
 --

 Key: HIVE-6694
 URL: https://issues.apache.org/jira/browse/HIVE-6694
 Project: Hive
  Issue Type: Improvement
  Components: CLI, Clients
Affects Versions: 0.11.0, 0.12.0, 0.13.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Fix For: 0.14.0

 Attachments: HIVE-6694.1.patch, HIVE-6694.1.patch, HIVE-6694.2.patch, 
 HIVE-6694.3.patch, HIVE-6694.4.patch, HIVE-6694.patch


 Hive CLI allows a user to execute a shell command using ! notation. For 
 instance, !cat myfile.txt. Being able to execute shell command may be 
 important for some users. As a replacement, however, Beeline provides no such 
 capability, possibly because ! notation is reserved for SQLLine commands. 
 It's possible to provide this using a slightly syntactic variation such as 
 !sh cat myfilie.txt.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7211) Throws exception if the name of conf var starts with hive. does not exists in HiveConf

2014-06-26 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045117#comment-14045117
 ] 

Daniel Dai commented on HIVE-7211:
--

This patch move several public constant to HiveConf such as 
RCFile.COLUMN_NUMBER_CONF_STR. This might break downstream projects. Shall we 
retain it for several release but mark it deprecated?

 Throws exception if the name of conf var starts with hive. does not exists 
 in HiveConf
 

 Key: HIVE-7211
 URL: https://issues.apache.org/jira/browse/HIVE-7211
 Project: Hive
  Issue Type: Improvement
  Components: Configuration
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Fix For: 0.14.0

 Attachments: HIVE-7211.1.patch.txt, HIVE-7211.2.patch.txt, 
 HIVE-7211.3.patch.txt, HIVE-7211.4.patch.txt


 Some typos in configurations are very hard to find.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)

2014-06-26 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045200#comment-14045200
 ] 

Szehon Ho commented on HIVE-7220:
-

It's weird that dynpart_sort_optimization failed twice here, but I couldn't 
reproduce it running locally.  Also looked through the test logs and didnt see 
any related stacktraces.

 Empty dir in external table causes issue (root_dir_external_table.q failure)
 

 Key: HIVE-7220
 URL: https://issues.apache.org/jira/browse/HIVE-7220
 Project: Hive
  Issue Type: Bug
Reporter: Szehon Ho
Assignee: Szehon Ho
 Attachments: HIVE-7220.2.patch, HIVE-7220.3.patch, HIVE-7220.4.patch, 
 HIVE-7220.patch


 While looking at root_dir_external_table.q failure, which is doing a query on 
 an external table located at root ('/'), I noticed that latest Hadoop2 
 CombineFileInputFormat returns split representing empty directories (like 
 '/Users'), which leads to failure in Hive's CombineFileRecordReader as it 
 tries to open the directory for processing.
 Tried with an external table in a normal HDFS directory, and it also returns 
 the same error.  Looks like a real bug.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7090) Support session-level temporary tables in Hive

2014-06-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045239#comment-14045239
 ] 

Hive QA commented on HIVE-7090:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12652594/HIVE-7090.5.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5670 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/604/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/604/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-604/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12652594

 Support session-level temporary tables in Hive
 --

 Key: HIVE-7090
 URL: https://issues.apache.org/jira/browse/HIVE-7090
 Project: Hive
  Issue Type: Bug
  Components: SQL
Reporter: Gunther Hagleitner
Assignee: Jason Dere
 Attachments: HIVE-7090.1.patch, HIVE-7090.2.patch, HIVE-7090.3.patch, 
 HIVE-7090.4.patch, HIVE-7090.5.patch


 It's common to see sql scripts that create some temporary table as an 
 intermediate result, run some additional queries against it and then clean up 
 at the end.
 We should support temporary tables properly, meaning automatically manage the 
 life cycle and make sure the visibility is restricted to the creating 
 connection/session. Without these it's common to see left over tables in 
 meta-store or weird errors with clashing tmp table names.
 Proposed syntax:
 CREATE TEMPORARY TABLE 
 CTAS, CTL, INSERT INTO, should all be supported as usual.
 Knowing that a user wants a temp table can enable us to further optimize 
 access to it. E.g.: temp tables should be kept in memory where possible, 
 compactions and merging table files aren't required, ...



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-5976) Decouple input formats from STORED as keywords

2014-06-26 Thread David Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045285#comment-14045285
 ] 

David Chen commented on HIVE-5976:
--

[~brocknoland] No problem! Would you be fine if I picked this up and fix the 
test failures?

 Decouple input formats from STORED as keywords
 --

 Key: HIVE-5976
 URL: https://issues.apache.org/jira/browse/HIVE-5976
 Project: Hive
  Issue Type: Task
Reporter: Brock Noland
Assignee: Brock Noland
 Attachments: HIVE-5976.2.patch, HIVE-5976.patch, HIVE-5976.patch, 
 HIVE-5976.patch, HIVE-5976.patch


 As noted in HIVE-5783, we hard code the input formats mapped to keywords. 
 It'd be nice if there was a registration system so we didn't need to do that.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6637) UDF in_file() doesn't take CHAR or VARCHAR as input

2014-06-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045288#comment-14045288
 ] 

Hive QA commented on HIVE-6637:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12652648/HIVE-6637.3.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5669 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/605/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/605/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-605/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12652648

 UDF in_file() doesn't take CHAR or VARCHAR as input
 ---

 Key: HIVE-6637
 URL: https://issues.apache.org/jira/browse/HIVE-6637
 Project: Hive
  Issue Type: Bug
  Components: Types, UDF
Affects Versions: 0.14.0
Reporter: Xuefu Zhang
Assignee: Ashish Kumar Singh
 Attachments: HIVE-6637.1.patch, HIVE-6637.2.patch, HIVE-6637.3.patch


 {code}
 hive desc alter_varchar_1;
 key   string  None
 value varchar(3)  None
 key2  int None
 value2varchar(10) None
 hive select in_file(value, value2) from alter_varchar_1;
 FAILED: SemanticException [Error 10016]: Line 1:15 Argument type mismatch 
 'value': The 1st argument of function IN_FILE must be a string but 
 org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableHiveVarcharObjectInspector@10f1f34a
  was given.
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6637) UDF in_file() doesn't take CHAR or VARCHAR as input

2014-06-26 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045292#comment-14045292
 ] 

Xuefu Zhang commented on HIVE-6637:
---

I left some minor comment on RB.

 UDF in_file() doesn't take CHAR or VARCHAR as input
 ---

 Key: HIVE-6637
 URL: https://issues.apache.org/jira/browse/HIVE-6637
 Project: Hive
  Issue Type: Bug
  Components: Types, UDF
Affects Versions: 0.14.0
Reporter: Xuefu Zhang
Assignee: Ashish Kumar Singh
 Attachments: HIVE-6637.1.patch, HIVE-6637.2.patch, HIVE-6637.3.patch


 {code}
 hive desc alter_varchar_1;
 key   string  None
 value varchar(3)  None
 key2  int None
 value2varchar(10) None
 hive select in_file(value, value2) from alter_varchar_1;
 FAILED: SemanticException [Error 10016]: Line 1:15 Argument type mismatch 
 'value': The 1st argument of function IN_FILE must be a string but 
 org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableHiveVarcharObjectInspector@10f1f34a
  was given.
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7105) Enable ReduceRecordProcessor to generate VectorizedRowBatches

2014-06-26 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7105:
---

Status: Open  (was: Patch Available)

 Enable ReduceRecordProcessor to generate VectorizedRowBatches
 -

 Key: HIVE-7105
 URL: https://issues.apache.org/jira/browse/HIVE-7105
 Project: Hive
  Issue Type: Bug
  Components: Tez, Vectorization
Reporter: Rajesh Balamohan
Assignee: Gopal V
 Fix For: 0.14.0

 Attachments: HIVE-7105.1.patch, HIVE-7105.2.patch


 Currently, ReduceRecordProcessor sends one key,value pair at a time to its 
 operator pipeline.  It would be beneficial to send VectorizedRowBatch to 
 downstream operators. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7105) Enable ReduceRecordProcessor to generate VectorizedRowBatches

2014-06-26 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7105:
---

Status: Patch Available  (was: Open)

Re-run tests, please.

 Enable ReduceRecordProcessor to generate VectorizedRowBatches
 -

 Key: HIVE-7105
 URL: https://issues.apache.org/jira/browse/HIVE-7105
 Project: Hive
  Issue Type: Bug
  Components: Tez, Vectorization
Reporter: Rajesh Balamohan
Assignee: Gopal V
 Fix For: 0.14.0

 Attachments: HIVE-7105.1.patch, HIVE-7105.2.patch


 Currently, ReduceRecordProcessor sends one key,value pair at a time to its 
 operator pipeline.  It would be beneficial to send VectorizedRowBatch to 
 downstream operators. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7232) VectorReduceSink is emitting incorrect JOIN keys

2014-06-26 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-7232:
--

  Resolution: Fixed
Release Note: VectorReduceSink is emitting incorrect JOIN keys (Navis, via 
Gopal V)
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Committed to trunk, thanks!

 VectorReduceSink is emitting incorrect JOIN keys
 

 Key: HIVE-7232
 URL: https://issues.apache.org/jira/browse/HIVE-7232
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.14.0
Reporter: Gopal V
Assignee: Gopal V
 Attachments: HIVE-7232-extra-logging.patch, HIVE-7232.1.patch.txt, 
 HIVE-7232.2.patch.txt, q5.explain.txt, q5.sql


 After HIVE-7121, tpc-h query5 has resulted in incorrect results.
 Thanks to [~navis], it has been tracked down to the auto-parallel settings 
 which were initialized for ReduceSinkOperator, but not for 
 VectorReduceSinkOperator. The vector version inherits, but doesn't call 
 super.initializeOp() or set up the variable correctly from ReduceSinkDesc.
 The query is tpc-h query5, with extra NULL checks just to be sure.
 {code}
 ELECT n_name,
sum(l_extendedprice * (1 - l_discount)) AS revenue
 FROM customer,
  orders,
  lineitem,
  supplier,
  nation,
  region
 WHERE c_custkey = o_custkey
   AND l_orderkey = o_orderkey
   AND l_suppkey = s_suppkey
   AND c_nationkey = s_nationkey
   AND s_nationkey = n_nationkey
   AND n_regionkey = r_regionkey
   AND r_name = 'ASIA'
   AND o_orderdate = '1994-01-01'
   AND o_orderdate  '1995-01-01'
   and l_orderkey is not null
   and c_custkey is not null
   and l_suppkey is not null
   and c_nationkey is not null
   and s_nationkey is not null
   and n_regionkey is not null
 GROUP BY n_name
 ORDER BY revenue DESC;
 {code}
 The reducer which has the issue has the following plan
 {code}
 Reducer 3
 Reduce Operator Tree:
   Join Operator
 condition map:
  Inner Join 0 to 1
 condition expressions:
   0 {KEY.reducesinkkey0} {VALUE._col2}
   1 {VALUE._col0} {KEY.reducesinkkey0} {VALUE._col3}
 outputColumnNames: _col0, _col3, _col10, _col11, _col14
 Statistics: Num rows: 18344 Data size: 95229140992 Basic 
 stats: COMPLETE Column stats: NONE
 Reduce Output Operator
   key expressions: _col10 (type: int)
   sort order: +
   Map-reduce partition columns: _col10 (type: int)
   Statistics: Num rows: 18344 Data size: 95229140992 
 Basic stats: COMPLETE Column stats: NONE
   value expressions: _col0 (type: int), _col3 (type: int), 
 _col11 (type: int), _col14 (type: string)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7232) VectorReduceSink is emitting incorrect JOIN keys

2014-06-26 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045308#comment-14045308
 ] 

Gopal V commented on HIVE-7232:
---

Committed to trunk, thanks!

 VectorReduceSink is emitting incorrect JOIN keys
 

 Key: HIVE-7232
 URL: https://issues.apache.org/jira/browse/HIVE-7232
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.14.0
Reporter: Gopal V
Assignee: Gopal V
 Attachments: HIVE-7232-extra-logging.patch, HIVE-7232.1.patch.txt, 
 HIVE-7232.2.patch.txt, q5.explain.txt, q5.sql


 After HIVE-7121, tpc-h query5 has resulted in incorrect results.
 Thanks to [~navis], it has been tracked down to the auto-parallel settings 
 which were initialized for ReduceSinkOperator, but not for 
 VectorReduceSinkOperator. The vector version inherits, but doesn't call 
 super.initializeOp() or set up the variable correctly from ReduceSinkDesc.
 The query is tpc-h query5, with extra NULL checks just to be sure.
 {code}
 ELECT n_name,
sum(l_extendedprice * (1 - l_discount)) AS revenue
 FROM customer,
  orders,
  lineitem,
  supplier,
  nation,
  region
 WHERE c_custkey = o_custkey
   AND l_orderkey = o_orderkey
   AND l_suppkey = s_suppkey
   AND c_nationkey = s_nationkey
   AND s_nationkey = n_nationkey
   AND n_regionkey = r_regionkey
   AND r_name = 'ASIA'
   AND o_orderdate = '1994-01-01'
   AND o_orderdate  '1995-01-01'
   and l_orderkey is not null
   and c_custkey is not null
   and l_suppkey is not null
   and c_nationkey is not null
   and s_nationkey is not null
   and n_regionkey is not null
 GROUP BY n_name
 ORDER BY revenue DESC;
 {code}
 The reducer which has the issue has the following plan
 {code}
 Reducer 3
 Reduce Operator Tree:
   Join Operator
 condition map:
  Inner Join 0 to 1
 condition expressions:
   0 {KEY.reducesinkkey0} {VALUE._col2}
   1 {VALUE._col0} {KEY.reducesinkkey0} {VALUE._col3}
 outputColumnNames: _col0, _col3, _col10, _col11, _col14
 Statistics: Num rows: 18344 Data size: 95229140992 Basic 
 stats: COMPLETE Column stats: NONE
 Reduce Output Operator
   key expressions: _col10 (type: int)
   sort order: +
   Map-reduce partition columns: _col10 (type: int)
   Statistics: Num rows: 18344 Data size: 95229140992 
 Basic stats: COMPLETE Column stats: NONE
   value expressions: _col0 (type: int), _col3 (type: int), 
 _col11 (type: int), _col14 (type: string)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-5976) Decouple input formats from STORED as keywords

2014-06-26 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045339#comment-14045339
 ] 

Brock Noland commented on HIVE-5976:


I'd be delighted! 

 Decouple input formats from STORED as keywords
 --

 Key: HIVE-5976
 URL: https://issues.apache.org/jira/browse/HIVE-5976
 Project: Hive
  Issue Type: Task
Reporter: Brock Noland
Assignee: Brock Noland
 Attachments: HIVE-5976.2.patch, HIVE-5976.patch, HIVE-5976.patch, 
 HIVE-5976.patch, HIVE-5976.patch


 As noted in HIVE-5783, we hard code the input formats mapped to keywords. 
 It'd be nice if there was a registration system so we didn't need to do that.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (HIVE-5570) Handle virtual columns and schema evolution in vector code path

2014-06-26 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline reassigned HIVE-5570:
--

Assignee: Matt McCline  (was: Jitendra Nath Pandey)

 Handle virtual columns and schema evolution in vector code path
 ---

 Key: HIVE-5570
 URL: https://issues.apache.org/jira/browse/HIVE-5570
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Matt McCline





--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-5570) Handle virtual columns and schema evolution in vector code path

2014-06-26 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045351#comment-14045351
 ] 

Matt McCline commented on HIVE-5570:



Lack of virtual column support in vector code also shows up in the PTF queries, 
too.

Change the input table in ptf.q to be an ORC format (which enables 
vectorization in first Map phase) and fix Vectorizer.java to allow virtual 
columns in MapWork, then the queries fail because 2 virtual columns are not 
handled in VectorMapOperator.

 Handle virtual columns and schema evolution in vector code path
 ---

 Key: HIVE-5570
 URL: https://issues.apache.org/jira/browse/HIVE-5570
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Matt McCline





--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6470) Indent of explain result is not consistent

2014-06-26 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-6470:


Status: Patch Available  (was: Open)

 Indent of explain result is not consistent
 --

 Key: HIVE-6470
 URL: https://issues.apache.org/jira/browse/HIVE-6470
 Project: Hive
  Issue Type: Improvement
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-6470.1.patch.txt, HIVE-6470.2.patch.txt


 NO PRECOMMIT TESTS
 You can see it any explain result. For example in auto_join0.q
 {noformat}
 Map Reduce
   Map Operator Tree:
   TableScan
 {noformat}
 or
 {noformat}
 Map Join Operator
   condition map:
Inner Join 0 to 1
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6470) Indent of explain result is not consistent

2014-06-26 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-6470:


Attachment: HIVE-6470.2.patch.txt

 Indent of explain result is not consistent
 --

 Key: HIVE-6470
 URL: https://issues.apache.org/jira/browse/HIVE-6470
 Project: Hive
  Issue Type: Improvement
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-6470.1.patch.txt, HIVE-6470.2.patch.txt


 NO PRECOMMIT TESTS
 You can see it any explain result. For example in auto_join0.q
 {noformat}
 Map Reduce
   Map Operator Tree:
   TableScan
 {noformat}
 or
 {noformat}
 Map Join Operator
   condition map:
Inner Join 0 to 1
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6470) Indent of explain result is not consistent

2014-06-26 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-6470:


Attachment: (was: HIVE-6470.2.patch.txt)

 Indent of explain result is not consistent
 --

 Key: HIVE-6470
 URL: https://issues.apache.org/jira/browse/HIVE-6470
 Project: Hive
  Issue Type: Improvement
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-6470.1.patch.txt


 NO PRECOMMIT TESTS
 You can see it any explain result. For example in auto_join0.q
 {noformat}
 Map Reduce
   Map Operator Tree:
   TableScan
 {noformat}
 or
 {noformat}
 Map Join Operator
   condition map:
Inner Join 0 to 1
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6470) Indent of explain result is not consistent

2014-06-26 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-6470:


Status: Open  (was: Patch Available)

 Indent of explain result is not consistent
 --

 Key: HIVE-6470
 URL: https://issues.apache.org/jira/browse/HIVE-6470
 Project: Hive
  Issue Type: Improvement
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-6470.1.patch.txt


 NO PRECOMMIT TESTS
 You can see it any explain result. For example in auto_join0.q
 {noformat}
 Map Reduce
   Map Operator Tree:
   TableScan
 {noformat}
 or
 {noformat}
 Map Join Operator
   condition map:
Inner Join 0 to 1
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7211) Throws exception if the name of conf var starts with hive. does not exists in HiveConf

2014-06-26 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045362#comment-14045362
 ] 

Navis commented on HIVE-7211:
-

Yes, I regret that. Should make a follow-up issue.

 Throws exception if the name of conf var starts with hive. does not exists 
 in HiveConf
 

 Key: HIVE-7211
 URL: https://issues.apache.org/jira/browse/HIVE-7211
 Project: Hive
  Issue Type: Improvement
  Components: Configuration
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Fix For: 0.14.0

 Attachments: HIVE-7211.1.patch.txt, HIVE-7211.2.patch.txt, 
 HIVE-7211.3.patch.txt, HIVE-7211.4.patch.txt


 Some typos in configurations are very hard to find.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7288) Enable support for -libjars and -archives in WebHcat for Streaming MapReduce jobs

2014-06-26 Thread shanyu zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045365#comment-14045365
 ] 

shanyu zhao commented on HIVE-7288:
---

From reading the description and look at hadoop streaming document, I think we 
need to add the following parameters to mapreduce/streaming endpoint:
{code}
libjars
archives
inputformat
outputformat
{code}

FYI this is the current list of parameters for streaming endpoint:
{code}
@FormParam(output) String output,
@FormParam(mapper) String mapper,
@FormParam(reducer) String reducer,
@FormParam(combiner) String combiner,
@FormParam(file) ListString fileList,
@FormParam(files) String files,
@FormParam(define) ListString defines,
@FormParam(cmdenv) ListString cmdenvs,
@FormParam(arg) ListString args,
@FormParam(statusdir) String statusdir,
@FormParam(callback) String callback,
@FormParam(enablelog) boolean enablelog
{code}

 Enable support for -libjars and -archives in WebHcat for Streaming MapReduce 
 jobs
 -

 Key: HIVE-7288
 URL: https://issues.apache.org/jira/browse/HIVE-7288
 Project: Hive
  Issue Type: New Feature
  Components: WebHCat
Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.13.1
 Environment: HDInsight deploying HDP 2.1;  Also HDP 2.1 on Windows 
Reporter: Azim Uddin
Assignee: shanyu zhao

 Issue:
 ==
 Due to lack of parameters (or support for) equivalent of '-libjars' and 
 '-archives' in WebHcat REST API, we cannot use an external Java Jars or 
 Archive files with a Streaming MapReduce job, when the job is submitted via 
 WebHcat/templeton. 
 I am citing a few use cases here, but there can be plenty of scenarios like 
 this-
 #1 
 (for -archives):In order to use R with a hadoop distribution like HDInsight 
 or HDP on Windows, we could package the R directory up in a zip file and 
 rename it to r.jar and put it into HDFS or WASB. We can then do 
 something like this from hadoop command line (ignore the wasb syntax, same 
 command can be run with hdfs) - 
 hadoop jar %HADOOP_HOME%\lib\hadoop-streaming.jar -archives 
 wasb:///example/jars/r.jar -files 
 wasb:///example/apps/mapper.r,wasb:///example/apps/reducer.r -mapper 
 ./r.jar/bin/Rscript.exe mapper.r -reducer ./r.jar/bin/Rscript.exe 
 reducer.r -input /example/data/gutenberg -output /probe/r/wordcount
 This works from hadoop command line, but due to lack of support for 
 '-archives' parameter in WebHcat, we can't submit the same Streaming MR job 
 via WebHcat.
 #2 (for -libjars):
 Consider a scenario where a user would like to use a custom inputFormat with 
 a Streaming MapReduce job and wrote his own custom InputFormat JAR. From a 
 hadoop command line we can do something like this - 
 hadoop jar /path/to/hadoop-streaming.jar \
 -libjars /path/to/custom-formats.jar \
 -D map.output.key.field.separator=, \
 -D mapred.text.key.partitioner.options=-k1,1 \
 -input my_data/ \
 -output my_output/ \
 -outputformat test.example.outputformat.DateFieldMultipleOutputFormat 
 \
 -mapper my_mapper.py \
 -reducer my_reducer.py \
 But due to lack of support for '-libjars' parameter for streaming MapReduce 
 job in WebHcat, we can't submit the above streaming MR job (that uses a 
 custom Java JAR) via WebHcat.
 Impact:
 
 We think, being able to submit jobs remotely is a vital feature for hadoop to 
 be enterprise-ready and WebHcat plays an important role there. Streaming 
 MapReduce job is also very important for interoperability. So, it would be 
 very useful to keep WebHcat on par with hadoop command line in terms of 
 streaming MR job submission capability.
 Ask:
 
 Enable parameter support for 'libjars' and 'archives' in WebHcat for Hadoop 
 streaming jobs in WebHcat.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 22772: HIVE-6637: UDF in_file() doesn't take CHAR or VARCHAR as input

2014-06-26 Thread Ashish Singh


 On June 26, 2014, 6:10 p.m., Xuefu Zhang wrote:
  data/files/in_file.dat, line 1
  https://reviews.apache.org/r/22772/diff/6/?file=618531#file618531line1
 
  Could we reuse existing file instead of creating new ones?

I could not find an existing data file with the data I need. test2.dat has the 
data but then to use it I will have to have complex DDL and load statements 
leading to multiple MR stages. Let me know if I am missing something here.


- Ashish


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22772/#review46768
---


On June 26, 2014, 5:12 p.m., Ashish Singh wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/22772/
 ---
 
 (Updated June 26, 2014, 5:12 p.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-6637
 https://issues.apache.org/jira/browse/HIVE-6637
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 HIVE-6637: UDF in_file() doesn't take CHAR or VARCHAR as input
 
 
 Diffs
 -
 
   data/files/in_file.dat PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFInFile.java 
 ea52537d0b85191f0b633a29aa3f7ddb556c288d 
   ql/src/test/queries/clientpositive/udf_in_file.q 
 9d9efe8e23d6e73429ee5cd2c8470359ba2b3498 
   ql/src/test/results/clientpositive/udf_in_file.q.out 
 b63143760d80f3f6a8ba0a23c0d87e8bb86fce66 
 
 Diff: https://reviews.apache.org/r/22772/diff/
 
 
 Testing
 ---
 
 Tested with qtest.
 
 
 Thanks,
 
 Ashish Singh
 




[jira] [Commented] (HIVE-5570) Handle virtual columns and schema evolution in vector code path

2014-06-26 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045381#comment-14045381
 ] 

Matt McCline commented on HIVE-5570:



Virtual columns in the Partition (i.e. Map operator) for PTF:
  BLOCKOFFSET (BLOCK__OFFSET__INSIDE__FILE)
  FILENAME  (INPUT__FILE__NAME)

First PTF query from ptf.q is:
select 
  p_mfgr, 
  p_name, 
  p_size,
  rank() 
over (partition by p_mfgr order by p_name) as r,
  dense_rank() 
over (partition by p_mfgr order by p_name) as dr,
  sum(p_retailprice) 
over (partition by p_mfgr order by p_name rows between unbounded preceding 
and current row) as s1
from noop(on partorc 
  partition by p_mfgr
  order by p_name
  );


 Handle virtual columns and schema evolution in vector code path
 ---

 Key: HIVE-5570
 URL: https://issues.apache.org/jira/browse/HIVE-5570
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Matt McCline







[jira] [Commented] (HIVE-7159) For inner joins push a 'is not null predicate' to the join sources for every non nullSafe join condition

2014-06-26 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045388#comment-14045388
 ] 

Navis commented on HIVE-7159:
-

Reviewing this, I found some predicates are removed (replaced?) from the 
plan. For example, in auto_join29.q, it was like this:
{noformat}
predicate: ((key  10) and (key  10)) (type: boolean)
{noformat}
But now it became,
{noformat}
predicate: ((key  10) and key is not null) (type: boolean)
{noformat}
Similar things can be found in auto_join12.q, auto_join14.q, auto_join16.q, 
auto_join27.q, tez/filter_join_breaktask.q, etc. And the plan of 
auto_join_without_localtask.q is changed a lot.

I'm not sure whether it's intended or not. [~rhbutani], could you check this?

 For inner joins push a 'is not null predicate' to the join sources for every 
 non nullSafe join condition
 

 Key: HIVE-7159
 URL: https://issues.apache.org/jira/browse/HIVE-7159
 Project: Hive
  Issue Type: Bug
Reporter: Harish Butani
Assignee: Harish Butani
 Fix For: 0.14.0

 Attachments: HIVE-7159.1.patch, HIVE-7159.10.patch, 
 HIVE-7159.11.patch, HIVE-7159.2.patch, HIVE-7159.3.patch, HIVE-7159.4.patch, 
 HIVE-7159.5.patch, HIVE-7159.6.patch, HIVE-7159.7.patch, HIVE-7159.8.patch, 
 HIVE-7159.9.patch, HIVE-7159.addendum.patch


 A join B on A.x = B.y
 can be transformed to
 (A where x is not null) join (B where y is not null) on A.x = B.y
 Apart from avoiding shuffling null keyed rows it also avoids issues with 
 reduce-side skew when there are a lot of null values in the data.
 Thanks to [~gopalv] for the analysis and coming up with the solution.
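The rewrite described above can be sketched in plain Python (a hypothetical toy model with row dicts, not Hive operators): pushing an 'is not null' filter to each join source never changes an inner join's result, because a null key can never satisfy a non-null-safe equality.

```python
def inner_join(left, right, lkey, rkey):
    # Naive nested-loop inner join on equality. None never matches anything,
    # mirroring SQL semantics where NULL = NULL does not evaluate to true.
    return [(l, r) for l in left for r in right
            if l[lkey] is not None and l[lkey] == r[rkey]]

def join_with_not_null_pushdown(left, right, lkey, rkey):
    # Transformed plan: filter null keys at each source before joining.
    left = [row for row in left if row[lkey] is not None]
    right = [row for row in right if row[rkey] is not None]
    return inner_join(left, right, lkey, rkey)

A = [{"x": 1}, {"x": None}, {"x": 2}]
B = [{"y": 1}, {"y": None}]
# Both plans produce the same rows; the transformed one shuffles fewer rows.
assert inner_join(A, B, "x", "y") == join_with_not_null_pushdown(A, B, "x", "y")
```

Besides correctness, dropping null-keyed rows early means they are never shuffled to a single reducer, which is the skew problem the description mentions.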





[jira] [Commented] (HIVE-7232) VectorReduceSink is emitting incorrect JOIN keys

2014-06-26 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045391#comment-14045391
 ] 

Navis commented on HIVE-7232:
-

The result of tez_join_hash.q seems not to have been updated after HIVE-7159. I'll do that.

 VectorReduceSink is emitting incorrect JOIN keys
 

 Key: HIVE-7232
 URL: https://issues.apache.org/jira/browse/HIVE-7232
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.14.0
Reporter: Gopal V
Assignee: Gopal V
 Attachments: HIVE-7232-extra-logging.patch, HIVE-7232.1.patch.txt, 
 HIVE-7232.2.patch.txt, q5.explain.txt, q5.sql


 After HIVE-7121, tpc-h query5 has resulted in incorrect results.
 Thanks to [~navis], it has been tracked down to the auto-parallel settings 
 which were initialized for ReduceSinkOperator, but not for 
 VectorReduceSinkOperator. The vector version inherits, but doesn't call 
 super.initializeOp() or set up the variable correctly from ReduceSinkDesc.
 The query is tpc-h query5, with extra NULL checks just to be sure.
 {code}
 SELECT n_name,
sum(l_extendedprice * (1 - l_discount)) AS revenue
 FROM customer,
  orders,
  lineitem,
  supplier,
  nation,
  region
 WHERE c_custkey = o_custkey
   AND l_orderkey = o_orderkey
   AND l_suppkey = s_suppkey
   AND c_nationkey = s_nationkey
   AND s_nationkey = n_nationkey
   AND n_regionkey = r_regionkey
   AND r_name = 'ASIA'
   AND o_orderdate >= '1994-01-01'
   AND o_orderdate < '1995-01-01'
   and l_orderkey is not null
   and c_custkey is not null
   and l_suppkey is not null
   and c_nationkey is not null
   and s_nationkey is not null
   and n_regionkey is not null
 GROUP BY n_name
 ORDER BY revenue DESC;
 {code}
 The reducer which has the issue has the following plan
 {code}
 Reducer 3
 Reduce Operator Tree:
   Join Operator
 condition map:
  Inner Join 0 to 1
 condition expressions:
   0 {KEY.reducesinkkey0} {VALUE._col2}
   1 {VALUE._col0} {KEY.reducesinkkey0} {VALUE._col3}
 outputColumnNames: _col0, _col3, _col10, _col11, _col14
 Statistics: Num rows: 18344 Data size: 95229140992 Basic 
 stats: COMPLETE Column stats: NONE
 Reduce Output Operator
   key expressions: _col10 (type: int)
   sort order: +
   Map-reduce partition columns: _col10 (type: int)
   Statistics: Num rows: 18344 Data size: 95229140992 
 Basic stats: COMPLETE Column stats: NONE
   value expressions: _col0 (type: int), _col3 (type: int), 
 _col11 (type: int), _col14 (type: string)
 {code}





[jira] [Updated] (HIVE-6470) Indent of explain result is not consistent

2014-06-26 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-6470:


Status: Patch Available  (was: Open)

 Indent of explain result is not consistent
 --

 Key: HIVE-6470
 URL: https://issues.apache.org/jira/browse/HIVE-6470
 Project: Hive
  Issue Type: Improvement
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-6470.1.patch.txt, HIVE-6470.2.patch.txt


 NO PRECOMMIT TESTS
 You can see it any explain result. For example in auto_join0.q
 {noformat}
 Map Reduce
   Map Operator Tree:
   TableScan
 {noformat}
 or
 {noformat}
 Map Join Operator
   condition map:
Inner Join 0 to 1
 {noformat}





[jira] [Updated] (HIVE-7301) Restore constants moved to HiveConf by HIVE-7211

2014-06-26 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7301:


Description: 
NO PRECOMMIT TESTS

HIVE-7211 moved RCFile-related constants to HiveConf. For backward 
compatibility, restore them as they were.

  was:HIVE-7211 moved RCFile related constants to HiveConf. But for the 
backward compatibility, restore those as was.


 Restore constants moved to HiveConf by HIVE-7211
 

 Key: HIVE-7301
 URL: https://issues.apache.org/jira/browse/HIVE-7301
 Project: Hive
  Issue Type: Task
  Components: Configuration
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-7301.1.patch.txt


 NO PRECOMMIT TESTS
 HIVE-7211 moved RCFile-related constants to HiveConf. For backward 
 compatibility, restore them as they were.





[jira] [Updated] (HIVE-7301) Restore constants moved to HiveConf by HIVE-7211

2014-06-26 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7301:


Status: Patch Available  (was: Open)

 Restore constants moved to HiveConf by HIVE-7211
 

 Key: HIVE-7301
 URL: https://issues.apache.org/jira/browse/HIVE-7301
 Project: Hive
  Issue Type: Task
  Components: Configuration
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-7301.1.patch.txt


 NO PRECOMMIT TESTS
 HIVE-7211 moved RCFile-related constants to HiveConf. For backward 
 compatibility, restore them as they were.





[jira] [Updated] (HIVE-7301) Restore constants moved to HiveConf by HIVE-7211

2014-06-26 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7301:


Attachment: HIVE-7301.1.patch.txt

 Restore constants moved to HiveConf by HIVE-7211
 

 Key: HIVE-7301
 URL: https://issues.apache.org/jira/browse/HIVE-7301
 Project: Hive
  Issue Type: Task
  Components: Configuration
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-7301.1.patch.txt


 HIVE-7211 moved RCFile-related constants to HiveConf. For backward 
 compatibility, restore them as they were.





[jira] [Commented] (HIVE-7159) For inner joins push a 'is not null predicate' to the join sources for every non nullSafe join condition

2014-06-26 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045411#comment-14045411
 ] 

Harish Butani commented on HIVE-7159:
-

Yes [~navis], you are right. Good catch. Sorry I missed this; the diff was 
huge, and this one unfortunately slipped through.
The reason for the regression is that PredicateTransitivePropagate looks at the 
FilterOperator below the ReduceSink. 
SemanticAnalyzer::genNotNullFilterForJoinSourcePlan was stacking another 
FilterOp for the not-null check, so only that predicate was being applied 
transitively by PredicateTransitivePropagate. The fix is to add the following 
in SemanticAnalyzer at line 2465:
{code}
if (input instanceof FilterOperator) {
  FilterOperator f = (FilterOperator) input;
  List<ExprNodeDesc> preds = new ArrayList<ExprNodeDesc>();
  preds.add(f.getConf().getPredicate());
  preds.add(filterPred);
  f.getConf().setPredicate(ExprNodeDescUtils.mergePredicates(preds));

  return input;
}
{code}

Tested auto_join29.q with this change, predicate now contains 'key  10'

Will file a jira and upload a patch tomorrow.
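The shape of that fix can be illustrated with a small Python sketch (hypothetical classes and predicate strings, not Hive's actual operator API): merging the new predicate into an existing filter, rather than stacking a second filter on top, lets a pass that only inspects the topmost filter see the full conjunction.

```python
class Filter:
    """Toy stand-in for FilterOperator; predicates are implicitly AND-ed."""
    def __init__(self, predicates):
        self.predicates = list(predicates)

def add_not_null_filter(op, not_null_pred):
    # If the source already ends in a Filter, merge into it instead of
    # stacking a new Filter on top (the regression described above).
    if isinstance(op, Filter):
        op.predicates.append(not_null_pred)
        return op
    return Filter([not_null_pred])

f = add_not_null_filter(Filter(["key > 10"]), "key is not null")
# A pass that reads only the top filter now sees both predicates.
assert f.predicates == ["key > 10", "key is not null"]
```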

 For inner joins push a 'is not null predicate' to the join sources for every 
 non nullSafe join condition
 

 Key: HIVE-7159
 URL: https://issues.apache.org/jira/browse/HIVE-7159
 Project: Hive
  Issue Type: Bug
Reporter: Harish Butani
Assignee: Harish Butani
 Fix For: 0.14.0

 Attachments: HIVE-7159.1.patch, HIVE-7159.10.patch, 
 HIVE-7159.11.patch, HIVE-7159.2.patch, HIVE-7159.3.patch, HIVE-7159.4.patch, 
 HIVE-7159.5.patch, HIVE-7159.6.patch, HIVE-7159.7.patch, HIVE-7159.8.patch, 
 HIVE-7159.9.patch, HIVE-7159.addendum.patch


 A join B on A.x = B.y
 can be transformed to
 (A where x is not null) join (B where y is not null) on A.x = B.y
 Apart from avoiding shuffling null keyed rows it also avoids issues with 
 reduce-side skew when there are a lot of null values in the data.
 Thanks to [~gopalv] for the analysis and coming up with the solution.





[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)

2014-06-26 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045423#comment-14045423
 ] 

Navis commented on HIVE-7220:
-

Looks good. As a last comment, could isValidSplit() and getDirIndices() be 
merged into a single method? It could reduce FS accesses/calls.

 Empty dir in external table causes issue (root_dir_external_table.q failure)
 

 Key: HIVE-7220
 URL: https://issues.apache.org/jira/browse/HIVE-7220
 Project: Hive
  Issue Type: Bug
Reporter: Szehon Ho
Assignee: Szehon Ho
 Attachments: HIVE-7220.2.patch, HIVE-7220.3.patch, HIVE-7220.4.patch, 
 HIVE-7220.patch


 While looking at root_dir_external_table.q failure, which is doing a query on 
 an external table located at root ('/'), I noticed that latest Hadoop2 
 CombineFileInputFormat returns split representing empty directories (like 
 '/Users'), which leads to failure in Hive's CombineFileRecordReader as it 
 tries to open the directory for processing.
 Tried with an external table in a normal HDFS directory, and it also returns 
 the same error.  Looks like a real bug.





[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)

2014-06-26 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045424#comment-14045424
 ] 

Gopal V commented on HIVE-7220:
---

Is the de-dup of locations only to work around the FileInputSplit in-mem + 
disk location changes (i.e., 3 in-mem locations and 3 on-disk locations 
having duplication)?

 Empty dir in external table causes issue (root_dir_external_table.q failure)
 

 Key: HIVE-7220
 URL: https://issues.apache.org/jira/browse/HIVE-7220
 Project: Hive
  Issue Type: Bug
Reporter: Szehon Ho
Assignee: Szehon Ho
 Attachments: HIVE-7220.2.patch, HIVE-7220.3.patch, HIVE-7220.4.patch, 
 HIVE-7220.patch


 While looking at root_dir_external_table.q failure, which is doing a query on 
 an external table located at root ('/'), I noticed that latest Hadoop2 
 CombineFileInputFormat returns split representing empty directories (like 
 '/Users'), which leads to failure in Hive's CombineFileRecordReader as it 
 tries to open the directory for processing.
 Tried with an external table in a normal HDFS directory, and it also returns 
 the same error.  Looks like a real bug.





Re: Review Request 23006: Escape control characters for explain result

2014-06-26 Thread Navis Ryu


 On June 26, 2014, 6:51 p.m., Xuefu Zhang wrote:
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExplainTask.java, line 385
  https://reviews.apache.org/r/23006/diff/1/?file=618088#file618088line385
 
  what difference does this change make?

printf("%s ", X) == print(X) + print(" "). Is it wrong? (I'm asking, really)


 On June 26, 2014, 6:51 p.m., Xuefu Zhang wrote:
  ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java, line 195
  https://reviews.apache.org/r/23006/diff/1/?file=618089#file618089line195
 
  For my understanding, why can't we just simply replace 0x00 with a 
  different character such as ' '? Why are we dealing with quotes and commas? 
  Can you give an example of what's transformed to what?

For comments containing spaces, just replacing 0x00 with a space would be 
confusing, IMHO. 

A value like "comment 1"0x00"comment 2"0x00 will be printed as 'comment 
1','comment 2',''. For 0x00 followed by 0x00, null will be returned (nothing 
in comments).
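A minimal Python sketch of the behavior described above (semantics assumed from this comment, not Hive's actual code): split on the 0x00 delimiter, quote each piece, and return null when every comment is empty.

```python
def escape_comments(raw):
    # Comments arrive as a single 0x00-delimited string; a trailing 0x00
    # yields a final empty piece, which is kept and rendered as ''.
    parts = raw.split("\x00")
    if all(p == "" for p in parts):
        return None  # nothing in comments
    return ",".join("'%s'" % p for p in parts)

assert escape_comments("comment 1\x00comment 2\x00") == "'comment 1','comment 2',''"
assert escape_comments("\x00\x00") is None
```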


- Navis


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23006/#review46773
---


On June 26, 2014, 9:05 a.m., Navis Ryu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/23006/
 ---
 
 (Updated June 26, 2014, 9:05 a.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-7024
 https://issues.apache.org/jira/browse/HIVE-7024
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Comments for columns are now delimited by 0x00, which is binary and makes 
 git refuse to produce a proper diff file.
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/exec/ExplainTask.java 92545d8 
   ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java 1149bda 
   ql/src/test/results/clientpositive/alter_partition_coltype.q.out e86cc06 
   ql/src/test/results/clientpositive/annotate_stats_filter.q.out c7d58f6 
   ql/src/test/results/clientpositive/annotate_stats_groupby.q.out 6f72964 
   ql/src/test/results/clientpositive/annotate_stats_join.q.out cc816c8 
   ql/src/test/results/clientpositive/annotate_stats_part.q.out a0b4602 
   ql/src/test/results/clientpositive/annotate_stats_select.q.out 97e9473 
   ql/src/test/results/clientpositive/annotate_stats_table.q.out bb2d18c 
   ql/src/test/results/clientpositive/annotate_stats_union.q.out 6d179b6 
   ql/src/test/results/clientpositive/auto_join_reordering_values.q.out 
 3f4f902 
   ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out 72640df 
   ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out c660cd0 
   ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out 4abda32 
   ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out 52a3194 
   ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out d807791 
   ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out 35e0a30 
   ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out af3d9d6 
   ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out 05ef5d8 
   ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out e423d14 
   ql/src/test/results/clientpositive/binary_output_format.q.out 294aabb 
   ql/src/test/results/clientpositive/bucket1.q.out f3eb15c 
   ql/src/test/results/clientpositive/bucket2.q.out 9a22160 
   ql/src/test/results/clientpositive/bucket3.q.out 8fa9c7b 
   ql/src/test/results/clientpositive/bucket4.q.out 032272b 
   ql/src/test/results/clientpositive/bucket5.q.out d19fbe5 
   ql/src/test/results/clientpositive/bucket_map_join_1.q.out 8674a6c 
   ql/src/test/results/clientpositive/bucket_map_join_2.q.out 8a5984d 
   ql/src/test/results/clientpositive/bucketcontext_1.q.out 1513515 
   ql/src/test/results/clientpositive/bucketcontext_2.q.out d18a9be 
   ql/src/test/results/clientpositive/bucketcontext_3.q.out e12c155 
   ql/src/test/results/clientpositive/bucketcontext_4.q.out 77b4882 
   ql/src/test/results/clientpositive/bucketcontext_5.q.out fa1cfc5 
   ql/src/test/results/clientpositive/bucketcontext_6.q.out aac66f8 
   ql/src/test/results/clientpositive/bucketcontext_7.q.out 78c4f94 
   ql/src/test/results/clientpositive/bucketcontext_8.q.out ad7fec9 
   ql/src/test/results/clientpositive/bucketmapjoin1.q.out 10f1af4 
   ql/src/test/results/clientpositive/bucketmapjoin10.q.out 88ecf40 
   ql/src/test/results/clientpositive/bucketmapjoin11.q.out 4ee1fa0 
   ql/src/test/results/clientpositive/bucketmapjoin12.q.out 9253f4a 
   ql/src/test/results/clientpositive/bucketmapjoin13.q.out b380fab 
   ql/src/test/results/clientpositive/bucketmapjoin2.q.out 297412f 
   ql/src/test/results/clientpositive/bucketmapjoin3.q.out 7f307a0 
   ql/src/test/results/clientpositive/bucketmapjoin4.q.out f0f9aee 
   ql/src/test/results/clientpositive/bucketmapjoin5.q.out 79e1c3d 
   

[jira] [Updated] (HIVE-7298) desc database extended does not show properties of the database

2014-06-26 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7298:


Attachment: HIVE-7298.2.patch.txt

 desc database extended does not show properties of the database
 ---

 Key: HIVE-7298
 URL: https://issues.apache.org/jira/browse/HIVE-7298
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-7298.1.patch.txt, HIVE-7298.2.patch.txt


 HIVE-6386 added owner information to desc, but not updated schema of it.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7127) Handover more details on exception in hiveserver2

2014-06-26 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7127:


Description: 
Currently, JDBC hands over exception message and error codes. But it's not 
helpful for debugging.
{noformat}
org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: 
FAILED: ParseException line 1:0 cannot recognize input near 'createa' 'asd' 
'EOF'
at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:121)
at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:109)
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:231)
at org.apache.hive.beeline.Commands.execute(Commands.java:736)
at org.apache.hive.beeline.Commands.sql(Commands.java:657)
at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:889)
at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:744)
at 
org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:459)
at org.apache.hive.beeline.BeeLine.main(BeeLine.java:442)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
{noformat}

With this patch, JDBC client can get more details on hiveserver2. 

{noformat}
Caused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling 
statement: FAILED: ParseException line 1:0 cannot recognize input near 
'createa' 'asd' 'EOF'
at org.apache.hive.service.cli.operation.SQLOperation.prepare(Unknown 
Source)
at org.apache.hive.service.cli.operation.SQLOperation.run(Unknown 
Source)
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(Unknown
 Source)
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(Unknown
 Source)
at org.apache.hive.service.cli.CLIService.executeStatementAsync(Unknown 
Source)
at 
org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(Unknown 
Source)
at 
org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(Unknown
 Source)
at 
org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(Unknown
 Source)
at org.apache.thrift.ProcessFunction.process(Unknown Source)
at org.apache.thrift.TBaseProcessor.process(Unknown Source)
at org.apache.hive.service.auth.TSetIpAddressProcessor.process(Unknown 
Source)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(Unknown 
Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
{noformat}


  was:
NO PRECOMMIT TESTS

Currently, JDBC hands over exception message and error codes. But it's not 
helpful for debugging.
{noformat}
org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: 
FAILED: ParseException line 1:0 cannot recognize input near 'createa' 'asd' 
'EOF'
at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:121)
at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:109)
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:231)
at org.apache.hive.beeline.Commands.execute(Commands.java:736)
at org.apache.hive.beeline.Commands.sql(Commands.java:657)
at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:889)
at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:744)
at 
org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:459)
at org.apache.hive.beeline.BeeLine.main(BeeLine.java:442)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
{noformat}

With this patch, JDBC client can get more details on hiveserver2. 

{noformat}
Caused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling 
statement: FAILED: ParseException line 1:0 cannot recognize input near 
'createa' 'asd' 'EOF'
at org.apache.hive.service.cli.operation.SQLOperation.prepare(Unknown 
Source)
at org.apache.hive.service.cli.operation.SQLOperation.run(Unknown 
Source)
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(Unknown
 Source)
at 

[jira] [Commented] (HIVE-3628) Provide a way to use counters in Hive through UDF

2014-06-26 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045439#comment-14045439
 ] 

Navis commented on HIVE-3628:
-

You can get MapredContext by MapredContext.get(). It will give you JobConf, 
which can be used to access any file or the distributed cache. 

 Provide a way to use counters in Hive through UDF
 -

 Key: HIVE-3628
 URL: https://issues.apache.org/jira/browse/HIVE-3628
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Reporter: Viji
Assignee: Navis
Priority: Minor
 Fix For: 0.11.0

 Attachments: HIVE-3628.D8007.1.patch, HIVE-3628.D8007.2.patch, 
 HIVE-3628.D8007.3.patch, HIVE-3628.D8007.4.patch, HIVE-3628.D8007.5.patch, 
 HIVE-3628.D8007.6.patch


 Currently it is not possible to generate counters through UDF. We should 
 support this. 
 Pig currently allows this.





[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)

2014-06-26 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045440#comment-14045440
 ] 

Szehon Ho commented on HIVE-7220:
-

[~navis] OK, I'll look at that.

[~gopalv] I don't think there is any change to dedup, which was called even 
before this patch. I just moved the method.

 Empty dir in external table causes issue (root_dir_external_table.q failure)
 

 Key: HIVE-7220
 URL: https://issues.apache.org/jira/browse/HIVE-7220
 Project: Hive
  Issue Type: Bug
Reporter: Szehon Ho
Assignee: Szehon Ho
 Attachments: HIVE-7220.2.patch, HIVE-7220.3.patch, HIVE-7220.4.patch, 
 HIVE-7220.patch


 While looking at root_dir_external_table.q failure, which is doing a query on 
 an external table located at root ('/'), I noticed that latest Hadoop2 
 CombineFileInputFormat returns split representing empty directories (like 
 '/Users'), which leads to failure in Hive's CombineFileRecordReader as it 
 tries to open the directory for processing.
 Tried with an external table in a normal HDFS directory, and it also returns 
 the same error.  Looks like a real bug.





[jira] [Updated] (HIVE-7232) VectorReduceSink is emitting incorrect JOIN keys

2014-06-26 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-7232:
--

Fix Version/s: 0.14.0

 VectorReduceSink is emitting incorrect JOIN keys
 

 Key: HIVE-7232
 URL: https://issues.apache.org/jira/browse/HIVE-7232
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.14.0
Reporter: Gopal V
Assignee: Gopal V
 Fix For: 0.14.0

 Attachments: HIVE-7232-extra-logging.patch, HIVE-7232.1.patch.txt, 
 HIVE-7232.2.patch.txt, q5.explain.txt, q5.sql


 After HIVE-7121, tpc-h query5 has resulted in incorrect results.
 Thanks to [~navis], it has been tracked down to the auto-parallel settings 
 which were initialized for ReduceSinkOperator, but not for 
 VectorReduceSinkOperator. The vector version inherits, but doesn't call 
 super.initializeOp() or set up the variable correctly from ReduceSinkDesc.
 The query is tpc-h query5, with extra NULL checks just to be sure.
 {code}
 SELECT n_name,
sum(l_extendedprice * (1 - l_discount)) AS revenue
 FROM customer,
  orders,
  lineitem,
  supplier,
  nation,
  region
 WHERE c_custkey = o_custkey
   AND l_orderkey = o_orderkey
   AND l_suppkey = s_suppkey
   AND c_nationkey = s_nationkey
   AND s_nationkey = n_nationkey
   AND n_regionkey = r_regionkey
   AND r_name = 'ASIA'
   AND o_orderdate >= '1994-01-01'
   AND o_orderdate < '1995-01-01'
   and l_orderkey is not null
   and c_custkey is not null
   and l_suppkey is not null
   and c_nationkey is not null
   and s_nationkey is not null
   and n_regionkey is not null
 GROUP BY n_name
 ORDER BY revenue DESC;
 {code}
 The reducer which has the issue has the following plan
 {code}
 Reducer 3
 Reduce Operator Tree:
   Join Operator
 condition map:
  Inner Join 0 to 1
 condition expressions:
   0 {KEY.reducesinkkey0} {VALUE._col2}
   1 {VALUE._col0} {KEY.reducesinkkey0} {VALUE._col3}
 outputColumnNames: _col0, _col3, _col10, _col11, _col14
 Statistics: Num rows: 18344 Data size: 95229140992 Basic 
 stats: COMPLETE Column stats: NONE
 Reduce Output Operator
   key expressions: _col10 (type: int)
   sort order: +
   Map-reduce partition columns: _col10 (type: int)
   Statistics: Num rows: 18344 Data size: 95229140992 
 Basic stats: COMPLETE Column stats: NONE
   value expressions: _col0 (type: int), _col3 (type: int), 
 _col11 (type: int), _col14 (type: string)
 {code}





[jira] [Updated] (HIVE-7299) Enable metadata only optimization on Tez

2014-06-26 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-7299:
-

Status: Open  (was: Patch Available)

 Enable metadata only optimization on Tez
 

 Key: HIVE-7299
 URL: https://issues.apache.org/jira/browse/HIVE-7299
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-7299.1.patch, HIVE-7299.2.patch


 Enables the metadata only optimization (the one with OneNullRowInputFormat 
 not the query-result-from-stats optimizaton)





[jira] [Updated] (HIVE-7299) Enable metadata only optimization on Tez

2014-06-26 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-7299:
-

Status: Patch Available  (was: Open)

 Enable metadata only optimization on Tez
 

 Key: HIVE-7299
 URL: https://issues.apache.org/jira/browse/HIVE-7299
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-7299.1.patch, HIVE-7299.2.patch


 Enables the metadata only optimization (the one with OneNullRowInputFormat 
 not the query-result-from-stats optimizaton)





[jira] [Updated] (HIVE-7299) Enable metadata only optimization on Tez

2014-06-26 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-7299:
-

Attachment: HIVE-7299.2.patch

 Enable metadata only optimization on Tez
 

 Key: HIVE-7299
 URL: https://issues.apache.org/jira/browse/HIVE-7299
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-7299.1.patch, HIVE-7299.2.patch


 Enables the metadata only optimization (the one with OneNullRowInputFormat, 
 not the query-result-from-stats optimization)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-7302) Allow Auto-reducer parallelism to be turned off by a logical optimizer

2014-06-26 Thread Gopal V (JIRA)
Gopal V created HIVE-7302:
-

 Summary: Allow Auto-reducer parallelism to be turned off by a 
logical optimizer
 Key: HIVE-7302
 URL: https://issues.apache.org/jira/browse/HIVE-7302
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: 0.14.0
Reporter: Gopal V
Assignee: Gopal V


Auto reducer parallelism cannot be used for cases where a custom routing 
VertexManager is used.

Allow a tri-state for ReduceSinkDesc::isAutoParallel to support allow, disable, 
and unset mechanics.

The state machine for this setting will now be 

Allowed transitions:

unset -> allow
unset -> disable
allow -> disable

with no transition case for

disable -> allow.
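The tri-state transitions above can be sketched as a small enum; the names AutoParallelState and canTransitionTo are illustrative only, not the names in the actual patch:

```java
// Sketch of the tri-state proposed for ReduceSinkDesc::isAutoParallel.
// UNSET may move to ALLOW or DISABLE, ALLOW may move to DISABLE,
// and DISABLE is terminal (disable -> allow is rejected).
enum AutoParallelState {
    UNSET, ALLOW, DISABLE;

    boolean canTransitionTo(AutoParallelState next) {
        switch (this) {
            case UNSET: return next == ALLOW || next == DISABLE; // unset -> allow | disable
            case ALLOW: return next == DISABLE;                  // allow -> disable
            default:    return false;                            // disable -> anything is rejected
        }
    }
}
```

The terminal DISABLE state is what lets a logical optimizer (e.g. one installing a custom routing VertexManager) switch auto-parallelism off with confidence that no later pass will turn it back on.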



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-7303) IllegalMonitorStateException when stmtHandle is null in HiveStatement

2014-06-26 Thread Navis (JIRA)
Navis created HIVE-7303:
---

 Summary: IllegalMonitorStateException when stmtHandle is null in 
HiveStatement
 Key: HIVE-7303
 URL: https://issues.apache.org/jira/browse/HIVE-7303
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Reporter: Navis


From http://www.mail-archive.com/dev@hive.apache.org/msg75617.html

Unlock can be called even when it's not locked in some situations.
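A minimal sketch of the failure mode and one possible guard, assuming a ReentrantLock-style lock as used in HiveStatement; the class and method names here are hypothetical, not the actual code:

```java
import java.util.concurrent.locks.ReentrantLock;

// Calling unlock() on a ReentrantLock the current thread does not hold
// throws IllegalMonitorStateException. Guarding with isHeldByCurrentThread()
// keeps lock/unlock paired even when the locking branch was skipped
// (e.g. stmtHandle was null, so lock() never ran).
class GuardedUnlock {
    private final ReentrantLock transportLock = new ReentrantLock();

    void release() {
        if (transportLock.isHeldByCurrentThread()) {
            transportLock.unlock();
        }
    }
}
```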



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: [GitHub] hive pull request: Fix lock/unlock pairing

2014-06-26 Thread Navis류승우
Hive on GitHub is just a mirror of the Apache SVN repository, so pull requests
cannot be handled there.

Could you make a patch and attach it on
https://issues.apache.org/jira/browse/HIVE-7303 ?

Thanks,
Navis


2014-06-26 22:29 GMT+09:00 pavel-sakun g...@git.apache.org:

 GitHub user pavel-sakun opened a pull request:

 https://github.com/apache/hive/pull/17

 Fix lock/unlock pairing

 Prevent IllegalMonitorStateException in case stmtHandle is null

 You can merge this pull request into a Git repository by running:

 $ git pull https://github.com/pavel-sakun/hive
 hive-statement-illegalmonitorstateexception

 Alternatively you can review and apply these changes as the patch at:

 https://github.com/apache/hive/pull/17.patch

 To close this pull request, make a commit to your master/trunk branch
 with (at least) the following in the commit message:

 This closes #17

 
 commit 9468a23bfe76cd5be5c747998ec0c055750db2d3
 Author: Pavel Sakun pavel_sa...@epam.com
 Date:   2014-06-26T13:26:38Z

 Fix lock/unlock pairing

 Prevent IllegalMonitorStateException in case stmtHandle is null

 


 ---
 If your project is set up for it, you can reply to this email and have your
 reply appear on GitHub as well. If your project does not have this feature
 enabled and wishes so, or if the feature is enabled but not working, please
 contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
 with INFRA.
 ---



[jira] [Commented] (HIVE-3628) Provide a way to use counters in Hive through UDF

2014-06-26 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045481#comment-14045481
 ] 

Lefty Leverenz commented on HIVE-3628:
--

Should this be documented in the wiki, or is it already documented somewhere?

The UDF wikidoc doesn't say anything about counters or JobConf:

* [Language Manual -- Operators and UDFs | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF]

 Provide a way to use counters in Hive through UDF
 -

 Key: HIVE-3628
 URL: https://issues.apache.org/jira/browse/HIVE-3628
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Reporter: Viji
Assignee: Navis
Priority: Minor
 Fix For: 0.11.0

 Attachments: HIVE-3628.D8007.1.patch, HIVE-3628.D8007.2.patch, 
 HIVE-3628.D8007.3.patch, HIVE-3628.D8007.4.patch, HIVE-3628.D8007.5.patch, 
 HIVE-3628.D8007.6.patch


 Currently it is not possible to generate counters through UDF. We should 
 support this. 
 Pig currently allows this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7298) desc database extended does not show properties of the database

2014-06-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045489#comment-14045489
 ] 

Hive QA commented on HIVE-7298:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12652739/HIVE-7298.2.patch.txt

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5670 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join20
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/610/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/610/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-610/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12652739

 desc database extended does not show properties of the database
 ---

 Key: HIVE-7298
 URL: https://issues.apache.org/jira/browse/HIVE-7298
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-7298.1.patch.txt, HIVE-7298.2.patch.txt


 HIVE-6386 added owner information to desc, but did not update its schema.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7299) Enable metadata only optimization on Tez

2014-06-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045490#comment-14045490
 ] 

Hive QA commented on HIVE-7299:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12652743/HIVE-7299.2.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/611/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/611/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-611/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-Build-611/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ svn = \s\v\n ]]
+ [[ -n '' ]]
+ [[ -d apache-svn-trunk-source ]]
+ [[ ! -d apache-svn-trunk-source/.svn ]]
+ [[ ! -d apache-svn-trunk-source ]]
+ cd apache-svn-trunk-source
+ svn revert -R .
Reverted 
'hcatalog/core/src/test/java/org/apache/hive/hcatalog/cli/TestSemanticAnalysis.java'
Reverted 'ql/src/test/results/clientpositive/database_location.q.out'
Reverted 'ql/src/test/results/clientpositive/alter_db_owner.q.out'
Reverted 'ql/src/test/results/clientpositive/database_properties.q.out'
Reverted 
'ql/src/test/results/clientpositive/authorization_owner_actions_db.q.out'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/plan/DescDatabaseDesc.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java'
++ awk '{print $2}'
++ egrep -v '^X|^Performing status on external'
++ svn status --no-ignore
+ rm -rf target datanucleus.log ant/target shims/target shims/0.20/target 
shims/0.20S/target shims/0.23/target shims/aggregator/target 
shims/common/target shims/common-secure/target packaging/target 
hbase-handler/target testutils/target jdbc/target metastore/target 
itests/target itests/hcatalog-unit/target itests/test-serde/target 
itests/qtest/target itests/hive-minikdc/target itests/hive-unit/target 
itests/custom-serde/target itests/util/target hcatalog/target 
hcatalog/core/target hcatalog/streaming/target 
hcatalog/server-extensions/target hcatalog/hcatalog-pig-adapter/target 
hcatalog/webhcat/svr/target hcatalog/webhcat/java-client/target hwi/target 
common/target common/src/gen service/target contrib/target serde/target 
beeline/target odbc/target cli/target ql/dependency-reduced-pom.xml ql/target 
ql/src/test/results/clientpositive/describe_database.q.out 
ql/src/test/queries/clientpositive/describe_database.q
+ svn update

Fetching external item into 'hcatalog/src/test/e2e/harness'
External at revision 1605943.

At revision 1605943.
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12652743

 Enable metadata only optimization on Tez
 

 Key: HIVE-7299
 URL: https://issues.apache.org/jira/browse/HIVE-7299
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-7299.1.patch, HIVE-7299.2.patch


 Enables the metadata only optimization (the one with OneNullRowInputFormat, 
 not the query-result-from-stats optimization)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7298) desc database extended does not show properties of the database

2014-06-26 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045498#comment-14045498
 ] 

Navis commented on HIVE-7298:
-

The failure of auto_join20.q is not related to this (Could not find status of 
job: job_local1690960560_0001). Will commit this shortly.

 desc database extended does not show properties of the database
 ---

 Key: HIVE-7298
 URL: https://issues.apache.org/jira/browse/HIVE-7298
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-7298.1.patch.txt, HIVE-7298.2.patch.txt


 HIVE-6386 added owner information to desc, but did not update its schema.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7298) desc database extended does not show properties of the database

2014-06-26 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7298:


   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks Ashutosh, for the review!

 desc database extended does not show properties of the database
 ---

 Key: HIVE-7298
 URL: https://issues.apache.org/jira/browse/HIVE-7298
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Fix For: 0.14.0

 Attachments: HIVE-7298.1.patch.txt, HIVE-7298.2.patch.txt


 HIVE-6386 added owner information to desc, but did not update its schema.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 23006: Escape control characters for explain result

2014-06-26 Thread Xuefu Zhang


 On June 26, 2014, 6:51 p.m., Xuefu Zhang wrote:
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExplainTask.java, line 385
  https://reviews.apache.org/r/23006/diff/1/?file=618088#file618088line385
 
  what difference does this change make?
 
 Navis Ryu wrote:
 printf("%s ", X) == print(X) + print(" "). Is it wrong? (I'm asking, really)

I'm not saying it's wrong. I assume they are equivalent. Since you made the 
change, I thought you'd have some reasoning behind it.


 On June 26, 2014, 6:51 p.m., Xuefu Zhang wrote:
  ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java, line 195
  https://reviews.apache.org/r/23006/diff/1/?file=618089#file618089line195
 
  For my understanding, why can't we just simply replace 0x00 with a 
  different character such as ' '? Why are we dealing with quotes and commas? 
  Can you give an example of what's transformed to what?
 
 Navis Ryu wrote:
 With comments that contain spaces, just replacing 0x00 with a space would be 
 confusing, IMHO. 
 
 comment 1\0comment 2\0 will be printed like 'comment 1','comment 2',''. For 
 \0\0, null will be returned (nothing in the comments).
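The transformation described in the reply above can be sketched as follows; CommentFormat and format are hypothetical names for illustration, not the actual PartitionDesc code:

```java
// 0x00-delimited column comments become a quoted, comma-separated list,
// so the explain output contains no binary bytes; input that carries no
// comment text at all (e.g. "\0\0") yields null.
class CommentFormat {
    static String format(String raw) {
        if (raw.chars().allMatch(c -> c == 0)) {
            return null; // nothing in the comments
        }
        String[] parts = raw.split("\u0000", -1); // -1 keeps trailing empty fields
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) sb.append(',');
            sb.append('\'').append(parts[i]).append('\'');
        }
        return sb.toString();
    }
}
```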

I don't know how 0x00 came into the picture in the first place. It seems 
reasonable to me that a comment should contain nothing but a string.


- Xuefu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23006/#review46773
---


On June 26, 2014, 9:05 a.m., Navis Ryu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/23006/
 ---
 
 (Updated June 26, 2014, 9:05 a.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-7024
 https://issues.apache.org/jira/browse/HIVE-7024
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Comments for columns are now delimited by 0x00, which is binary and makes git 
 refuse to produce a proper diff file.
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/exec/ExplainTask.java 92545d8 
   ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java 1149bda 
   ql/src/test/results/clientpositive/alter_partition_coltype.q.out e86cc06 
   ql/src/test/results/clientpositive/annotate_stats_filter.q.out c7d58f6 
   ql/src/test/results/clientpositive/annotate_stats_groupby.q.out 6f72964 
   ql/src/test/results/clientpositive/annotate_stats_join.q.out cc816c8 
   ql/src/test/results/clientpositive/annotate_stats_part.q.out a0b4602 
   ql/src/test/results/clientpositive/annotate_stats_select.q.out 97e9473 
   ql/src/test/results/clientpositive/annotate_stats_table.q.out bb2d18c 
   ql/src/test/results/clientpositive/annotate_stats_union.q.out 6d179b6 
   ql/src/test/results/clientpositive/auto_join_reordering_values.q.out 
 3f4f902 
   ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out 72640df 
   ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out c660cd0 
   ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out 4abda32 
   ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out 52a3194 
   ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out d807791 
   ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out 35e0a30 
   ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out af3d9d6 
   ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out 05ef5d8 
   ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out e423d14 
   ql/src/test/results/clientpositive/binary_output_format.q.out 294aabb 
   ql/src/test/results/clientpositive/bucket1.q.out f3eb15c 
   ql/src/test/results/clientpositive/bucket2.q.out 9a22160 
   ql/src/test/results/clientpositive/bucket3.q.out 8fa9c7b 
   ql/src/test/results/clientpositive/bucket4.q.out 032272b 
   ql/src/test/results/clientpositive/bucket5.q.out d19fbe5 
   ql/src/test/results/clientpositive/bucket_map_join_1.q.out 8674a6c 
   ql/src/test/results/clientpositive/bucket_map_join_2.q.out 8a5984d 
   ql/src/test/results/clientpositive/bucketcontext_1.q.out 1513515 
   ql/src/test/results/clientpositive/bucketcontext_2.q.out d18a9be 
   ql/src/test/results/clientpositive/bucketcontext_3.q.out e12c155 
   ql/src/test/results/clientpositive/bucketcontext_4.q.out 77b4882 
   ql/src/test/results/clientpositive/bucketcontext_5.q.out fa1cfc5 
   ql/src/test/results/clientpositive/bucketcontext_6.q.out aac66f8 
   ql/src/test/results/clientpositive/bucketcontext_7.q.out 78c4f94 
   ql/src/test/results/clientpositive/bucketcontext_8.q.out ad7fec9 
   ql/src/test/results/clientpositive/bucketmapjoin1.q.out 10f1af4 
   ql/src/test/results/clientpositive/bucketmapjoin10.q.out 88ecf40 
   ql/src/test/results/clientpositive/bucketmapjoin11.q.out 4ee1fa0 
   ql/src/test/results/clientpositive/bucketmapjoin12.q.out 9253f4a 
   

[jira] [Updated] (HIVE-7299) Enable metadata only optimization on Tez

2014-06-26 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-7299:
-

Status: Open  (was: Patch Available)

 Enable metadata only optimization on Tez
 

 Key: HIVE-7299
 URL: https://issues.apache.org/jira/browse/HIVE-7299
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-7299.1.patch, HIVE-7299.2.patch, HIVE-7299.3.patch


 Enables the metadata only optimization (the one with OneNullRowInputFormat, 
 not the query-result-from-stats optimization)



--
This message was sent by Atlassian JIRA
(v6.2#6252)

