[jira] [Created] (HIVE-12388) getTables cannot get external tables

2015-11-11 Thread Navis (JIRA)
Navis created HIVE-12388:


 Summary: getTables cannot get external tables
 Key: HIVE-12388
 URL: https://issues.apache.org/jira/browse/HIVE-12388
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Reporter: Navis
Assignee: Navis
Priority: Critical


Due to a regression introduced by HIVE-7575, external tables are not returned 
when the "TABLE" type is specified as an argument. I'm working on this. Sorry.
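
To illustrate the expected behavior, here is a minimal sketch of a type filter 
in which both managed and external tables answer to the client-facing "TABLE" 
type. The class, enum, and method names are hypothetical stand-ins and are not 
taken from the Hive code base or from the eventual patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class TableTypeFilterDemo {

  /** Hypothetical stand-in for the table kinds stored in the metastore. */
  enum StoredType { MANAGED_TABLE, EXTERNAL_TABLE, VIRTUAL_VIEW }

  /** Maps a stored kind to the type name a JDBC client would request. */
  static String toClientType(StoredType t) {
    switch (t) {
      case MANAGED_TABLE:
      case EXTERNAL_TABLE:
        return "TABLE"; // both kinds are plain tables from the client's view
      case VIRTUAL_VIEW:
        return "VIEW";
      default:
        throw new IllegalArgumentException("unknown type: " + t);
    }
  }

  /** Keeps only the entries whose client-facing type was requested. */
  static List<StoredType> filter(List<StoredType> all, String... requested) {
    Set<String> wanted = new HashSet<>(Arrays.asList(requested));
    List<StoredType> out = new ArrayList<>();
    for (StoredType t : all) {
      if (wanted.contains(toClientType(t))) {
        out.add(t);
      }
    }
    return out;
  }
}
```

With this mapping, a request for "TABLE" returns both MANAGED_TABLE and 
EXTERNAL_TABLE entries, which is the behavior the bug report expects.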



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12373) Interner should return identical map or list

2015-11-09 Thread Navis (JIRA)
Navis created HIVE-12373:


 Summary: Interner should return identical map or list
 Key: HIVE-12373
 URL: https://issues.apache.org/jira/browse/HIVE-12373
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor


Currently, HiveStringUtils.intern(map/list) returns a new instance of the map 
or list. This breaks code that keeps using the original instance after passing 
it in, such as the following (Spark code in HiveMetastoreCatalog):

{code}
val serdeParameters = new java.util.HashMap[String, String]()
serdeInfo.setParameters(serdeParameters)
// these properties will be gone
table.serdeProperties.foreach { case (k, v) => serdeParameters.put(k, v) }
p.storage.serdeProperties.foreach { case (k, v) => serdeParameters.put(k, v) }
{code}

Luckily for Spark, the interner was, by mistake, not applied in the released 
versions of Hive (1.2.0, 1.2.1). But it would cause problems someday.
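
The hazard can be reproduced without Hive or Spark. The sketch below contrasts 
a copying interner with one that preserves identity; InternDemo and its methods 
are hypothetical stand-ins, not the actual HiveStringUtils code.

```java
import java.util.HashMap;
import java.util.Map;

final class InternDemo {

  private static String internOrNull(String s) {
    return s == null ? null : s.intern();
  }

  /** Copying variant: returns a NEW map, so later writes to 'in' are not seen. */
  static Map<String, String> internCopy(Map<String, String> in) {
    Map<String, String> out = new HashMap<>();
    for (Map.Entry<String, String> e : in.entrySet()) {
      out.put(internOrNull(e.getKey()), internOrNull(e.getValue()));
    }
    return out;
  }

  /** Identity-preserving variant: interns the entries and returns the SAME map. */
  static Map<String, String> internInPlace(Map<String, String> in) {
    Map<String, String> snapshot = new HashMap<>(in);
    in.clear();
    for (Map.Entry<String, String> e : snapshot.entrySet()) {
      in.put(internOrNull(e.getKey()), internOrNull(e.getValue()));
    }
    return in;
  }
}
```

If a setter stores the result of internCopy, entries that the caller adds to 
its own map afterwards never reach the stored map. With internInPlace the 
caller and the holder share one instance, so the Spark pattern above keeps 
working.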





[jira] [Created] (HIVE-12183) JsonParser/Generator should be closed for resycle

2015-10-14 Thread Navis (JIRA)
Navis created HIVE-12183:


 Summary: JsonParser/Generator should be closed for resycle
 Key: HIVE-12183
 URL: https://issues.apache.org/jira/browse/HIVE-12183
 Project: Hive
  Issue Type: Bug
Reporter: Navis
Priority: Trivial








[jira] [Created] (HIVE-11774) Show macro definition for desc function

2015-09-09 Thread Navis (JIRA)
Navis created HIVE-11774:


 Summary: Show macro definition for desc function 
 Key: HIVE-11774
 URL: https://issues.apache.org/jira/browse/HIVE-11774
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-11774.1.patch.txt

Currently, desc function shows nothing for a macro. It would be helpful if it 
showed the macro's definition.





[jira] [Created] (HIVE-11756) Avoid redundant key serialization in RS for distinct query

2015-09-08 Thread Navis (JIRA)
Navis created HIVE-11756:


 Summary: Avoid redundant key serialization in RS for distinct query
 Key: HIVE-11756
 URL: https://issues.apache.org/jira/browse/HIVE-11756
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial


Currently, Hive serializes the key twice to determine the length of the 
distribution key for distinct queries. This patch introduces IndexedSerializer 
to avoid that.
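
The idea of learning the distribution-key length in a single pass can be 
sketched as follows. This is only an illustration under assumed names: 
SinglePassKeySerializer, its Result type, and the zero-byte field terminator 
are hypothetical and do not reflect the IndexedSerializer API in the patch.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class SinglePassKeySerializer {

  /** Serialized key plus the length of its distribution-key prefix. */
  static final class Result {
    final byte[] bytes;
    final int distKeyLength;

    Result(byte[] bytes, int distKeyLength) {
      this.bytes = bytes;
      this.distKeyLength = distKeyLength;
    }
  }

  static Result serialize(String[] distributionCols, String[] distinctCols) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (String col : distributionCols) {
      write(out, col);
    }
    // Record the boundary while writing instead of serializing the prefix again.
    int distKeyLength = out.size();
    for (String col : distinctCols) {
      write(out, col);
    }
    return new Result(out.toByteArray(), distKeyLength);
  }

  private static void write(ByteArrayOutputStream out, String col) {
    byte[] b = col.getBytes(StandardCharsets.UTF_8);
    out.write(b, 0, b.length);
    out.write(0); // hypothetical field terminator
  }
}
```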





[jira] [Created] (HIVE-11752) Pre-materializing complex CTE queries

2015-09-07 Thread Navis (JIRA)
Navis created HIVE-11752:


 Summary: Pre-materializing complex CTE queries
 Key: HIVE-11752
 URL: https://issues.apache.org/jira/browse/HIVE-11752
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor


Currently, Hive treats a CTE clause as a simple alias for its query block, 
which causes redundant work if the CTE is used multiple times in a query. This 
patch introduces a reference threshold for pre-materializing the CTE clause as 
a volatile table (one that does not exist in the metastore in any form and is 
accessible only from the QB).
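
A reference threshold of this kind could work roughly as in the sketch below. 
The class name, the "greater than or equal" comparison, and the convention that 
a non-positive threshold disables materialization are all assumptions made for 
illustration, not details taken from the patch.

```java
import java.util.HashMap;
import java.util.Map;

final class CteMaterializationPlanner {

  private final int threshold;
  private final Map<String, Integer> refCounts = new HashMap<>();

  /** @param threshold references needed before materializing; <= 0 disables it (assumed) */
  CteMaterializationPlanner(int threshold) {
    this.threshold = threshold;
  }

  /** Called once for every place in the query that references the CTE. */
  void recordReference(String cteName) {
    refCounts.merge(cteName, 1, Integer::sum);
  }

  /** True when the CTE is referenced often enough to be worth materializing. */
  boolean shouldMaterialize(String cteName) {
    return threshold > 0 && refCounts.getOrDefault(cteName, 0) >= threshold;
  }
}
```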





[jira] [Created] (HIVE-11754) Not reachable code parts in StatsUtils

2015-09-07 Thread Navis (JIRA)
Navis created HIVE-11754:


 Summary: Not reachable code parts in StatsUtils
 Key: HIVE-11754
 URL: https://issues.apache.org/jira/browse/HIVE-11754
 Project: Hive
  Issue Type: Task
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-11754.1.patch.txt

No need to check "oi instanceof WritableConstantHiveCharObjectInspector" after 
"oi instanceof ConstantObjectInspector".

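The reason the second check is dead code can be shown with a small stand-alone 
sketch. The types below are simplified stand-ins that only mirror the names 
and the subtype relationship implied above; they are not the Hive classes.

```java
final class UnreachableBranchDemo {

  interface ObjectInspector {}

  interface ConstantObjectInspector extends ObjectInspector {}

  /** Stand-in: every instance is also a ConstantObjectInspector. */
  static final class WritableConstantHiveCharObjectInspector
      implements ConstantObjectInspector {}

  static String classify(ObjectInspector oi) {
    if (oi instanceof ConstantObjectInspector) {
      return "constant";
    } else if (oi instanceof WritableConstantHiveCharObjectInspector) {
      // Never reached: any such object already matched the branch above.
      return "constant-char";
    }
    return "other";
  }
}
```

The compiler accepts the second branch without complaint, which is why such 
code tends to survive until someone reads it closely.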




[jira] [Created] (HIVE-11707) Implement "dump metastore"

2015-08-31 Thread Navis (JIRA)
Navis created HIVE-11707:


 Summary: Implement "dump metastore"
 Key: HIVE-11707
 URL: https://issues.apache.org/jira/browse/HIVE-11707
 Project: Hive
  Issue Type: New Feature
  Components: Metastore
Reporter: Navis
Assignee: Navis
Priority: Minor


In projects, we've frequently needed to copy an existing metastore to another 
database (for another version of Hive or for other engines like Impala, Tajo, 
Spark, etc.). RDBs support dumping metastore data as a series of SQL 
statements, but if a different RDB is used the dump must be translated before 
it can be applied, which is time-consuming and error-prone work.





[jira] [Created] (HIVE-11706) Implement "show create database"

2015-08-31 Thread Navis (JIRA)
Navis created HIVE-11706:


 Summary: Implement "show create database"
 Key: HIVE-11706
 URL: https://issues.apache.org/jira/browse/HIVE-11706
 Project: Hive
  Issue Type: New Feature
  Components: Metastore
Reporter: Navis
Assignee: Navis
Priority: Trivial


HIVE-967 introduced "show create table". How about "show create database"?





[jira] [Created] (HIVE-11662) DP cannot be applied to external table which contains part-spec like directory

2015-08-26 Thread Navis (JIRA)
Navis created HIVE-11662:


 Summary: DP cannot be applied to external table which contains 
part-spec like directory
 Key: HIVE-11662
 URL: https://issues.apache.org/jira/browse/HIVE-11662
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial


Some users want to use part-spec-like directory names in their partitioned 
table locations, for example:
{noformat}
/something/warehouse/some_key=some_value
{noformat}

DP derives additional partition keys from the full path and fails with an 
exception like:
{noformat}
Failed with exception Partition spec {some_key=some_value, part_key=part_value} 
contains non-partition columns
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.MoveTask
{noformat}
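
The failure mode can be sketched in a few lines. PartSpecFromPath below is a 
hypothetical helper, not Hive's code: parsing the full path picks up the 
key=value segment that belongs to the table location, while parsing only the 
part of the path below the table location does not. Whether the actual fix 
takes this approach is not stated here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PartSpecFromPath {

  /** Collects every key=value path segment, in order of appearance. */
  static Map<String, String> parse(String path) {
    Map<String, String> spec = new LinkedHashMap<>();
    for (String segment : path.split("/")) {
      int eq = segment.indexOf('=');
      if (eq > 0 && eq < segment.length() - 1) {
        spec.put(segment.substring(0, eq), segment.substring(eq + 1));
      }
    }
    return spec;
  }

  /** Parses only the portion of the partition path below the table location. */
  static Map<String, String> parseRelative(String tableLocation, String partitionPath) {
    if (!partitionPath.startsWith(tableLocation)) {
      throw new IllegalArgumentException(partitionPath + " is not under " + tableLocation);
    }
    return parse(partitionPath.substring(tableLocation.length()));
  }
}
```

For a table located at /something/warehouse/some_key=some_value, parse on the 
full partition path yields both some_key and part_key, matching the spec in 
the error message, whereas parseRelative yields part_key only.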





[jira] [Created] (HIVE-11518) Provide interface to adjust required resource for tez tasks

2015-08-11 Thread Navis (JIRA)
Navis created HIVE-11518:


 Summary: Provide interface to adjust required resource for tez 
tasks
 Key: HIVE-11518
 URL: https://issues.apache.org/jira/browse/HIVE-11518
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor


Resource requirements vary from task to task, but currently they are fixed to 
a single value (via hive.tez.container.size). It would be good to be able to 
customize resource requirements to suit the expected work.

The suggested interface is quite simple.
{code}
public interface ResourceCalculator {

  Resource adjust(Resource resource, MapWork mapWork);

  Resource adjust(Resource resource, ReduceWork reduceWork);
}
{code}





[jira] [Created] (HIVE-11515) Still some possible race condition in DynamicPartitionPruner

2015-08-10 Thread Navis (JIRA)
Navis created HIVE-11515:


 Summary: Still some possible race condition in 
DynamicPartitionPruner
 Key: HIVE-11515
 URL: https://issues.apache.org/jira/browse/HIVE-11515
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor


Even after HIVE-9976, I still sometimes see a race condition in DPP. It is 
hard to reproduce, but it seems related to the fact that init() is called from 
a thread pool. With some delay in the queue, events from fast tasks can arrive 
before init() is called.
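
One general way to tolerate this ordering is to buffer events that arrive 
early and replay them once initialization completes. The sketch below shows 
that pattern only; BufferingPruner and its methods are hypothetical and are 
not the DynamicPartitionPruner API or the fix in the patch.

```java
import java.util.ArrayList;
import java.util.List;

final class BufferingPruner {

  private final List<String> pending = new ArrayList<>();
  private final List<String> processed = new ArrayList<>();
  private boolean initialized;

  /** May be called from any thread, before or after init(). */
  synchronized void addEvent(String event) {
    if (!initialized) {
      pending.add(event); // too early: keep it until init() has run
      return;
    }
    processed.add(event);
  }

  /** Possibly invoked late, from a thread pool. Replays buffered events in order. */
  synchronized void init() {
    initialized = true;
    processed.addAll(pending);
    pending.clear();
  }

  synchronized List<String> processed() {
    return new ArrayList<>(processed);
  }
}
```

Because addEvent and init are both synchronized on the same object, an event 
is either buffered before init runs or processed after it, never dropped.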





[jira] [Created] (HIVE-11506) Casting varchar/char type to string cannot be vectorized

2015-08-10 Thread Navis (JIRA)
Navis created HIVE-11506:


 Summary: Casting varchar/char type to string cannot be vectorized
 Key: HIVE-11506
 URL: https://issues.apache.org/jira/browse/HIVE-11506
 Project: Hive
  Issue Type: Improvement
  Components: Vectorization
Reporter: Navis
Assignee: Navis
Priority: Trivial


This cast is not defined in the vectorization context.
{code}
explain 
select cast(cast(cstring1 as varchar(10)) as string) x from alltypesorc order 
by x;
{code}

Mapper 
{noformat}
2015-08-10 17:02:08,003 INFO  [main]: physical.Vectorizer (Vectorizer.java:validateExprNodeDesc(1299)) - Failed to vectorize
org.apache.hadoop.hive.ql.metadata.HiveException: Unhandled cast input type: varchar(10)
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getCastToString(VectorizationContext.java:1543)
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUDFBridgeVectorExpression(VectorizationContext.java:1379)
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1177)
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:440)
    at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateExprNodeDesc(Vectorizer.java:1293)
    at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateExprNodeDesc(Vectorizer.java:1284)
    at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateSelectOperator(Vectorizer.java:1116)
    at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateMapWorkOperator(Vectorizer.java:906)
{noformat}





[jira] [Created] (HIVE-11002) Memory leakage on unsafe aggregation path with empty input

2015-06-13 Thread Navis (JIRA)
Navis created HIVE-11002:


 Summary: Memory leakage on unsafe aggregation path with empty input
 Key: HIVE-11002
 URL: https://issues.apache.org/jira/browse/HIVE-11002
 Project: Hive
  Issue Type: Bug
  Components: SQL
Reporter: Navis
Assignee: Navis
Priority: Minor


Currently, the unsafe-based hash is released on a 'next' call, but if the 
input is empty, 'next' is never called and the hash is never released.
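
A common remedy for this kind of leak is to make the release idempotent and to 
trigger it from close() as well as from the iteration path. The sketch below 
shows that pattern with a hypothetical ReleasingAggregation class; it is not 
the code from the patch, and the boolean flag merely stands in for freeing 
off-heap memory.

```java
import java.util.Iterator;

final class ReleasingAggregation implements AutoCloseable {

  private final Iterator<String> rows;
  private boolean released;

  ReleasingAggregation(Iterator<String> rows) {
    this.rows = rows;
  }

  /** Returns the next row, or null when exhausted (releasing the buffer then). */
  String next() {
    if (!rows.hasNext()) {
      release();
      return null;
    }
    return rows.next();
  }

  /** Covers the case where next() is never called, for example on empty input. */
  @Override
  public void close() {
    release();
  }

  private void release() {
    if (!released) {
      released = true; // stand-in for freeing the unsafe-based hash
    }
  }

  boolean isReleased() {
    return released;
  }
}
```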





[jira] [Created] (HIVE-10890) Provide implementable engine selector

2015-06-02 Thread Navis (JIRA)
Navis created HIVE-10890:


 Summary: Provide implementable engine selector
 Key: HIVE-10890
 URL: https://issues.apache.org/jira/browse/HIVE-10890
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial


Hive now supports three kinds of execution engines. It would be good to have 
an automatic engine selector so that the engine does not have to be set 
explicitly for each execution.





[jira] [Created] (HIVE-9806) Support partition locator for custom directory hierarchy

2015-02-26 Thread Navis (JIRA)
Navis created HIVE-9806:
---

 Summary: Support partition locator for custom directory hierarchy
 Key: HIVE-9806
 URL: https://issues.apache.org/jira/browse/HIVE-9806
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor


Currently, the relative partition directory must be the same as the partition 
name, which is not always applicable.





[jira] [Updated] (HIVE-9699) Extend PTFs to provide referenced columns for CP

2015-02-15 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9699:

Attachment: HIVE-9699.2.patch.txt

 Extend PTFs to provide referenced columns for CP
 

 Key: HIVE-9699
 URL: https://issues.apache.org/jira/browse/HIVE-9699
 Project: Hive
  Issue Type: Improvement
  Components: PTF-Windowing
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-9699.1.patch.txt, HIVE-9699.2.patch.txt


 As described in HIVE-9341, if PTFs can provide referenced column names, 
 the column pruner can use them.





[jira] [Updated] (HIVE-9699) Extend PTFs to provide referenced columns for CP

2015-02-15 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9699:

Status: Patch Available  (was: Open)

 Extend PTFs to provide referenced columns for CP
 

 Key: HIVE-9699
 URL: https://issues.apache.org/jira/browse/HIVE-9699
 Project: Hive
  Issue Type: Improvement
  Components: PTF-Windowing
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-9699.1.patch.txt, HIVE-9699.2.patch.txt


 As described in HIVE-9341, if PTFs can provide referenced column names, 
 the column pruner can use them.





[jira] [Created] (HIVE-9699) Extend PTFs to provide referenced columns for CP

2015-02-15 Thread Navis (JIRA)
Navis created HIVE-9699:
---

 Summary: Extend PTFs to provide referenced columns for CP
 Key: HIVE-9699
 URL: https://issues.apache.org/jira/browse/HIVE-9699
 Project: Hive
  Issue Type: Improvement
  Components: PTF-Windowing
Reporter: Navis
Assignee: Navis
Priority: Trivial


As described in HIVE-9341, if PTFs can provide referenced column names, the 
column pruner can use them.





[jira] [Updated] (HIVE-9699) Extend PTFs to provide referenced columns for CP

2015-02-15 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9699:

Attachment: HIVE-9699.1.patch.txt

 Extend PTFs to provide referenced columns for CP
 

 Key: HIVE-9699
 URL: https://issues.apache.org/jira/browse/HIVE-9699
 Project: Hive
  Issue Type: Improvement
  Components: PTF-Windowing
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-9699.1.patch.txt


 As described in HIVE-9341, if PTFs can provide referenced column names, 
 the column pruner can use them.





[jira] [Updated] (HIVE-9699) Extend PTFs to provide referenced columns for CP

2015-02-15 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9699:

Status: Patch Available  (was: Open)

 Extend PTFs to provide referenced columns for CP
 

 Key: HIVE-9699
 URL: https://issues.apache.org/jira/browse/HIVE-9699
 Project: Hive
  Issue Type: Improvement
  Components: PTF-Windowing
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-9699.1.patch.txt


 As described in HIVE-9341, if PTFs can provide referenced column names, 
 the column pruner can use them.





[jira] [Updated] (HIVE-9699) Extend PTFs to provide referenced columns for CP

2015-02-15 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9699:

Status: Open  (was: Patch Available)

 Extend PTFs to provide referenced columns for CP
 

 Key: HIVE-9699
 URL: https://issues.apache.org/jira/browse/HIVE-9699
 Project: Hive
  Issue Type: Improvement
  Components: PTF-Windowing
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-9699.1.patch.txt


 As described in HIVE-9341, if PTFs can provide referenced column names, 
 the column pruner can use them.





[jira] [Updated] (HIVE-9680) GlobalLimitOptimizer is not checking filters correctly

2015-02-12 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9680:

Status: Patch Available  (was: Open)

 GlobalLimitOptimizer is not checking filters correctly 
 ---

 Key: HIVE-9680
 URL: https://issues.apache.org/jira/browse/HIVE-9680
 Project: Hive
  Issue Type: Bug
  Components: Query Planning
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-9680.1.patch.txt


 Some predicates may not be included in opToPartPruner.





[jira] [Updated] (HIVE-9680) GlobalLimitOptimizer is not checking filters correctly

2015-02-12 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9680:

Attachment: HIVE-9680.1.patch.txt

 GlobalLimitOptimizer is not checking filters correctly 
 ---

 Key: HIVE-9680
 URL: https://issues.apache.org/jira/browse/HIVE-9680
 Project: Hive
  Issue Type: Bug
  Components: Query Planning
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-9680.1.patch.txt


 Some predicates may not be included in opToPartPruner.





[jira] [Created] (HIVE-9680) GlobalLimitOptimizer is not checking filters correctly

2015-02-12 Thread Navis (JIRA)
Navis created HIVE-9680:
---

 Summary: GlobalLimitOptimizer is not checking filters correctly 
 Key: HIVE-9680
 URL: https://issues.apache.org/jira/browse/HIVE-9680
 Project: Hive
  Issue Type: Bug
  Components: Query Planning
Reporter: Navis
Assignee: Navis
Priority: Trivial


Some predicates may not be included in opToPartPruner.





[jira] [Updated] (HIVE-9138) Add some explain to PTF operator

2015-02-12 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9138:

Attachment: HIVE-9138.5.patch.txt

 Add some explain to PTF operator
 

 Key: HIVE-9138
 URL: https://issues.apache.org/jira/browse/HIVE-9138
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-9138.1.patch.txt, HIVE-9138.2.patch.txt, 
 HIVE-9138.3.patch.txt, HIVE-9138.4.patch.txt, HIVE-9138.5.patch.txt


 PTFOperator shows nothing in the explain output, making it hard to 
 understand its internal workings.





[jira] [Commented] (HIVE-9495) Map Side aggregation affecting map performance

2015-02-12 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319590#comment-14319590
 ] 

Navis commented on HIVE-9495:
-

I think I broke something while rebasing on trunk.

 Map Side aggregation affecting map performance
 --

 Key: HIVE-9495
 URL: https://issues.apache.org/jira/browse/HIVE-9495
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.14.0
 Environment: RHEL 6.4
 Hortonworks Hadoop 2.2
Reporter: Anand Sridharan
 Attachments: HIVE-9495.1.patch.txt, profiler_screenshot.PNG


 When running a simple aggregation query with hive.map.aggr=true, map tasks 
 take much longer in Hive 0.14 than with hive.map.aggr=false.
 For example, consider the query:
 {code}
 INSERT OVERWRITE TABLE lineitem_tgt_agg
 select alias.a0 as a0,
   alias.a2 as a1,
   alias.a1 as a2,
   alias.a3 as a3,
   alias.a4 as a4
 from (
   select alias.a0 as a0,
     SUM(alias.a1) as a1,
     SUM(alias.a2) as a2,
     SUM(alias.a3) as a3,
     SUM(alias.a4) as a4
   from (
     select lineitem_sf500.l_orderkey as a0,
       CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * (1 - lineitem_sf500.l_discount) * (1 + lineitem_sf500.l_tax) as double) as a1,
       lineitem_sf500.l_quantity as a2,
       CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * lineitem_sf500.l_discount as double) as a3,
       CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * lineitem_sf500.l_tax as double) as a4
     from lineitem_sf500
   ) alias
   group by alias.a0
 ) alias;
 {code}
 The above query was run with ~376 GB of data / ~3 billion records in the source.
 It takes ~10 minutes with hive.map.aggr=false.
 With map-side aggregation enabled, the map tasks don't complete even 
 after an hour.





[jira] [Commented] (HIVE-9597) substition variables stopping when a undefined variable occur

2015-02-12 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319577#comment-14319577
 ] 

Navis commented on HIVE-9597:
-

This seems to have been fixed by HIVE-6037 (hive-0.14.0).

 substition variables stopping when a undefined variable occur
 -

 Key: HIVE-9597
 URL: https://issues.apache.org/jira/browse/HIVE-9597
 Project: Hive
  Issue Type: Bug
  Components: Parser
Affects Versions: 0.13.0
 Environment: hortonworks 2.1
Reporter: ErwanMAS
Priority: Critical
 Fix For: 0.14.0


 {noformat}
 set hivevar:A_VALUE_1=A ;
 set hivevar:A_VALUE_3=C ;
 explain select ${A_VALUE_1},${A_VALUE_2},${A_VALUE_3} from foobar ;
 set hivevar:A_VALUE_2=B ;
 explain select ${A_VALUE_1},${A_VALUE_2},${A_VALUE_3} from foobar ;
 {noformat}
 In the first query, the variable A_VALUE_3 is not substituted because 
 A_VALUE_2 is not defined!
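
The behavior the reporter expects, where an undefined variable is left as-is 
and substitution continues with the remaining variables, can be sketched as 
follows. VarSubstDemo is a hypothetical illustration and not Hive's variable 
substitution code; it also ignores prefixes such as hivevar: and nested 
references.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class VarSubstDemo {

  private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

  /** Replaces each ${name} found in vars; unknown names are left untouched. */
  static String substitute(String text, Map<String, String> vars) {
    Matcher m = VAR.matcher(text);
    StringBuffer sb = new StringBuffer();
    while (m.find()) {
      String value = vars.get(m.group(1));
      // Keep the original ${name} text when the variable is undefined,
      // then carry on with the rest of the input instead of stopping.
      String replacement = value != null ? value : m.group(0);
      m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
    }
    m.appendTail(sb);
    return sb.toString();
  }
}
```

With only A_VALUE_1=A and A_VALUE_3=C defined, the first query from the report 
would become "explain select A,${A_VALUE_2},C from foobar ;".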





[jira] [Resolved] (HIVE-9597) substition variables stopping when a undefined variable occur

2015-02-12 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis resolved HIVE-9597.
-
   Resolution: Duplicate
Fix Version/s: 0.14.0

 substition variables stopping when a undefined variable occur
 -

 Key: HIVE-9597
 URL: https://issues.apache.org/jira/browse/HIVE-9597
 Project: Hive
  Issue Type: Bug
  Components: Parser
Affects Versions: 0.13.0
 Environment: hortonworks 2.1
Reporter: ErwanMAS
Priority: Critical
 Fix For: 0.14.0


 {noformat}
 set hivevar:A_VALUE_1=A ;
 set hivevar:A_VALUE_3=C ;
 explain select ${A_VALUE_1},${A_VALUE_2},${A_VALUE_3} from foobar ;
 set hivevar:A_VALUE_2=B ;
 explain select ${A_VALUE_1},${A_VALUE_2},${A_VALUE_3} from foobar ;
 {noformat}
 In the first query, the variable A_VALUE_3 is not substituted because 
 A_VALUE_2 is not defined!





[jira] [Updated] (HIVE-9495) Map Side aggregation affecting map performance

2015-02-12 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9495:

Status: Open  (was: Patch Available)

 Map Side aggregation affecting map performance
 --

 Key: HIVE-9495
 URL: https://issues.apache.org/jira/browse/HIVE-9495
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.14.0
 Environment: RHEL 6.4
 Hortonworks Hadoop 2.2
Reporter: Anand Sridharan
 Attachments: HIVE-9495.1.patch.txt, profiler_screenshot.PNG


 When running a simple aggregation query with hive.map.aggr=true, map tasks 
 take much longer in Hive 0.14 than with hive.map.aggr=false.
 For example, consider the query:
 {code}
 INSERT OVERWRITE TABLE lineitem_tgt_agg
 select alias.a0 as a0,
   alias.a2 as a1,
   alias.a1 as a2,
   alias.a3 as a3,
   alias.a4 as a4
 from (
   select alias.a0 as a0,
     SUM(alias.a1) as a1,
     SUM(alias.a2) as a2,
     SUM(alias.a3) as a3,
     SUM(alias.a4) as a4
   from (
     select lineitem_sf500.l_orderkey as a0,
       CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * (1 - lineitem_sf500.l_discount) * (1 + lineitem_sf500.l_tax) as double) as a1,
       lineitem_sf500.l_quantity as a2,
       CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * lineitem_sf500.l_discount as double) as a3,
       CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * lineitem_sf500.l_tax as double) as a4
     from lineitem_sf500
   ) alias
   group by alias.a0
 ) alias;
 {code}
 The above query was run with ~376 GB of data / ~3 billion records in the source.
 It takes ~10 minutes with hive.map.aggr=false.
 With map-side aggregation enabled, the map tasks don't complete even 
 after an hour.





[jira] [Updated] (HIVE-2573) Create per-session function registry

2015-02-12 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-2573:

   Resolution: Fixed
Fix Version/s: 1.2.0
   Status: Resolved  (was: Patch Available)

Committed to trunk, at last. Thanks Jason!

 Create per-session function registry 
 -

 Key: HIVE-2573
 URL: https://issues.apache.org/jira/browse/HIVE-2573
 Project: Hive
  Issue Type: Improvement
  Components: Server Infrastructure
Reporter: Navis
Assignee: Navis
Priority: Minor
 Fix For: 1.2.0

 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2573.D3231.1.patch, 
 HIVE-2573.1.patch.txt, HIVE-2573.10.patch.txt, HIVE-2573.11.patch.txt, 
 HIVE-2573.12.patch.txt, HIVE-2573.13.patch.txt, HIVE-2573.14.patch.txt, 
 HIVE-2573.15.patch.txt, HIVE-2573.2.patch.txt, HIVE-2573.3.patch.txt, 
 HIVE-2573.4.patch.txt, HIVE-2573.5.patch, HIVE-2573.6.patch, 
 HIVE-2573.7.patch, HIVE-2573.8.patch.txt, HIVE-2573.9.patch.txt


 Currently the function registry is a shared resource and can be overridden by 
 other users when using HiveServer. Providing a per-session function registry 
 would prevent this.





[jira] [Commented] (HIVE-9138) Add some explain to PTF operator

2015-02-11 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14317444#comment-14317444
 ] 

Navis commented on HIVE-9138:
-

Explainable was introduced to avoid implementing Serializable just for the 
explain result. I can remove it, but then PTFInputDef, etc. would have to be 
Serializable. The changes in ColumnPruner are basically for setting the output 
shape of the partition function for explain. These are transient fields used 
only when the PTF is first built, so they seem safe to change.

 Add some explain to PTF operator
 

 Key: HIVE-9138
 URL: https://issues.apache.org/jira/browse/HIVE-9138
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-9138.1.patch.txt, HIVE-9138.2.patch.txt, 
 HIVE-9138.3.patch.txt


 PTFOperator shows nothing in the explain output, making it hard to 
 understand its internal workings.





[jira] [Updated] (HIVE-2573) Create per-session function registry

2015-02-11 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-2573:

Attachment: HIVE-2573.15.patch.txt

Addressed comments (except one); I cannot reproduce the failures in 
TestMacroSemanticAnalyzer.

 Create per-session function registry 
 -

 Key: HIVE-2573
 URL: https://issues.apache.org/jira/browse/HIVE-2573
 Project: Hive
  Issue Type: Improvement
  Components: Server Infrastructure
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2573.D3231.1.patch, 
 HIVE-2573.1.patch.txt, HIVE-2573.10.patch.txt, HIVE-2573.11.patch.txt, 
 HIVE-2573.12.patch.txt, HIVE-2573.13.patch.txt, HIVE-2573.14.patch.txt, 
 HIVE-2573.15.patch.txt, HIVE-2573.2.patch.txt, HIVE-2573.3.patch.txt, 
 HIVE-2573.4.patch.txt, HIVE-2573.5.patch, HIVE-2573.6.patch, 
 HIVE-2573.7.patch, HIVE-2573.8.patch.txt, HIVE-2573.9.patch.txt


 Currently the function registry is a shared resource and can be overridden by 
 other users when using HiveServer. Providing a per-session function registry 
 would prevent this.





[jira] [Updated] (HIVE-9138) Add some explain to PTF operator

2015-02-11 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9138:

Attachment: HIVE-9138.4.patch.txt

Addressed comments

 Add some explain to PTF operator
 

 Key: HIVE-9138
 URL: https://issues.apache.org/jira/browse/HIVE-9138
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-9138.1.patch.txt, HIVE-9138.2.patch.txt, 
 HIVE-9138.3.patch.txt, HIVE-9138.4.patch.txt


 PTFOperator shows nothing in the explain output, making it hard to 
 understand its internal workings.





[jira] [Commented] (HIVE-9138) Add some explain to PTF operator

2015-02-11 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14317645#comment-14317645
 ] 

Navis commented on HIVE-9138:
-

I wish HIVE-6470 would be applied to trunk some day. I hate bad indentation.

 Add some explain to PTF operator
 

 Key: HIVE-9138
 URL: https://issues.apache.org/jira/browse/HIVE-9138
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-9138.1.patch.txt, HIVE-9138.2.patch.txt, 
 HIVE-9138.3.patch.txt, HIVE-9138.4.patch.txt


 PTFOperator shows nothing in the explain output, making it hard to 
 understand its internal workings.





[jira] [Resolved] (HIVE-9598) java.lang.IllegalMonitorStateException/java.util.concurrent.locks.ReentrantLock$Sync.tryRelease if ResultSet.closed called after Statement.close called

2015-02-11 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis resolved HIVE-9598.
-
   Resolution: Duplicate
Fix Version/s: 0.14.0

 java.lang.IllegalMonitorStateException/java.util.concurrent.locks.ReentrantLock$Sync.tryRelease
  if ResultSet.closed called after Statement.close called
 ---

 Key: HIVE-9598
 URL: https://issues.apache.org/jira/browse/HIVE-9598
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 0.13.0
Reporter: N Campbell
 Fix For: 0.14.0


 http://docs.oracle.com/javase/7/docs/api/java/sql/ResultSet.html#close()
 http://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#close()
 Statement stmt;
 try {
   stmt = dbConnection.createStatement();
   stmt.executeQuery("select * from t");
   ResultSet rs = stmt.getResultSet();
   // Close the statement before the result set
   stmt.close();
   if (rs != null) {
     System.out.println("IS NOT NULL");
     // Hive does not implement isClosed()
     // if (!rs.isClosed()) {
     //   System.out.println("IS NOT CLOSED");
     // }
     rs.close();  // this call throws the exception below
   }
 } catch (SQLException e) {
   // TODO Auto-generated catch block
   e.printStackTrace();
 }
 Exception in thread "main" java.lang.IllegalMonitorStateException
   at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:166)
   at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1271)
   at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:471)
   at org.apache.hive.jdbc.HiveStatement.closeClientOperation(HiveStatement.java:175)
   at org.apache.hive.jdbc.HiveQueryResultSet.close(HiveQueryResultSet.java:293)
 /D:/JDBC/Hortonworks_Hive13/commons-configuration-1.6.jar
 /D:/JDBC/Hortonworks_Hive13/commons-logging-1.1.3.jar
 /D:/JDBC/Hortonworks_Hive13/hadoop-common-2.4.0.2.1.1.0-385.jar
 /D:/JDBC/Hortonworks_Hive13/hive-exec-0.13.0.2.1.1.0-385.jar
 /D:/JDBC/Hortonworks_Hive13/hive-jdbc-0.13.0.2.1.1.0-385.jar
 /D:/JDBC/Hortonworks_Hive13/hive-service-0.13.0.2.1.1.0-385.jar
 /D:/JDBC/Hortonworks_Hive13/httpclient-4.2.5.jar
 /D:/JDBC/Hortonworks_Hive13/httpcore-4.2.5.jar
 /D:/JDBC/Hortonworks_Hive13/libfb303-0.9.0.jar
 /D:/JDBC/Hortonworks_Hive13/libthrift-0.9.0.jar
 /D:/JDBC/Hortonworks_Hive13/log4j-1.2.16.jar
 /D:/JDBC/Hortonworks_Hive13/slf4j-api-1.7.5.jar
 /D:/JDBC/Hortonworks_Hive13/slf4j-log4j12-1.7.5.jar
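The stack trace in this report ends in ReentrantLock.unlock(), which the JDK documents as throwing IllegalMonitorStateException when the calling thread does not hold the lock. The following is a minimal sketch of that behaviour and of a close method that always pairs lock() with unlock(). GuardedClose and its methods are hypothetical names for illustration; this is not the Hive JDBC code or the actual fix.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration class; not part of Hive.
final class GuardedClose {
    private final ReentrantLock lock = new ReentrantLock();
    private boolean closed;

    // Demonstrates the failure mode: unlock() without a matching lock().
    static boolean unlockWithoutHoldingThrows() {
        try {
            new ReentrantLock().unlock();
            return false;
        } catch (IllegalMonitorStateException e) {
            return true;
        }
    }

    // Idempotent close: returns true only for the call that performs the close.
    boolean close() {
        lock.lock();
        try {
            if (closed) {
                return false; // already closed, nothing to do
            }
            closed = true;
            return true;
        } finally {
            lock.unlock(); // safe: this thread acquired the lock just above
        }
    }

    public static void main(String[] args) {
        GuardedClose resource = new GuardedClose();
        System.out.println(unlockWithoutHoldingThrows());
        System.out.println(resource.close());
        System.out.println(resource.close());
    }
}
```

Because the release sits in a finally block right after the acquire, a second close() call returns quietly instead of unlocking a lock it never took.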





[jira] [Commented] (HIVE-9632) inconsistent results between year(), month(), day(), and the actual values in formulas

2015-02-11 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14317746#comment-14317746
 ] 

Navis commented on HIVE-9632:
-

Looks like HIVE-9278. Could you check this in hive-1.0?

 inconsistent results between year(), month(), day(), and the actual values in 
 formulas
 --

 Key: HIVE-9632
 URL: https://issues.apache.org/jira/browse/HIVE-9632
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.14.0
 Environment: CentOS 6.5, HDP 2.2
Reporter: Robert Miller

 In wanting to create a date dimension value which would match our existing 
 database environment, I figured I would be able to do as I have done in the 
 past and use the following formula:
 (year(date)*1)+(month(date)*100)+day(date)
 Given the date of 2015-01-09, the above formula should result in a value of 
 20150109.  Instead, the resulting value is 20353515.
 SELECT
adjusted_activity_date_utc,
year(adjusted_activity_date_utc),
month(adjusted_activity_date_utc),
day(adjusted_activity_date_utc),

 (year(adjusted_activity_date_utc)*1)+(month(adjusted_activity_date_utc)*100)+day(adjusted_activity_date_utc),
(year(adjusted_activity_date_utc)*1),
(month(adjusted_activity_date_utc)*100),
day(adjusted_activity_date_utc)
from event_histories limit 5;
 OK
 adjusted_activity_date_utc  _c1   _c2  _c3  _c4       _c5   _c6  _c7
 2015-01-09                  2015  1    9    20353515  2015  100  9
 2015-01-09                  2015  1    9    20353515  2015  100  9
 2015-01-09                  2015  1    9    20353515  2015  100  9
 2015-01-09                  2015  1    9    20353515  2015  100  9
 2015-01-09                  2015  1    9    20353515  2015  100  9
 Oddly enough, this works as expected when a specific date value is used for 
 the column.
 I have tried this with partition and non-partition columns and found the 
 result to be the same.
 SELECT
adjusted_activity_date_utc,
year(adjusted_activity_date_utc),
month(adjusted_activity_date_utc),
day(adjusted_activity_date_utc),

 (year(adjusted_activity_date_utc)*1)+(month(adjusted_activity_date_utc)*100)+day(adjusted_activity_date_utc),
(year(adjusted_activity_date_utc)*1),
(month(adjusted_activity_date_utc)*100),
day(adjusted_activity_date_utc)
from event_histories
where adjusted_activity_date_utc = '2015-01-09'
limit 5;
 OK
 adjusted_activity_date_utc  _c1   _c2  _c3  _c4       _c5   _c6  _c7
 2015-01-09                  2015  1    9    20150109  2015  100  9
 2015-01-09                  2015  1    9    20150109  2015  100  9
 2015-01-09                  2015  1    9    20150109  2015  100  9
 2015-01-09                  2015  1    9    20150109  2015  100  9
 2015-01-09                  2015  1    9    20150109  2015  100  9
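As a cross-check of the expected value, the date-key arithmetic from the report can be sketched in plain Java. The year multiplier appears as 1 in the report, which cannot produce 20150109, so the 10000 used here is an assumption. DateKey and encode are hypothetical names, and this is not Hive code.

```java
import java.time.LocalDate;

// Hypothetical illustration class; not part of Hive.
final class DateKey {
    // Encodes a date as the integer yyyyMMdd.
    // The 10000 multiplier is an assumption (see the paragraph above).
    static int encode(LocalDate date) {
        return date.getYear() * 10000
                + date.getMonthValue() * 100
                + date.getDayOfMonth();
    }

    public static void main(String[] args) {
        // 2015 * 10000 + 1 * 100 + 9
        System.out.println(encode(LocalDate.of(2015, 1, 9))); // prints 20150109
    }
}
```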





[jira] [Updated] (HIVE-9138) Add some explain to PTF operator

2015-02-11 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9138:

Attachment: (was: HIVE-9138.4.patch.txt)

 Add some explain to PTF operator
 

 Key: HIVE-9138
 URL: https://issues.apache.org/jira/browse/HIVE-9138
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-9138.1.patch.txt, HIVE-9138.2.patch.txt, 
 HIVE-9138.3.patch.txt


 PTFOperator does not show anything in the EXPLAIN output, making it hard to 
 understand its internal workings. 





[jira] [Updated] (HIVE-9138) Add some explain to PTF operator

2015-02-11 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9138:

Attachment: HIVE-9138.4.patch.txt

Missed one file

 Add some explain to PTF operator
 

 Key: HIVE-9138
 URL: https://issues.apache.org/jira/browse/HIVE-9138
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-9138.1.patch.txt, HIVE-9138.2.patch.txt, 
 HIVE-9138.3.patch.txt, HIVE-9138.4.patch.txt


 PTFOperator does not show anything in the EXPLAIN output, making it hard to 
 understand its internal workings. 





[jira] [Updated] (HIVE-9618) Deduplicate RS keys for ptf/windowing

2015-02-11 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9618:

Attachment: HIVE-9618.3.patch.txt

 Deduplicate RS keys for ptf/windowing
 -

 Key: HIVE-9618
 URL: https://issues.apache.org/jira/browse/HIVE-9618
 Project: Hive
  Issue Type: Improvement
  Components: PTF-Windowing
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-9618.1.patch.txt, HIVE-9618.2.patch.txt, 
 HIVE-9618.3.patch.txt


 Currently, a partition spec that uses the same column for partition-by and 
 order-by produces a duplicated key column in the RS. For example, 
 {noformat}
 explain
 select p_mfgr, p_name, p_size, 
 rank() over (partition by p_mfgr order by p_name) as r, 
 dense_rank() over (partition by p_mfgr order by p_name) as dr, 
 sum(p_retailprice) over (partition by p_mfgr order by p_name rows between 
 unbounded preceding and current row)  as s1
 from noop(on noopwithmap(on noop(on part 
 partition by p_mfgr 
 order by p_mfgr, p_name
 )))
 {noformat}
 partition by p_mfgr order by p_mfgr, p_name produces duplicated key columns, 
 as shown below
 {noformat}
 Reduce Output Operator
 key expressions: p_mfgr (type: string), p_mfgr (type: string), p_name 
 (type: string)
 sort order: +++
 Map-reduce partition columns: p_mfgr (type: string)
 value expressions: p_size (type: int), p_retailprice (type: double)
 {noformat}
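The deduplication this issue asks for can be sketched as an order-preserving merge of the partition-by and order-by columns. This is only an illustration under simplifying assumptions: columns are plain strings, sort direction is ignored, and KeyDedup and reduceSinkKeys are hypothetical names rather than anything from the attached patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical illustration class; not part of Hive.
final class KeyDedup {
    // Partition columns come first, then any order columns not already present.
    static List<String> reduceSinkKeys(List<String> partitionCols, List<String> orderCols) {
        LinkedHashSet<String> keys = new LinkedHashSet<>(partitionCols);
        keys.addAll(orderCols); // duplicates are dropped, first position is kept
        return new ArrayList<>(keys);
    }

    public static void main(String[] args) {
        List<String> keys = reduceSinkKeys(
                Arrays.asList("p_mfgr"),
                Arrays.asList("p_mfgr", "p_name"));
        System.out.println(keys); // prints [p_mfgr, p_name]
    }
}
```

With the example from the description, the key list shrinks from three expressions to two, so the sort order string would need one fewer entry as well.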





[jira] [Updated] (HIVE-9618) Deduplicate RS keys for ptf/windowing

2015-02-11 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9618:

Status: Patch Available  (was: Open)

Rebased to trunk

 Deduplicate RS keys for ptf/windowing
 -

 Key: HIVE-9618
 URL: https://issues.apache.org/jira/browse/HIVE-9618
 Project: Hive
  Issue Type: Improvement
  Components: PTF-Windowing
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-9618.1.patch.txt, HIVE-9618.2.patch.txt, 
 HIVE-9618.3.patch.txt


 Currently, a partition spec that uses the same column for partition-by and 
 order-by produces a duplicated key column in the RS. For example, 
 {noformat}
 explain
 select p_mfgr, p_name, p_size, 
 rank() over (partition by p_mfgr order by p_name) as r, 
 dense_rank() over (partition by p_mfgr order by p_name) as dr, 
 sum(p_retailprice) over (partition by p_mfgr order by p_name rows between 
 unbounded preceding and current row)  as s1
 from noop(on noopwithmap(on noop(on part 
 partition by p_mfgr 
 order by p_mfgr, p_name
 )))
 {noformat}
 partition by p_mfgr order by p_mfgr, p_name produces duplicated key columns, 
 as shown below
 {noformat}
 Reduce Output Operator
 key expressions: p_mfgr (type: string), p_mfgr (type: string), p_name 
 (type: string)
 sort order: +++
 Map-reduce partition columns: p_mfgr (type: string)
 value expressions: p_size (type: int), p_retailprice (type: double)
 {noformat}





[jira] [Updated] (HIVE-9495) Map Side aggregation affecting map performance

2015-02-11 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9495:

Attachment: HIVE-9495.1.patch.txt

Replaced the get/put calls with a single putIfAbsent call, but couldn't find any 
noticeable performance improvement. 
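For readers unfamiliar with the pattern, the change described above (one putIfAbsent call instead of a get followed by a put) can be sketched with java.util.Map. The map type, the key type, and the names AggBuffers, getOrCreate and sumByKey are assumptions for illustration; the structures used in the patch are not shown in this message.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class; not part of Hive.
final class AggBuffers {
    // One hash lookup: insert a fresh buffer unless the key already has one.
    static long[] getOrCreate(Map<String, long[]> buffers, String key) {
        long[] fresh = new long[1];
        long[] existing = buffers.putIfAbsent(key, fresh);
        return existing != null ? existing : fresh;
    }

    // Sums values per key, the way a map-side partial aggregation would.
    static Map<String, long[]> sumByKey(String[] keys, long[] values) {
        Map<String, long[]> buffers = new HashMap<>();
        for (int i = 0; i < keys.length; i++) {
            getOrCreate(buffers, keys[i])[0] += values[i];
        }
        return buffers;
    }

    public static void main(String[] args) {
        Map<String, long[]> sums =
                sumByKey(new String[] {"a", "b", "a"}, new long[] {1, 2, 3});
        System.out.println(sums.get("a")[0]); // prints 4
    }
}
```

One trade-off in this form is that a fresh buffer is allocated on every call, even when the key already exists.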

 Map Side aggregation affecting map performance
 --

 Key: HIVE-9495
 URL: https://issues.apache.org/jira/browse/HIVE-9495
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.14.0
 Environment: RHEL 6.4
 Hortonworks Hadoop 2.2
Reporter: Anand Sridharan
 Attachments: HIVE-9495.1.patch.txt, profiler_screenshot.PNG


 When running a simple aggregation query with hive.map.aggr=true, map 
 tasks take much longer in Hive 0.14 than with hive.map.aggr=false.
 e.g.
 Consider the query:
 {code}
 INSERT OVERWRITE TABLE lineitem_tgt_agg
 select alias.a0 as a0,
  alias.a2 as a1,
  alias.a1 as a2,
  alias.a3 as a3,
  alias.a4 as a4
 from (
  select alias.a0 as a0,
   SUM(alias.a1) as a1,
   SUM(alias.a2) as a2,
   SUM(alias.a3) as a3,
   SUM(alias.a4) as a4
  from (
   select lineitem_sf500.l_orderkey as a0,
CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * (1 - 
 lineitem_sf500.l_discount) * (1 + lineitem_sf500.l_tax) as double) as a1,
lineitem_sf500.l_quantity as a2,
CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * 
 lineitem_sf500.l_discount as double) as a3,
CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * 
 lineitem_sf500.l_tax as double) as a4
   from lineitem_sf500
   ) alias
  group by alias.a0
  ) alias;
 {code}
 The above query was run with ~376GB of data / ~3billion records in the source.
 It takes ~10 minutes with hive.map.aggr=false.
 With map side aggregation set to true, the map tasks don't complete even 
 after an hour.





[jira] [Updated] (HIVE-9495) Map Side aggregation affecting map performance

2015-02-11 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9495:

Status: Patch Available  (was: Open)

 Map Side aggregation affecting map performance
 --

 Key: HIVE-9495
 URL: https://issues.apache.org/jira/browse/HIVE-9495
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.14.0
 Environment: RHEL 6.4
 Hortonworks Hadoop 2.2
Reporter: Anand Sridharan
 Attachments: HIVE-9495.1.patch.txt, profiler_screenshot.PNG


 When running a simple aggregation query with hive.map.aggr=true, map 
 tasks take much longer in Hive 0.14 than with hive.map.aggr=false.
 e.g.
 Consider the query:
 {code}
 INSERT OVERWRITE TABLE lineitem_tgt_agg
 select alias.a0 as a0,
  alias.a2 as a1,
  alias.a1 as a2,
  alias.a3 as a3,
  alias.a4 as a4
 from (
  select alias.a0 as a0,
   SUM(alias.a1) as a1,
   SUM(alias.a2) as a2,
   SUM(alias.a3) as a3,
   SUM(alias.a4) as a4
  from (
   select lineitem_sf500.l_orderkey as a0,
CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * (1 - 
 lineitem_sf500.l_discount) * (1 + lineitem_sf500.l_tax) as double) as a1,
lineitem_sf500.l_quantity as a2,
CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * 
 lineitem_sf500.l_discount as double) as a3,
CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * 
 lineitem_sf500.l_tax as double) as a4
   from lineitem_sf500
   ) alias
  group by alias.a0
  ) alias;
 {code}
 The above query was run with ~376GB of data / ~3billion records in the source.
 It takes ~10 minutes with hive.map.aggr=false.
 With map side aggregation set to true, the map tasks don't complete even 
 after an hour.





[jira] [Commented] (HIVE-9486) Use session classloader instead of application loader

2015-02-09 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313515#comment-14313515
 ] 

Navis commented on HIVE-9486:
-

[~szehon] I considered using 'Utilities.getSessionSpecifiedClassLoader', but 
it seemed better to have one in the common module via JavaUtils.getClassLoader(), 
which is safe to call without hive-exec or other modules. We modify 
SessionState.HiveConf.ClassLoader and the thread context loader together (at 
least in Hive), so the result would be the same. Any better idea?
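The idea of resolving classes through the thread context loader rather than the application loader can be sketched as follows. This is a generic JDK-only illustration, not a copy of JavaUtils.getClassLoader(); Loaders, preferred and load are hypothetical names.

```java
// Hypothetical illustration class; not part of Hive.
final class Loaders {
    // Prefer the thread context class loader, which a session can replace,
    // and fall back to the loader that loaded this class.
    static ClassLoader preferred() {
        ClassLoader contextLoader = Thread.currentThread().getContextClassLoader();
        return contextLoader != null ? contextLoader : Loaders.class.getClassLoader();
    }

    // Loads a class by name through the preferred loader.
    static Class<?> load(String className) {
        try {
            return Class.forName(className, true, preferred());
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("Class not found: " + className, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(load("java.lang.String").getName()); // prints java.lang.String
    }
}
```

A helper of this shape depends only on the JDK, which matches the point in the comment about being callable without hive-exec.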

 Use session classloader instead of application loader
 -

 Key: HIVE-9486
 URL: https://issues.apache.org/jira/browse/HIVE-9486
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-9486.1.patch.txt, HIVE-9486.2.patch.txt


 From http://www.mail-archive.com/dev@hive.apache.org/msg107615.html
 Looks reasonable





[jira] [Updated] (HIVE-9618) Deduplicate RS keys for ptf/windowing

2015-02-09 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9618:

Attachment: HIVE-9618.2.patch.txt

Addressed comment and updated gold file

 Deduplicate RS keys for ptf/windowing
 -

 Key: HIVE-9618
 URL: https://issues.apache.org/jira/browse/HIVE-9618
 Project: Hive
  Issue Type: Improvement
  Components: PTF-Windowing
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-9618.1.patch.txt, HIVE-9618.2.patch.txt


 Currently, a partition spec that uses the same column for partition-by and 
 order-by produces a duplicated key column in the RS. For example, 
 {noformat}
 explain
 select p_mfgr, p_name, p_size, 
 rank() over (partition by p_mfgr order by p_name) as r, 
 dense_rank() over (partition by p_mfgr order by p_name) as dr, 
 sum(p_retailprice) over (partition by p_mfgr order by p_name rows between 
 unbounded preceding and current row)  as s1
 from noop(on noopwithmap(on noop(on part 
 partition by p_mfgr 
 order by p_mfgr, p_name
 )))
 {noformat}
 partition by p_mfgr order by p_mfgr, p_name produces duplicated key columns, 
 as shown below
 {noformat}
 Reduce Output Operator
 key expressions: p_mfgr (type: string), p_mfgr (type: string), p_name 
 (type: string)
 sort order: +++
 Map-reduce partition columns: p_mfgr (type: string)
 value expressions: p_size (type: int), p_retailprice (type: double)
 {noformat}





[jira] [Updated] (HIVE-9618) Deduplicate RS keys for ptf/windowing

2015-02-09 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9618:

Status: Patch Available  (was: Open)

 Deduplicate RS keys for ptf/windowing
 -

 Key: HIVE-9618
 URL: https://issues.apache.org/jira/browse/HIVE-9618
 Project: Hive
  Issue Type: Improvement
  Components: PTF-Windowing
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-9618.1.patch.txt


 Currently, a partition spec that uses the same column for partition-by and 
 order-by produces a duplicated key column in the RS. For example, 
 {noformat}
 explain
 select p_mfgr, p_name, p_size, 
 rank() over (partition by p_mfgr order by p_name) as r, 
 dense_rank() over (partition by p_mfgr order by p_name) as dr, 
 sum(p_retailprice) over (partition by p_mfgr order by p_name rows between 
 unbounded preceding and current row)  as s1
 from noop(on noopwithmap(on noop(on part 
 partition by p_mfgr 
 order by p_mfgr, p_name
 )))
 {noformat}
 partition by p_mfgr order by p_mfgr, p_name produces duplicated key columns, 
 as shown below
 {noformat}
 Reduce Output Operator
 key expressions: p_mfgr (type: string), p_mfgr (type: string), p_name 
 (type: string)
 sort order: +++
 Map-reduce partition columns: p_mfgr (type: string)
 value expressions: p_size (type: int), p_retailprice (type: double)
 {noformat}





[jira] [Updated] (HIVE-9618) Deduplicate RS keys for ptf/windowing

2015-02-09 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9618:

Attachment: HIVE-9618.1.patch.txt

 Deduplicate RS keys for ptf/windowing
 -

 Key: HIVE-9618
 URL: https://issues.apache.org/jira/browse/HIVE-9618
 Project: Hive
  Issue Type: Improvement
  Components: PTF-Windowing
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-9618.1.patch.txt


 Currently, a partition spec that uses the same column for partition-by and 
 order-by produces a duplicated key column in the RS. For example, 
 {noformat}
 explain
 select p_mfgr, p_name, p_size, 
 rank() over (partition by p_mfgr order by p_name) as r, 
 dense_rank() over (partition by p_mfgr order by p_name) as dr, 
 sum(p_retailprice) over (partition by p_mfgr order by p_name rows between 
 unbounded preceding and current row)  as s1
 from noop(on noopwithmap(on noop(on part 
 partition by p_mfgr 
 order by p_mfgr, p_name
 )))
 {noformat}
 partition by p_mfgr order by p_mfgr, p_name produces duplicated key columns, 
 as shown below
 {noformat}
 Reduce Output Operator
 key expressions: p_mfgr (type: string), p_mfgr (type: string), p_name 
 (type: string)
 sort order: +++
 Map-reduce partition columns: p_mfgr (type: string)
 value expressions: p_size (type: int), p_retailprice (type: double)
 {noformat}





[jira] [Updated] (HIVE-9507) Make LATERAL VIEW inline(expression) mytable tolerant to nulls

2015-02-09 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9507:

Attachment: HIVE-9507.3.patch.txt

 Make LATERAL VIEW inline(expression) mytable tolerant to nulls
 

 Key: HIVE-9507
 URL: https://issues.apache.org/jira/browse/HIVE-9507
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, UDF
Affects Versions: 0.14.0
 Environment: hdp 2.2
 Windows server 2012 R2 64-bit
Reporter: Moustafa Aboul Atta
Assignee: Navis
Priority: Minor
 Attachments: HIVE-9507.1.patch.txt, HIVE-9507.2.patch.txt, 
 HIVE-9507.3.patch.txt, parial_log.log


 I have tweets stored with avro on hdfs with the default twitter status 
 (tweet) schema.
 There's an object called entities that contains arrays of structs.
 When I run
  
 {{SELECT mytable.*}}
 {{FROM tweets}}
 {{LATERAL VIEW INLINE(entities.media) mytable}}
 I get the exception attached as partial_log.log, however, if I add
 {{WHERE entities.media IS NOT NULL}}
 it runs perfectly.





[jira] [Commented] (HIVE-9507) Make LATERAL VIEW inline(expression) mytable tolerant to nulls

2015-02-09 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313743#comment-14313743
 ] 

Navis commented on HIVE-9507:
-

Yes, just fixed NPE. 

 Make LATERAL VIEW inline(expression) mytable tolerant to nulls
 

 Key: HIVE-9507
 URL: https://issues.apache.org/jira/browse/HIVE-9507
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, UDF
Affects Versions: 0.14.0
 Environment: hdp 2.2
 Windows server 2012 R2 64-bit
Reporter: Moustafa Aboul Atta
Assignee: Navis
Priority: Minor
 Fix For: 1.2.0

 Attachments: HIVE-9507.1.patch.txt, HIVE-9507.2.patch.txt, 
 HIVE-9507.3.patch.txt, parial_log.log


 I have tweets stored with avro on hdfs with the default twitter status 
 (tweet) schema.
 There's an object called entities that contains arrays of structs.
 When I run
  
 {{SELECT mytable.*}}
 {{FROM tweets}}
 {{LATERAL VIEW INLINE(entities.media) mytable}}
 I get the exception attached as partial_log.log, however, if I add
 {{WHERE entities.media IS NOT NULL}}
 it runs perfectly.





[jira] [Updated] (HIVE-9507) Make LATERAL VIEW inline(expression) mytable tolerant to nulls

2015-02-09 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9507:

   Resolution: Fixed
Fix Version/s: 1.2.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Ashutosh.

 Make LATERAL VIEW inline(expression) mytable tolerant to nulls
 

 Key: HIVE-9507
 URL: https://issues.apache.org/jira/browse/HIVE-9507
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, UDF
Affects Versions: 0.14.0
 Environment: hdp 2.2
 Windows server 2012 R2 64-bit
Reporter: Moustafa Aboul Atta
Assignee: Navis
Priority: Minor
 Fix For: 1.2.0

 Attachments: HIVE-9507.1.patch.txt, HIVE-9507.2.patch.txt, 
 HIVE-9507.3.patch.txt, parial_log.log


 I have tweets stored with avro on hdfs with the default twitter status 
 (tweet) schema.
 There's an object called entities that contains arrays of structs.
 When I run
  
 {{SELECT mytable.*}}
 {{FROM tweets}}
 {{LATERAL VIEW INLINE(entities.media) mytable}}
 I get the exception attached as partial_log.log, however, if I add
 {{WHERE entities.media IS NOT NULL}}
 it runs perfectly.





[jira] [Updated] (HIVE-2573) Create per-session function registry

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-2573:

Attachment: HIVE-2573.14.patch.txt

Forgot this for a long time. Rebased to trunk.

 Create per-session function registry 
 -

 Key: HIVE-2573
 URL: https://issues.apache.org/jira/browse/HIVE-2573
 Project: Hive
  Issue Type: Improvement
  Components: Server Infrastructure
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2573.D3231.1.patch, 
 HIVE-2573.1.patch.txt, HIVE-2573.10.patch.txt, HIVE-2573.11.patch.txt, 
 HIVE-2573.12.patch.txt, HIVE-2573.13.patch.txt, HIVE-2573.14.patch.txt, 
 HIVE-2573.2.patch.txt, HIVE-2573.3.patch.txt, HIVE-2573.4.patch.txt, 
 HIVE-2573.5.patch, HIVE-2573.6.patch, HIVE-2573.7.patch, 
 HIVE-2573.8.patch.txt, HIVE-2573.9.patch.txt


 Currently the function registry is a shared resource and can be overridden by 
 other users when using HiveServer. A per-session function registry 
 would prevent this.
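One way to picture a per-session registry is a session-local map that is consulted first and falls back to a shared system map. The sketch below is an illustration only: functions are represented by plain strings, and SessionRegistry, registerSystem, registerTemporary and lookup are hypothetical names, not the API introduced by the attached patches.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration class; not part of Hive.
final class SessionRegistry {
    // Shared, system-wide functions visible to every session.
    private static final Map<String, String> SYSTEM = new ConcurrentHashMap<>();
    // Functions registered by this session only.
    private final Map<String, String> session = new ConcurrentHashMap<>();

    static void registerSystem(String name, String impl) {
        SYSTEM.put(name.toLowerCase(Locale.ROOT), impl);
    }

    void registerTemporary(String name, String impl) {
        session.put(name.toLowerCase(Locale.ROOT), impl);
    }

    // Session entries shadow system entries; other sessions are unaffected.
    String lookup(String name) {
        String key = name.toLowerCase(Locale.ROOT);
        String impl = session.get(key);
        return impl != null ? impl : SYSTEM.get(key);
    }

    public static void main(String[] args) {
        registerSystem("upper", "builtin.Upper");
        SessionRegistry first = new SessionRegistry();
        SessionRegistry second = new SessionRegistry();
        first.registerTemporary("upper", "custom.Upper");
        System.out.println(first.lookup("upper"));  // prints custom.Upper
        System.out.println(second.lookup("upper")); // prints builtin.Upper
    }
}
```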





[jira] [Updated] (HIVE-9513) NULL POINTER EXCEPTION

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9513:

Attachment: HIVE-9513.2.patch.txt

 NULL POINTER EXCEPTION
 --

 Key: HIVE-9513
 URL: https://issues.apache.org/jira/browse/HIVE-9513
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.13.1
Reporter: ErwanMAS
Assignee: Navis
 Attachments: HIVE-9513.1.patch.txt, HIVE-9513.2.patch.txt


 NPE during parsing of:
 {noformat}
 select * from (
  select * from ( select 1 as id , foo as str_1 from staging.dual ) f
   union   all
  select * from ( select 2 as id , bar as str_2 from staging.dual ) g
 ) e ;
 {noformat}





[jira] [Commented] (HIVE-9228) Problem with subquery using windowing functions

2015-02-08 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311850#comment-14311850
 ] 

Navis commented on HIVE-9228:
-

[~aihuaxu] Sorry for breaking in on this issue. I've been working on the code 
around CP for other issues and didn't want others to waste time understanding the 
complicated PTF operation. I think the fix is almost done. Sorry again.

 Problem with subquery using windowing functions
 ---

 Key: HIVE-9228
 URL: https://issues.apache.org/jira/browse/HIVE-9228
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Affects Versions: 0.13.1
Reporter: Aihua Xu
Assignee: Aihua Xu
 Attachments: HIVE-9228.1.patch.txt, HIVE-9228.2.patch.txt, 
 create_table_tab1.sql, tab1.csv

   Original Estimate: 96h
  Remaining Estimate: 96h

 The following query with window functions failed. The internal query works 
 fine.
 select col1, col2, col3 from (select col1,col2, col3, count(case when col4=1 
 then 1 end ) over (partition by col1, col2) as col5, row_number() over 
 (partition by col1, col2 order by col4) as col6 from tab1) t;
 HIVE generates an execution plan with 2 jobs. 
 1. The first job is to basically calculate window function for col5.  
 2. The second job is to calculate window function for col6 and output.
 The plan says the first job writes the columns (col1, col2, col3, col4) to a 
 temp file, since only these columns are used in the later stage. However, the PTF 
 operator for the first job outputs (_wcol0, col1, col2, col3, col4), with 
 _wcol0 as the result of the window function, even though it is not used. 
 In the second job, the map operator still reads the 4 columns (col1, col2, 
 col3, col4) from the temp file according to the plan. That mismatch causes the exception.





[jira] [Updated] (HIVE-9228) Problem with subquery using windowing functions

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9228:

Attachment: HIVE-9228.3.patch.txt

Updated gold file

 Problem with subquery using windowing functions
 ---

 Key: HIVE-9228
 URL: https://issues.apache.org/jira/browse/HIVE-9228
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Affects Versions: 0.13.1
Reporter: Aihua Xu
Assignee: Aihua Xu
 Attachments: HIVE-9228.1.patch.txt, HIVE-9228.2.patch.txt, 
 HIVE-9228.3.patch.txt, create_table_tab1.sql, tab1.csv

   Original Estimate: 96h
  Remaining Estimate: 96h

 The following query with window functions failed. The internal query works 
 fine.
 select col1, col2, col3 from (select col1,col2, col3, count(case when col4=1 
 then 1 end ) over (partition by col1, col2) as col5, row_number() over 
 (partition by col1, col2 order by col4) as col6 from tab1) t;
 HIVE generates an execution plan with 2 jobs. 
 1. The first job is to basically calculate window function for col5.  
 2. The second job is to calculate window function for col6 and output.
 The plan says the first job writes the columns (col1, col2, col3, col4) to a 
 temp file, since only these columns are used in the later stage. However, the PTF 
 operator for the first job outputs (_wcol0, col1, col2, col3, col4), with 
 _wcol0 as the result of the window function, even though it is not used. 
 In the second job, the map operator still reads the 4 columns (col1, col2, 
 col3, col4) from the temp file according to the plan. That mismatch causes the exception.





[jira] [Updated] (HIVE-2573) Create per-session function registry

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-2573:

Attachment: (was: HIVE-2573.14.patch.txt)

 Create per-session function registry 
 -

 Key: HIVE-2573
 URL: https://issues.apache.org/jira/browse/HIVE-2573
 Project: Hive
  Issue Type: Improvement
  Components: Server Infrastructure
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2573.D3231.1.patch, 
 HIVE-2573.1.patch.txt, HIVE-2573.10.patch.txt, HIVE-2573.11.patch.txt, 
 HIVE-2573.12.patch.txt, HIVE-2573.13.patch.txt, HIVE-2573.14.patch.txt, 
 HIVE-2573.2.patch.txt, HIVE-2573.3.patch.txt, HIVE-2573.4.patch.txt, 
 HIVE-2573.5.patch, HIVE-2573.6.patch, HIVE-2573.7.patch, 
 HIVE-2573.8.patch.txt, HIVE-2573.9.patch.txt


 Currently the function registry is a shared resource and can be overridden by 
 other users when using HiveServer. A per-session function registry 
 would prevent this.





[jira] [Updated] (HIVE-2573) Create per-session function registry

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-2573:

Attachment: HIVE-2573.14.patch.txt

 Create per-session function registry 
 -

 Key: HIVE-2573
 URL: https://issues.apache.org/jira/browse/HIVE-2573
 Project: Hive
  Issue Type: Improvement
  Components: Server Infrastructure
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2573.D3231.1.patch, 
 HIVE-2573.1.patch.txt, HIVE-2573.10.patch.txt, HIVE-2573.11.patch.txt, 
 HIVE-2573.12.patch.txt, HIVE-2573.13.patch.txt, HIVE-2573.14.patch.txt, 
 HIVE-2573.2.patch.txt, HIVE-2573.3.patch.txt, HIVE-2573.4.patch.txt, 
 HIVE-2573.5.patch, HIVE-2573.6.patch, HIVE-2573.7.patch, 
 HIVE-2573.8.patch.txt, HIVE-2573.9.patch.txt


 Currently the function registry is a shared resource and can be overridden by 
 other users when using HiveServer. A per-session function registry 
 would prevent this.





[jira] [Updated] (HIVE-3050) JDBC should provide metadata for columns whether a column is a partition column or not

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3050:

Attachment: HIVE-3050.1.patch.txt

 JDBC should provide metadata for columns whether a column is a partition 
 column or not
 --

 Key: HIVE-3050
 URL: https://issues.apache.org/jira/browse/HIVE-3050
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Affects Versions: 0.10.0
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-3050.1.patch.txt


 Trivial request from UI developers. 
 {code}
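 DatabaseMetaData databaseMetaData = connection.getMetaData();
 ResultSet rs = databaseMetaData.getColumns(null, null, tableName, null);
 while (rs.next()) {
     // proposed non-standard column: true when the column is a partition column
     boolean partitionKey = rs.getBoolean("IS_PARTITION_COLUMN");
 }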
 {code}
 It's not a standard JDBC column, but it seems useful.





[jira] [Created] (HIVE-9618) Deduplicate RS keys for ptf/windowing

2015-02-08 Thread Navis (JIRA)
Navis created HIVE-9618:
---

 Summary: Deduplicate RS keys for ptf/windowing
 Key: HIVE-9618
 URL: https://issues.apache.org/jira/browse/HIVE-9618
 Project: Hive
  Issue Type: Improvement
  Components: PTF-Windowing
Reporter: Navis
Assignee: Navis
Priority: Trivial


Currently, a partition spec that uses the same column for partition-by and order-by 
produces a duplicated key column in the RS. For example, 
{noformat}
explain
select p_mfgr, p_name, p_size, 
rank() over (partition by p_mfgr order by p_name) as r, 
dense_rank() over (partition by p_mfgr order by p_name) as dr, 
sum(p_retailprice) over (partition by p_mfgr order by p_name rows between 
unbounded preceding and current row)  as s1
from noop(on noopwithmap(on noop(on part 
partition by p_mfgr 
order by p_mfgr, p_name
)))
{noformat}

partition by p_mfgr order by p_mfgr, p_name produces duplicated key columns, as 
shown below
{noformat}
Reduce Output Operator
key expressions: p_mfgr (type: string), p_mfgr (type: string), p_name 
(type: string)
sort order: +++
Map-reduce partition columns: p_mfgr (type: string)
value expressions: p_size (type: int), p_retailprice (type: double)
{noformat}
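
A minimal sketch of the intended deduplication, written in plain Java rather than against Hive's planner classes: the RS key list is built from the partition-by columns followed by the order-by columns, keeping only the first occurrence of each. The class and method names are hypothetical, and the sketch assumes a repeated column has the same sort direction in both clauses.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

final class RsKeyDedup {
  /** Concatenates partition-by and order-by columns, keeping the first occurrence of each. */
  static List<String> dedupKeys(List<String> partitionBy, List<String> orderBy) {
    // LinkedHashSet drops repeats while preserving insertion order.
    LinkedHashSet<String> keys = new LinkedHashSet<>(partitionBy);
    keys.addAll(orderBy);
    return new ArrayList<>(keys);
  }
}
```

For the query above, dedupKeys([p_mfgr], [p_mfgr, p_name]) yields [p_mfgr, p_name], so the sort order would shrink from +++ to ++.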





[jira] [Created] (HIVE-9615) Provide limit context for storage handlers

2015-02-08 Thread Navis (JIRA)
Navis created HIVE-9615:
---

 Summary: Provide limit context for storage handlers
 Key: HIVE-9615
 URL: https://issues.apache.org/jira/browse/HIVE-9615
 Project: Hive
  Issue Type: Improvement
  Components: StorageHandler
Reporter: Navis
Assignee: Navis
Priority: Trivial


Propagate limit context generated from GlobalLimitOptimizer to strorage 
handlers.





[jira] [Updated] (HIVE-9615) Provide limit context for storage handlers

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9615:

Attachment: HIVE-9615.1.patch.txt

Old patch found in git stash

 Provide limit context for storage handlers
 --

 Key: HIVE-9615
 URL: https://issues.apache.org/jira/browse/HIVE-9615
 Project: Hive
  Issue Type: Improvement
  Components: StorageHandler
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-9615.1.patch.txt


 Propagate limit context generated from GlobalLimitOptimizer to strorage 
 handlers.





[jira] [Updated] (HIVE-9615) Provide limit context for storage handlers

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9615:

Status: Patch Available  (was: Open)

 Provide limit context for storage handlers
 --

 Key: HIVE-9615
 URL: https://issues.apache.org/jira/browse/HIVE-9615
 Project: Hive
  Issue Type: Improvement
  Components: StorageHandler
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-9615.1.patch.txt


 Propagate limit context generated from GlobalLimitOptimizer to strorage 
 handlers.





[jira] [Updated] (HIVE-9615) Provide limit context for storage handlers

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9615:

Description: Propagate limit context generated from GlobalLimitOptimizer to 
storage handlers.  (was: Propagate limit context generated from 
GlobalLimitOptimizer to strorage handlers.)

 Provide limit context for storage handlers
 --

 Key: HIVE-9615
 URL: https://issues.apache.org/jira/browse/HIVE-9615
 Project: Hive
  Issue Type: Improvement
  Components: StorageHandler
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-9615.1.patch.txt


 Propagate limit context generated from GlobalLimitOptimizer to storage 
 handlers.





[jira] [Updated] (HIVE-9507) Make LATERAL VIEW inline(expression) mytable tolerant to nulls

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9507:

Attachment: HIVE-9507.2.patch.txt

Reattaching for test

 Make LATERAL VIEW inline(expression) mytable tolerant to nulls
 

 Key: HIVE-9507
 URL: https://issues.apache.org/jira/browse/HIVE-9507
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, UDF
Affects Versions: 0.14.0
 Environment: hdp 2.2
 Windows server 2012 R2 64-bit
Reporter: Moustafa Aboul Atta
Assignee: Navis
Priority: Minor
 Attachments: HIVE-9507.1.patch.txt, HIVE-9507.2.patch.txt, 
 parial_log.log


 I have tweets stored with avro on hdfs with the default twitter status 
 (tweet) schema.
 There's an object called entities that contains arrays of structs.
 When I run
  
 {{SELECT mytable.*}}
 {{FROM tweets}}
 {{LATERAL VIEW INLINE(entities.media) mytable}}
 I get the exception attached as partial_log.log, however, if I add
 {{WHERE entities.media IS NOT NULL}}
 it runs perfectly.
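
One plausible reading of "tolerant to nulls" is that a null array should simply produce no rows, which is what the WHERE clause workaround achieves. The sketch below illustrates that behaviour in plain Java; the class and method names are hypothetical and this is not the UDTF code from the attached patch.

```java
import java.util.Collections;
import java.util.List;

final class NullTolerantInline {
  /** Returns the rows an inline-style expansion would emit; a null array yields none. */
  static <T> List<T> rowsOf(List<T> structs) {
    if (structs == null) {
      // Same effect as filtering with "IS NOT NULL" before the lateral view.
      return Collections.emptyList();
    }
    return structs;
  }
}
```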





[jira] [Updated] (HIVE-9499) hive.limit.query.max.table.partition makes queries fail on non-partitioned tables

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9499:

Attachment: HIVE-9499.3.patch.txt

Rebased to trunk

 hive.limit.query.max.table.partition makes queries fail on non-partitioned 
 tables
 -

 Key: HIVE-9499
 URL: https://issues.apache.org/jira/browse/HIVE-9499
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Alexander Kasper
Assignee: Navis
 Attachments: HIVE-9499.1.patch.txt, HIVE-9499.2.patch.txt, 
 HIVE-9499.3.patch.txt


 If you use hive.limit.query.max.table.partition to limit the number of 
 partitions that can be queried, queries on non-partitioned tables 
 fail.
 Example:
 {noformat}
 CREATE TABLE tmp(test INT);
 SELECT COUNT(*) FROM TMP; -- works fine
 SET hive.limit.query.max.table.partition=20;
 SELECT COUNT(*) FROM TMP; -- generates NPE (FAILED: NullPointerException null)
 SET hive.limit.query.max.table.partition=-1;
 SELECT COUNT(*) FROM TMP; -- works fine again
 {noformat}





[jira] [Commented] (HIVE-6099) Multi insert does not work properly with distinct count

2015-02-05 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308401#comment-14308401
 ] 

Navis commented on HIVE-6099:
-

+1

 Multi insert does not work properly with distinct count
 ---

 Key: HIVE-6099
 URL: https://issues.apache.org/jira/browse/HIVE-6099
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0, 0.14.0, 1.0.0
Reporter: Pavan Gadam Manohar
Assignee: Ashutosh Chauhan
  Labels: count, distinct, insert, multi-insert
 Attachments: HIVE-6099.1.patch, HIVE-6099.2.patch, HIVE-6099.3.patch, 
 HIVE-6099.4.patch, HIVE-6099.patch, explain_hive_0.10.0.txt, 
 with_disabled.txt, with_enabled.txt


 Need 2 rows to reproduce this Bug. Here are the steps.
 Step 1) Create a table Table_A
 CREATE EXTERNAL TABLE Table_A
 (
 user string
 , type int
 )
 PARTITIONED BY (dt string)
 ROW FORMAT DELIMITED 
 FIELDS TERMINATED BY '|' 
  STORED AS RCFILE
 LOCATION '/hive/path/Table_A';
 Step 2) Scenario: Let us say user tommy belongs to both user types 
 111 and 123. Insert 2 records into the table created above.
 select * from Table_A;
 hive> select * from table_a;
 OK
 tommy   123 2013-12-02
 tommy   111 2013-12-02
 Step 3) Create 2 destination tables to simulate multi-insert.
 CREATE EXTERNAL TABLE dest_Table_A
 (
 p_date string
 , Distinct_Users int
 , Type111Users int
 , Type123Users int
 )
 PARTITIONED BY (dt string)
 ROW FORMAT DELIMITED 
 FIELDS TERMINATED BY '|' 
  STORED AS RCFILE
 LOCATION '/hive/path/dest_Table_A';
  
 CREATE EXTERNAL TABLE dest_Table_B
 (
 p_date string
 , Distinct_Users int
 , Type111Users int
 , Type123Users int
 )
 PARTITIONED BY (dt string)
 ROW FORMAT DELIMITED 
 FIELDS TERMINATED BY '|' 
  STORED AS RCFILE
 LOCATION '/hive/path/dest_Table_B';
 Step 4) Multi insert statement
 from Table_A a
 INSERT OVERWRITE TABLE dest_Table_A PARTITION(dt='2013-12-02')
 select a.dt
 ,count(distinct a.user) as AllDist
 ,count(distinct case when a.type = 111 then a.user else null end) as 
 Type111User
 ,count(distinct case when a.type != 111 then a.user else null end) as 
 Type123User
 group by a.dt
  
 INSERT OVERWRITE TABLE dest_Table_B PARTITION(dt='2013-12-02')
 select a.dt
 ,count(distinct a.user) as AllDist
 ,count(distinct case when a.type = 111 then a.user else null end) as 
 Type111User
 ,count(distinct case when a.type != 111 then a.user else null end) as 
 Type123User
 group by a.dt
 ;
  
 Step 5) Verify results.
 hive> select * from dest_table_a;
 OK
 2013-12-02  2   1   1   2013-12-02
 Time taken: 0.116 seconds
 hive> select * from dest_table_b;
 OK
 2013-12-02  2   1   1   2013-12-02
 Time taken: 0.13 seconds
 Conclusion: Hive gives a count of 2 for distinct users although there is 
 only one distinct user. After trying many datasets, I observed that Hive 
 computes DistinctUsers as Type111Users + Type123Users, which is wrong.
 hive> select count(distinct a.user) from table_a a;
 Gives:
 Total MapReduce CPU Time Spent: 4 seconds 350 msec
 OK
 1
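
To make the expected numbers concrete, the three aggregates from step 4 can be computed by hand for the two rows from step 2. The sketch below is plain Java with hypothetical names, not Hive code; it mirrors count(distinct ...) by collecting the non-null users into sets.

```java
import java.util.HashSet;
import java.util.Set;

final class DistinctCountCheck {
  /** rows[i] = {user, type}; returns {AllDist, Type111User, Type123User}. */
  static int[] counts(String[][] rows) {
    Set<String> all = new HashSet<>();
    Set<String> type111 = new HashSet<>();
    Set<String> notType111 = new HashSet<>();
    for (String[] row : rows) {
      all.add(row[0]);
      // The CASE expressions yield NULL for non-matching rows, and
      // count(distinct ...) ignores NULLs, so each user lands in one set per row.
      if ("111".equals(row[1])) {
        type111.add(row[0]);
      } else {
        notType111.add(row[0]);
      }
    }
    return new int[] {all.size(), type111.size(), notType111.size()};
  }
}
```

For the rows (tommy, 123) and (tommy, 111) this gives 1, 1, 1, so the expected output row is "2013-12-02 1 1 1", not "2013-12-02 2 1 1".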





[jira] [Commented] (HIVE-9545) Build FAILURE with IBM JVM

2015-02-04 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14306641#comment-14306641
 ] 

Navis commented on HIVE-9545:
-

[~ashutoshc] Could you review this? It is a simple change from direct method 
invocation to reflection.
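
A hedged sketch of the general technique, not the contents of the attached patch: instead of importing a vendor-specific class at compile time, the class is looked up by name at run time, so the code still compiles on a JVM that lacks the package. The helper name is hypothetical, and which com.sun.management class Utils.java actually uses is an assumption made here for illustration only.

```java
import java.lang.reflect.Method;

final class ReflectiveLookup {
  /** Returns the named public method, or null when the class or method is unavailable on this JVM. */
  static Method findMethod(String className, String methodName, Class<?>... parameterTypes) {
    try {
      return Class.forName(className).getMethod(methodName, parameterTypes);
    } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
      // Missing on this JVM: the caller can skip the optional feature.
      return null;
    }
  }
}
```

A caller might use findMethod("com.sun.management.HotSpotDiagnosticMXBean", "dumpHeap", String.class, boolean.class) and disable the feature when the result is null.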

 Build FAILURE with IBM JVM 
 ---

 Key: HIVE-9545
 URL: https://issues.apache.org/jira/browse/HIVE-9545
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
 Environment:  mvn -version
 Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; 
 2014-08-11T22:58:10+02:00)
 Maven home: /opt/apache-maven-3.2.3
 Java version: 1.7.0, vendor: IBM Corporation
 Java home: /usr/lib/jvm/ibm-java-x86_64-71/jre
 Default locale: en_US, platform encoding: ISO-8859-1
 OS name: linux, version: 3.10.0-123.4.4.el7.x86_64, arch: amd64, 
 family: unix
Reporter: pascal oliva
Assignee: Navis
 Attachments: HIVE-9545.1.patch.txt


  NO PRECOMMIT TESTS 
 With the use of IBM JVM environment :
 [root@dorado-vm2 hive]# java -version
 java version "1.7.0"
 Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
 IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
 20141017_217728 (JIT enabled, AOT enabled).
 The build failed on
  [INFO] Hive Query Language  FAILURE [ 50.053 
 s]
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
 on project hive-exec: Compilation failure: Compilation failure:
 [ERROR] 
 /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
  package com.sun.management does not exist.
 HOWTO : 
 #git clone -b branch-0.14 https://github.com/apache/hive.git
 #cd hive
 #mvn  install -DskipTests -Phadoop-2





[jira] [Commented] (HIVE-6099) Multi insert does not work properly with distinct count

2015-02-03 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14304503#comment-14304503
 ] 

Navis commented on HIVE-6099:
-

[~ashutoshc] Good! I've left some comments on RB. I think we are purging the 
most complicated parts of GroupByOperator.

 Multi insert does not work properly with distinct count
 ---

 Key: HIVE-6099
 URL: https://issues.apache.org/jira/browse/HIVE-6099
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0, 0.14.0, 1.0.0
Reporter: Pavan Gadam Manohar
Assignee: Ashutosh Chauhan
  Labels: count, distinct, insert, multi-insert
 Attachments: HIVE-6099.1.patch, HIVE-6099.patch, 
 explain_hive_0.10.0.txt, with_disabled.txt, with_enabled.txt


 Need 2 rows to reproduce this Bug. Here are the steps.
 Step 1) Create a table Table_A
 CREATE EXTERNAL TABLE Table_A
 (
 user string
 , type int
 )
 PARTITIONED BY (dt string)
 ROW FORMAT DELIMITED 
 FIELDS TERMINATED BY '|' 
  STORED AS RCFILE
 LOCATION '/hive/path/Table_A';
 Step 2) Scenario: Let us say user tommy belongs to both user types 
 111 and 123. Insert 2 records into the table created above.
 select * from Table_A;
 hive> select * from table_a;
 OK
 tommy   123 2013-12-02
 tommy   111 2013-12-02
 Step 3) Create 2 destination tables to simulate multi-insert.
 CREATE EXTERNAL TABLE dest_Table_A
 (
 p_date string
 , Distinct_Users int
 , Type111Users int
 , Type123Users int
 )
 PARTITIONED BY (dt string)
 ROW FORMAT DELIMITED 
 FIELDS TERMINATED BY '|' 
  STORED AS RCFILE
 LOCATION '/hive/path/dest_Table_A';
  
 CREATE EXTERNAL TABLE dest_Table_B
 (
 p_date string
 , Distinct_Users int
 , Type111Users int
 , Type123Users int
 )
 PARTITIONED BY (dt string)
 ROW FORMAT DELIMITED 
 FIELDS TERMINATED BY '|' 
  STORED AS RCFILE
 LOCATION '/hive/path/dest_Table_B';
 Step 4) Multi insert statement
 from Table_A a
 INSERT OVERWRITE TABLE dest_Table_A PARTITION(dt='2013-12-02')
 select a.dt
 ,count(distinct a.user) as AllDist
 ,count(distinct case when a.type = 111 then a.user else null end) as 
 Type111User
 ,count(distinct case when a.type != 111 then a.user else null end) as 
 Type123User
 group by a.dt
  
 INSERT OVERWRITE TABLE dest_Table_B PARTITION(dt='2013-12-02')
 select a.dt
 ,count(distinct a.user) as AllDist
 ,count(distinct case when a.type = 111 then a.user else null end) as 
 Type111User
 ,count(distinct case when a.type != 111 then a.user else null end) as 
 Type123User
 group by a.dt
 ;
  
 Step 5) Verify results.
 hive> select * from dest_table_a;
 OK
 2013-12-02  2   1   1   2013-12-02
 Time taken: 0.116 seconds
 hive> select * from dest_table_b;
 OK
 2013-12-02  2   1   1   2013-12-02
 Time taken: 0.13 seconds
 Conclusion: Hive gives a count of 2 for distinct users although there is 
 only one distinct user. After trying many datasets, I observed that Hive 
 computes DistinctUsers as Type111Users + Type123Users, which is wrong.
 hive> select count(distinct a.user) from table_a a;
 Gives:
 Total MapReduce CPU Time Spent: 4 seconds 350 msec
 OK
 1





[jira] [Commented] (HIVE-6099) Multi insert does not work properly with distinct count

2015-02-03 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14304657#comment-14304657
 ] 

Navis commented on HIVE-6099:
-

It was introduced to generate distinct keys for this optimization and does not 
seem to be used by other code. The optimization seems to work with a single 
common distinct column, but I think its overhead outweighs the benefit (and it 
is hard to read). But let's see the test results.

 Multi insert does not work properly with distinct count
 ---

 Key: HIVE-6099
 URL: https://issues.apache.org/jira/browse/HIVE-6099
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0, 0.14.0, 1.0.0
Reporter: Pavan Gadam Manohar
Assignee: Ashutosh Chauhan
  Labels: count, distinct, insert, multi-insert
 Attachments: HIVE-6099.1.patch, HIVE-6099.2.patch, HIVE-6099.patch, 
 explain_hive_0.10.0.txt, with_disabled.txt, with_enabled.txt


 Need 2 rows to reproduce this Bug. Here are the steps.
 Step 1) Create a table Table_A
 CREATE EXTERNAL TABLE Table_A
 (
 user string
 , type int
 )
 PARTITIONED BY (dt string)
 ROW FORMAT DELIMITED 
 FIELDS TERMINATED BY '|' 
  STORED AS RCFILE
 LOCATION '/hive/path/Table_A';
 Step 2) Scenario: Let us say user tommy belongs to both user types 
 111 and 123. Insert 2 records into the table created above.
 select * from Table_A;
 hive> select * from table_a;
 OK
 tommy   123 2013-12-02
 tommy   111 2013-12-02
 Step 3) Create 2 destination tables to simulate multi-insert.
 CREATE EXTERNAL TABLE dest_Table_A
 (
 p_date string
 , Distinct_Users int
 , Type111Users int
 , Type123Users int
 )
 PARTITIONED BY (dt string)
 ROW FORMAT DELIMITED 
 FIELDS TERMINATED BY '|' 
  STORED AS RCFILE
 LOCATION '/hive/path/dest_Table_A';
  
 CREATE EXTERNAL TABLE dest_Table_B
 (
 p_date string
 , Distinct_Users int
 , Type111Users int
 , Type123Users int
 )
 PARTITIONED BY (dt string)
 ROW FORMAT DELIMITED 
 FIELDS TERMINATED BY '|' 
  STORED AS RCFILE
 LOCATION '/hive/path/dest_Table_B';
 Step 4) Multi insert statement
 from Table_A a
 INSERT OVERWRITE TABLE dest_Table_A PARTITION(dt='2013-12-02')
 select a.dt
 ,count(distinct a.user) as AllDist
 ,count(distinct case when a.type = 111 then a.user else null end) as 
 Type111User
 ,count(distinct case when a.type != 111 then a.user else null end) as 
 Type123User
 group by a.dt
  
 INSERT OVERWRITE TABLE dest_Table_B PARTITION(dt='2013-12-02')
 select a.dt
 ,count(distinct a.user) as AllDist
 ,count(distinct case when a.type = 111 then a.user else null end) as 
 Type111User
 ,count(distinct case when a.type != 111 then a.user else null end) as 
 Type123User
 group by a.dt
 ;
  
 Step 5) Verify results.
 hive> select * from dest_table_a;
 OK
 2013-12-02  2   1   1   2013-12-02
 Time taken: 0.116 seconds
 hive> select * from dest_table_b;
 OK
 2013-12-02  2   1   1   2013-12-02
 Time taken: 0.13 seconds
 Conclusion: Hive gives a count of 2 for distinct users although there is 
 only one distinct user. After trying many datasets, I observed that Hive 
 computes DistinctUsers as Type111Users + Type123Users, which is wrong.
 hive> select count(distinct a.user) from table_a a;
 Gives:
 Total MapReduce CPU Time Spent: 4 seconds 350 msec
 OK
 1





[jira] [Updated] (HIVE-9553) Fix log-line in Partition Pruner

2015-02-03 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9553:

   Resolution: Fixed
Fix Version/s: 1.2.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Mithun Radhakrishnan.

 Fix log-line in Partition Pruner
 

 Key: HIVE-9553
 URL: https://issues.apache.org/jira/browse/HIVE-9553
 Project: Hive
  Issue Type: Bug
  Components: Logging
Affects Versions: 0.14.0
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
Priority: Trivial
 Fix For: 1.2.0

 Attachments: HIVE-9553.1.patch


 Minor issue in logging the prune-expression in the PartitionPruner:
 {code:title=PartitionPruner.java|titleBGColor=#F7D6C1|bgColor=#CE}
 LOG.trace("prune Expression = " + prunerExpr == null ? "" : prunerExpr);
 {code}
 Given the operator precedence order, this should read:
 {code:title=PartitionPruner.java|titleBGColor=#F7D6C1|bgColor=#CE}
 LOG.trace("prune Expression = " + (prunerExpr == null ? "" : prunerExpr));
 {code}
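
The precedence issue can be checked in isolation. In Java, + binds tighter than ==, which in turn binds tighter than ?:, so the first form compares the whole concatenation with null (never true) and then logs only prunerExpr. The sketch below uses a hypothetical helper class and returns the message as a string instead of calling a logger.

```java
final class PruneLogMessage {
  /** Mirrors the original line: the concatenation, not prunerExpr, is compared to null. */
  static String unparenthesized(Object prunerExpr) {
    Object message = "prune Expression = " + prunerExpr == null ? "" : prunerExpr;
    return String.valueOf(message);
  }

  /** Mirrors the corrected line: the null check applies to prunerExpr alone. */
  static String parenthesized(Object prunerExpr) {
    return "prune Expression = " + (prunerExpr == null ? "" : prunerExpr);
  }
}
```

With prunerExpr set to the string "ds = 1", unparenthesized returns "ds = 1" (the prefix is lost), while parenthesized returns "prune Expression = ds = 1".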





[jira] [Commented] (HIVE-9566) HiveServer2 fails to start with NullPointerException

2015-02-03 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14304392#comment-14304392
 ] 

Navis commented on HIVE-9566:
-

+1

 HiveServer2 fails to start with NullPointerException
 

 Key: HIVE-9566
 URL: https://issues.apache.org/jira/browse/HIVE-9566
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.13.0, 0.14.0, 0.13.1
Reporter: Na Yang
Assignee: Na Yang
 Attachments: HIVE-9566.patch


 hiveserver2 uses embedded metastore with default hive-site.xml configuration. 
 I use hive --stop --service hiveserver2 command to stop the running 
 hiveserver2 process and then use hive --start --service hiveserver2 command 
 to start the hiveserver2 service. I see the following exception in the 
 hive.log file
 {noformat}
 java.lang.NullPointerException
 at 
 org.apache.hive.service.server.HiveServer2.stop(HiveServer2.java:104)
 at 
 org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:138)
 at 
 org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:171)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212) 
 {noformat}
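
The trace shows stop() being reached from startHiveServer2() and dereferencing something that is null. A common way to make such a stop path safe is to guard every field that start-up may not have initialized yet. The sketch below shows only that pattern; the class and its fields are hypothetical, and it is not the attached patch.

```java
final class GuardedServer {
  private Runnable cleanup;   // stays null if start-up fails early
  private boolean stopped;

  void start(boolean failEarly) {
    if (failEarly) {
      throw new IllegalStateException("start-up failed before initialization");
    }
    cleanup = () -> { };
  }

  void stop() {
    // Guard: stop() may run after a failed start(), when cleanup was never set.
    if (cleanup != null) {
      cleanup.run();
    }
    stopped = true;
  }

  boolean isStopped() {
    return stopped;
  }
}
```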
  





[jira] [Updated] (HIVE-9397) SELECT max(bar) FROM foo is broken after ANALYZE ... FOR COLUMNS

2015-02-03 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9397:

Attachment: HIVE-9397.3.patch.txt

Updated results & fixed one more problem (distinct_stats was falling back to the 
normal plan because of an exception while making the struct OI)

 SELECT max(bar) FROM foo is broken after ANALYZE ... FOR COLUMNS
 

 Key: HIVE-9397
 URL: https://issues.apache.org/jira/browse/HIVE-9397
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Affects Versions: 0.14.0, 0.15.0
Reporter: Damien Carol
Assignee: Navis
 Attachments: HIVE-9397.1.patch.txt, HIVE-9397.2.patch.txt, 
 HIVE-9397.3.patch.txt


 These queries produce an error :
 {code:sql}
 DROP TABLE IF EXISTS foo;
 CREATE TABLE foo (id int) STORED AS ORC;
 INSERT INTO TABLE foo VALUES (1);
 INSERT INTO TABLE foo VALUES (2);
 INSERT INTO TABLE foo VALUES (3);
 INSERT INTO TABLE foo VALUES (4);
 INSERT INTO TABLE foo VALUES (5);
 SELECT max(id) FROM foo;
 ANALYZE TABLE foo COMPUTE STATISTICS FOR COLUMNS id;
 SELECT max(id) FROM foo;
 {code}
 The last query throws {{org.apache.hive.service.cli.HiveSQLException}}
 {noformat}
 0: jdbc:hive2://nc-h04:1/casino SELECT max(id) FROM foo;
 +-+--+
 | _c0 |
 +-+--+
 org.apache.hive.service.cli.HiveSQLException: java.lang.ClassCastException
 0: jdbc:hive2://nc-h04:1/casino
 {noformat}





[jira] [Commented] (HIVE-6099) Multi insert does not work properly with distinct count

2015-02-02 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302613#comment-14302613
 ] 

Navis commented on HIVE-6099:
-

[~ashutoshc] Could we remove this optimization? I'm sure it has been invalid 
from the start.

 Multi insert does not work properly with distinct count
 ---

 Key: HIVE-6099
 URL: https://issues.apache.org/jira/browse/HIVE-6099
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0
Reporter: Pavan Gadam Manohar
Assignee: Navis
  Labels: count, distinct, insert, multi-insert
 Attachments: explain_hive_0.10.0.txt, with_disabled.txt, 
 with_enabled.txt


 Need 2 rows to reproduce this Bug. Here are the steps.
 Step 1) Create a table Table_A
 CREATE EXTERNAL TABLE Table_A
 (
 user string
 , type int
 )
 PARTITIONED BY (dt string)
 ROW FORMAT DELIMITED 
 FIELDS TERMINATED BY '|' 
  STORED AS RCFILE
 LOCATION '/hive/path/Table_A';
 Step 2) Scenario: Let us say user tommy belongs to both user types 
 111 and 123. Insert 2 records into the table created above.
 select * from Table_A;
 hive> select * from table_a;
 OK
 tommy   123 2013-12-02
 tommy   111 2013-12-02
 Step 3) Create 2 destination tables to simulate multi-insert.
 CREATE EXTERNAL TABLE dest_Table_A
 (
 p_date string
 , Distinct_Users int
 , Type111Users int
 , Type123Users int
 )
 PARTITIONED BY (dt string)
 ROW FORMAT DELIMITED 
 FIELDS TERMINATED BY '|' 
  STORED AS RCFILE
 LOCATION '/hive/path/dest_Table_A';
  
 CREATE EXTERNAL TABLE dest_Table_B
 (
 p_date string
 , Distinct_Users int
 , Type111Users int
 , Type123Users int
 )
 PARTITIONED BY (dt string)
 ROW FORMAT DELIMITED 
 FIELDS TERMINATED BY '|' 
  STORED AS RCFILE
 LOCATION '/hive/path/dest_Table_B';
 Step 4) Multi insert statement
 from Table_A a
 INSERT OVERWRITE TABLE dest_Table_A PARTITION(dt='2013-12-02')
 select a.dt
 ,count(distinct a.user) as AllDist
 ,count(distinct case when a.type = 111 then a.user else null end) as 
 Type111User
 ,count(distinct case when a.type != 111 then a.user else null end) as 
 Type123User
 group by a.dt
  
 INSERT OVERWRITE TABLE dest_Table_B PARTITION(dt='2013-12-02')
 select a.dt
 ,count(distinct a.user) as AllDist
 ,count(distinct case when a.type = 111 then a.user else null end) as 
 Type111User
 ,count(distinct case when a.type != 111 then a.user else null end) as 
 Type123User
 group by a.dt
 ;
  
 Step 5) Verify results.
 hive> select * from dest_table_a;
 OK
 2013-12-02  2   1   1   2013-12-02
 Time taken: 0.116 seconds
 hive> select * from dest_table_b;
 OK
 2013-12-02  2   1   1   2013-12-02
 Time taken: 0.13 seconds
 Conclusion: Hive gives a count of 2 for distinct users although there is 
 only one distinct user. After trying many datasets, I observed that Hive 
 computes DistinctUsers as Type111Users + Type123Users, which is wrong.
 hive> select count(distinct a.user) from table_a a;
 Gives:
 Total MapReduce CPU Time Spent: 4 seconds 350 msec
 OK
 1





[jira] [Commented] (HIVE-9528) SemanticException: Ambiguous column reference

2015-02-02 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302768#comment-14302768
 ] 

Navis commented on HIVE-9528:
-

No, it's HIVE-7733. I've almost forgotten the context, but it was probably 
about enforcing unique column names in the final stage of a subquery, which was 
checked when generating the select operator before it.

 SemanticException: Ambiguous column reference
 -

 Key: HIVE-9528
 URL: https://issues.apache.org/jira/browse/HIVE-9528
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Yongzhi Chen
Assignee: Navis

 When running the following query:
 {code}
 SELECT if( COUNT(*) = 0, 'true', 'false' ) as RESULT FROM ( select  *  from 
 sim a join sim2 b on a.simstr=b.simstr) app
 Error: Error while compiling statement: FAILED: SemanticException [Error 
 10007]: Ambiguous column reference simstr in app (state=42000,code=10007)
 {code}
 This query works fine in hive 0.10
 In the apache trunk, following workaround will work:
 {code}
 SELECT if(COUNT(*) = 0, 'true', 'false') as RESULT FROM (select a.* from sim 
 a join sim2 b on a.simstr=b.simstr) app;
 {code}





[jira] [Commented] (HIVE-9553) Fix log-line in Partition Pruner

2015-02-02 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302756#comment-14302756
 ] 

Navis commented on HIVE-9553:
-

+1

 Fix log-line in Partition Pruner
 

 Key: HIVE-9553
 URL: https://issues.apache.org/jira/browse/HIVE-9553
 Project: Hive
  Issue Type: Bug
  Components: Logging
Affects Versions: 0.14.0
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
Priority: Trivial
 Attachments: HIVE-9553.1.patch


 Minor issue in logging the prune-expression in the PartitionPruner:
 {code:title=PartitionPruner.java|titleBGColor=#F7D6C1|bgColor=#CE}
 LOG.trace("prune Expression = " + prunerExpr == null ? "" : prunerExpr);
 {code}
 Given the operator precedence order, this should read:
 {code:title=PartitionPruner.java|titleBGColor=#F7D6C1|bgColor=#CE}
 LOG.trace("prune Expression = " + (prunerExpr == null ? "" : prunerExpr));
 {code}





[jira] [Updated] (HIVE-9397) SELECT max(bar) FROM foo is broken after ANALYZE ... FOR COLUMNS

2015-02-02 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9397:

Attachment: HIVE-9397.2.patch.txt

Addressed comments & fixed double sub-type

 SELECT max(bar) FROM foo is broken after ANALYZE ... FOR COLUMNS
 

 Key: HIVE-9397
 URL: https://issues.apache.org/jira/browse/HIVE-9397
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Affects Versions: 0.14.0, 0.15.0
Reporter: Damien Carol
Assignee: Navis
 Attachments: HIVE-9397.1.patch.txt, HIVE-9397.2.patch.txt


 These queries produce an error :
 {code:sql}
 DROP TABLE IF EXISTS foo;
 CREATE TABLE foo (id int) STORED AS ORC;
 INSERT INTO TABLE foo VALUES (1);
 INSERT INTO TABLE foo VALUES (2);
 INSERT INTO TABLE foo VALUES (3);
 INSERT INTO TABLE foo VALUES (4);
 INSERT INTO TABLE foo VALUES (5);
 SELECT max(id) FROM foo;
 ANALYZE TABLE foo COMPUTE STATISTICS FOR COLUMNS id;
 SELECT max(id) FROM foo;
 {code}
 The last query throws {{org.apache.hive.service.cli.HiveSQLException}}
 {noformat}
 0: jdbc:hive2://nc-h04:1/casino SELECT max(id) FROM foo;
 +-+--+
 | _c0 |
 +-+--+
 org.apache.hive.service.cli.HiveSQLException: java.lang.ClassCastException
 0: jdbc:hive2://nc-h04:1/casino
 {noformat}





[jira] [Resolved] (HIVE-9528) SemanticException: Ambiguous column reference

2015-02-02 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis resolved HIVE-9528.
-
Resolution: Not a Problem

 SemanticException: Ambiguous column reference
 -

 Key: HIVE-9528
 URL: https://issues.apache.org/jira/browse/HIVE-9528
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Yongzhi Chen
Assignee: Navis

 When running the following query:
 {code}
 SELECT if( COUNT(*) = 0, 'true', 'false' ) as RESULT FROM ( select  *  from 
 sim a join sim2 b on a.simstr=b.simstr) app
 Error: Error while compiling statement: FAILED: SemanticException [Error 
 10007]: Ambiguous column reference simstr in app (state=42000,code=10007)
 {code}
 This query works fine in hive 0.10
 In the apache trunk, following workaround will work:
 {code}
 SELECT if(COUNT(*) = 0, 'true', 'false') as RESULT FROM (select a.* from sim 
 a join sim2 b on a.simstr=b.simstr) app;
 {code}





[jira] [Commented] (HIVE-9397) SELECT max(bar) FROM foo is broken after ANALYZE ... FOR COLUMNS

2015-02-02 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302848#comment-14302848
 ] 

Navis commented on HIVE-9397:
-

Now OIs are acquired directly from the row schema of the final GBY operator. 
I've also fixed the double-to-float type casting, so results are identical with 
and without the stats optimization. 
It would be possible to extend StatsOptimizer to accept queries like select 
min(x)+max(x) from tbl, but that seems better done in a follow-up issue. 

 SELECT max(bar) FROM foo is broken after ANALYZE ... FOR COLUMNS
 

 Key: HIVE-9397
 URL: https://issues.apache.org/jira/browse/HIVE-9397
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Affects Versions: 0.14.0, 0.15.0
Reporter: Damien Carol
Assignee: Navis
 Attachments: HIVE-9397.1.patch.txt, HIVE-9397.2.patch.txt


 These queries produce an error :
 {code:sql}
 DROP TABLE IF EXISTS foo;
 CREATE TABLE foo (id int) STORED AS ORC;
 INSERT INTO TABLE foo VALUES (1);
 INSERT INTO TABLE foo VALUES (2);
 INSERT INTO TABLE foo VALUES (3);
 INSERT INTO TABLE foo VALUES (4);
 INSERT INTO TABLE foo VALUES (5);
 SELECT max(id) FROM foo;
 ANALYZE TABLE foo COMPUTE STATISTICS FOR COLUMNS id;
 SELECT max(id) FROM foo;
 {code}
 The last query throws {{org.apache.hive.service.cli.HiveSQLException}}
 {noformat}
 0: jdbc:hive2://nc-h04:1/casino SELECT max(id) FROM foo;
 +-+--+
 | _c0 |
 +-+--+
 org.apache.hive.service.cli.HiveSQLException: java.lang.ClassCastException
 0: jdbc:hive2://nc-h04:1/casino
 {noformat}





[jira] [Updated] (HIVE-9545) Build FAILURE with IBM JVM

2015-02-02 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9545:

Attachment: HIVE-9495.1.patch.txt

 Build FAILURE with IBM JVM 
 ---

 Key: HIVE-9545
 URL: https://issues.apache.org/jira/browse/HIVE-9545
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
 Environment:  mvn -version
 Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; 
 2014-08-11T22:58:10+02:00)
 Maven home: /opt/apache-maven-3.2.3
 Java version: 1.7.0, vendor: IBM Corporation
 Java home: /usr/lib/jvm/ibm-java-x86_64-71/jre
 Default locale: en_US, platform encoding: ISO-8859-1
 OS name: linux, version: 3.10.0-123.4.4.el7.x86_64, arch: amd64, 
 family: unix
Reporter: pascal oliva
 Attachments: HIVE-9495.1.patch.txt


 With the use of IBM JVM environment :
 [root@dorado-vm2 hive]# java -version
 java version 1.7.0
 Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
 IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
 20141017_217728 (JIT enabled, AOT enabled).
 The build failed on
  [INFO] Hive Query Language  FAILURE [ 50.053 
 s]
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
 on project hive-exec: Compilation failure: Compilation failure:
 [ERROR] 
 /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
  package com.sun.management does not exist.
 HOWTO : 
 #git clone -b branch-0.14 https://github.com/apache/hive.git
 #cd hive
 #mvn  install -DskipTests -Phadoop-2
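
The compile error comes from a source-level reference to the 
com.sun.management package, which this IBM JDK does not provide at compile 
time. One possible way around it, sketched below, is to look the class up 
reflectively so that nothing in the source names the package directly. This is 
only an illustration and not necessarily what the attached patch does; the 
class name PortableHeapDump and its methods are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.reflect.Method;

final class PortableHeapDump {
    private static final String BEAN_CLASS = "com.sun.management.HotSpotDiagnosticMXBean";
    private static final String BEAN_NAME = "com.sun.management:type=HotSpotDiagnostic";

    /** Returns true when the named class can be loaded on the running JVM. */
    static boolean classAvailable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /**
     * Tries to write a heap dump through the HotSpot diagnostic bean.
     * Returns false instead of failing when the bean is missing (for example
     * on a JVM without com.sun.management) or when the dump cannot be written.
     */
    static boolean dumpHeap(String fileName, boolean liveObjectsOnly) {
        try {
            Class<?> beanClass = Class.forName(BEAN_CLASS);
            Object bean = ManagementFactory.newPlatformMXBeanProxy(
                    ManagementFactory.getPlatformMBeanServer(), BEAN_NAME, beanClass);
            Method dump = beanClass.getMethod("dumpHeap", String.class, boolean.class);
            dump.invoke(bean, fileName, liveObjectsOnly);
            return true;
        } catch (Exception | LinkageError e) {
            return false;
        }
    }
}
```

Because the package name appears only inside string constants, this compiles 
on any JDK; whether a heap dump is actually produced is decided at run time.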





[jira] [Updated] (HIVE-9545) Build FAILURE with IBM JVM

2015-02-02 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9545:

Assignee: Navis
  Status: Patch Available  (was: Open)

 Build FAILURE with IBM JVM 
 ---

 Key: HIVE-9545
 URL: https://issues.apache.org/jira/browse/HIVE-9545
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
 Environment:  mvn -version
 Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; 
 2014-08-11T22:58:10+02:00)
 Maven home: /opt/apache-maven-3.2.3
 Java version: 1.7.0, vendor: IBM Corporation
 Java home: /usr/lib/jvm/ibm-java-x86_64-71/jre
 Default locale: en_US, platform encoding: ISO-8859-1
 OS name: linux, version: 3.10.0-123.4.4.el7.x86_64, arch: amd64, 
 family: unix
Reporter: pascal oliva
Assignee: Navis
 Attachments: HIVE-9495.1.patch.txt


 With the use of IBM JVM environment :
 [root@dorado-vm2 hive]# java -version
 java version 1.7.0
 Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
 IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
 20141017_217728 (JIT enabled, AOT enabled).
 The build failed on
  [INFO] Hive Query Language  FAILURE [ 50.053 
 s]
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
 on project hive-exec: Compilation failure: Compilation failure:
 [ERROR] 
 /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
  package com.sun.management does not exist.
 HOWTO : 
 #git clone -b branch-0.14 https://github.com/apache/hive.git
 #cd hive
 #mvn  install -DskipTests -Phadoop-2





[jira] [Updated] (HIVE-9545) Build FAILURE with IBM JVM

2015-02-02 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9545:

Attachment: (was: HIVE-9495.1.patch.txt)

 Build FAILURE with IBM JVM 
 ---

 Key: HIVE-9545
 URL: https://issues.apache.org/jira/browse/HIVE-9545
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
 Environment:  mvn -version
 Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; 
 2014-08-11T22:58:10+02:00)
 Maven home: /opt/apache-maven-3.2.3
 Java version: 1.7.0, vendor: IBM Corporation
 Java home: /usr/lib/jvm/ibm-java-x86_64-71/jre
 Default locale: en_US, platform encoding: ISO-8859-1
 OS name: linux, version: 3.10.0-123.4.4.el7.x86_64, arch: amd64, 
 family: unix
Reporter: pascal oliva
Assignee: Navis
 Attachments: HIVE-9545.1.patch.txt


 With the use of IBM JVM environment :
 [root@dorado-vm2 hive]# java -version
 java version 1.7.0
 Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
 IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
 20141017_217728 (JIT enabled, AOT enabled).
 The build failed on
  [INFO] Hive Query Language  FAILURE [ 50.053 
 s]
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
 on project hive-exec: Compilation failure: Compilation failure:
 [ERROR] 
 /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
  package com.sun.management does not exist.
 HOWTO : 
 #git clone -b branch-0.14 https://github.com/apache/hive.git
 #cd hive
 #mvn  install -DskipTests -Phadoop-2





[jira] [Updated] (HIVE-9545) Build FAILURE with IBM JVM

2015-02-02 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9545:

Description: 
 NO PRECOMMIT TESTS 

With the use of IBM JVM environment :
[root@dorado-vm2 hive]# java -version
java version 1.7.0
Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
20141017_217728 (JIT enabled, AOT enabled).

The build failed on
 [INFO] Hive Query Language  FAILURE [ 50.053 s]
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on 
project hive-exec: Compilation failure: Compilation failure:
[ERROR] 
/home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
 package com.sun.management does not exist.

HOWTO : 
#git clone -b branch-0.14 https://github.com/apache/hive.git
#cd hive
#mvn  install -DskipTests -Phadoop-2

  was:

With the use of IBM JVM environment :
[root@dorado-vm2 hive]# java -version
java version 1.7.0
Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
20141017_217728 (JIT enabled, AOT enabled).

The build failed on
 [INFO] Hive Query Language  FAILURE [ 50.053 s]
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on 
project hive-exec: Compilation failure: Compilation failure:
[ERROR] 
/home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
 package com.sun.management does not exist.

HOWTO : 
#git clone -b branch-0.14 https://github.com/apache/hive.git
#cd hive
#mvn  install -DskipTests -Phadoop-2

 Build FAILURE with IBM JVM 
 ---

 Key: HIVE-9545
 URL: https://issues.apache.org/jira/browse/HIVE-9545
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
 Environment:  mvn -version
 Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; 
 2014-08-11T22:58:10+02:00)
 Maven home: /opt/apache-maven-3.2.3
 Java version: 1.7.0, vendor: IBM Corporation
 Java home: /usr/lib/jvm/ibm-java-x86_64-71/jre
 Default locale: en_US, platform encoding: ISO-8859-1
 OS name: linux, version: 3.10.0-123.4.4.el7.x86_64, arch: amd64, 
 family: unix
Reporter: pascal oliva
Assignee: Navis
 Attachments: HIVE-9545.1.patch.txt


  NO PRECOMMIT TESTS 
 With the use of IBM JVM environment :
 [root@dorado-vm2 hive]# java -version
 java version 1.7.0
 Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
 IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
 20141017_217728 (JIT enabled, AOT enabled).
 The build failed on
  [INFO] Hive Query Language  FAILURE [ 50.053 
 s]
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
 on project hive-exec: Compilation failure: Compilation failure:
 [ERROR] 
 /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
  package com.sun.management does not exist.
 HOWTO : 
 #git clone -b branch-0.14 https://github.com/apache/hive.git
 #cd hive
 #mvn  install -DskipTests -Phadoop-2





[jira] [Updated] (HIVE-9545) Build FAILURE with IBM JVM

2015-02-02 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9545:

Description: 

With the use of IBM JVM environment :
[root@dorado-vm2 hive]# java -version
java version 1.7.0
Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
20141017_217728 (JIT enabled, AOT enabled).

The build failed on
 [INFO] Hive Query Language  FAILURE [ 50.053 s]
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on 
project hive-exec: Compilation failure: Compilation failure:
[ERROR] 
/home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
 package com.sun.management does not exist.

HOWTO : 
#git clone -b branch-0.14 https://github.com/apache/hive.git
#cd hive
#mvn  install -DskipTests -Phadoop-2

  was:
With the use of IBM JVM environment :
[root@dorado-vm2 hive]# java -version
java version 1.7.0
Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
20141017_217728 (JIT enabled, AOT enabled).

The build failed on
 [INFO] Hive Query Language  FAILURE [ 50.053 s]
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on 
project hive-exec: Compilation failure: Compilation failure:
[ERROR] 
/home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
 package com.sun.management does not exist.

HOWTO : 
#git clone -b branch-0.14 https://github.com/apache/hive.git
#cd hive
#mvn  install -DskipTests -Phadoop-2

 Build FAILURE with IBM JVM 
 ---

 Key: HIVE-9545
 URL: https://issues.apache.org/jira/browse/HIVE-9545
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
 Environment:  mvn -version
 Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; 
 2014-08-11T22:58:10+02:00)
 Maven home: /opt/apache-maven-3.2.3
 Java version: 1.7.0, vendor: IBM Corporation
 Java home: /usr/lib/jvm/ibm-java-x86_64-71/jre
 Default locale: en_US, platform encoding: ISO-8859-1
 OS name: linux, version: 3.10.0-123.4.4.el7.x86_64, arch: amd64, 
 family: unix
Reporter: pascal oliva
Assignee: Navis
 Attachments: HIVE-9545.1.patch.txt


 With the use of IBM JVM environment :
 [root@dorado-vm2 hive]# java -version
 java version 1.7.0
 Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
 IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
 20141017_217728 (JIT enabled, AOT enabled).
 The build failed on
  [INFO] Hive Query Language  FAILURE [ 50.053 
 s]
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
 on project hive-exec: Compilation failure: Compilation failure:
 [ERROR] 
 /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
  package com.sun.management does not exist.
 HOWTO : 
 #git clone -b branch-0.14 https://github.com/apache/hive.git
 #cd hive
 #mvn  install -DskipTests -Phadoop-2





[jira] [Updated] (HIVE-9545) Build FAILURE with IBM JVM

2015-02-02 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9545:

Attachment: HIVE-9545.1.patch.txt

 Build FAILURE with IBM JVM 
 ---

 Key: HIVE-9545
 URL: https://issues.apache.org/jira/browse/HIVE-9545
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
 Environment:  mvn -version
 Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; 
 2014-08-11T22:58:10+02:00)
 Maven home: /opt/apache-maven-3.2.3
 Java version: 1.7.0, vendor: IBM Corporation
 Java home: /usr/lib/jvm/ibm-java-x86_64-71/jre
 Default locale: en_US, platform encoding: ISO-8859-1
 OS name: linux, version: 3.10.0-123.4.4.el7.x86_64, arch: amd64, 
 family: unix
Reporter: pascal oliva
Assignee: Navis
 Attachments: HIVE-9545.1.patch.txt


 With the use of IBM JVM environment :
 [root@dorado-vm2 hive]# java -version
 java version 1.7.0
 Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
 IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
 20141017_217728 (JIT enabled, AOT enabled).
 The build failed on
  [INFO] Hive Query Language  FAILURE [ 50.053 
 s]
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
 on project hive-exec: Compilation failure: Compilation failure:
 [ERROR] 
 /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
  package com.sun.management does not exist.
 HOWTO : 
 #git clone -b branch-0.14 https://github.com/apache/hive.git
 #cd hive
 #mvn  install -DskipTests -Phadoop-2





[jira] [Commented] (HIVE-9528) SemanticException: Ambiguous column reference

2015-02-01 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14300800#comment-14300800
 ] 

Navis commented on HIVE-9528:
-

[~ychena], before HIVE-7733, column information was overwritten by the last 
column with the same name, which could produce invalid results. In any case, 
the query you mentioned does not work in mysql either (it works in psql, 
though). Can we resolve this as not a problem?
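
The difference between the two behaviors can be sketched as follows. This is 
a simplified model, not Hive's resolver code: each column is a 
{table alias, column name} pair, and the class and method names are 
hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ColumnResolutionDemo {
    /** Overwriting behavior: a later column silently replaces an earlier one with the same name. */
    static Map<String, String> resolveOverwriting(List<String[]> columns) {
        Map<String, String> byName = new LinkedHashMap<>();
        for (String[] col : columns) {
            byName.put(col[1], col[0] + "." + col[1]);
        }
        return byName;
    }

    /** Strict behavior: a duplicate name is reported instead of being overwritten. */
    static Map<String, String> resolveStrict(List<String[]> columns) {
        Map<String, String> byName = new LinkedHashMap<>();
        for (String[] col : columns) {
            if (byName.containsKey(col[1])) {
                throw new IllegalStateException("Ambiguous column reference " + col[1]);
            }
            byName.put(col[1], col[0] + "." + col[1]);
        }
        return byName;
    }
}
```

With the columns a.simstr and b.simstr from the reported query, the 
overwriting variant keeps only b.simstr and hides a.simstr, while the strict 
variant rejects the subquery, which corresponds to the "Ambiguous column 
reference" error.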

 SemanticException: Ambiguous column reference
 -

 Key: HIVE-9528
 URL: https://issues.apache.org/jira/browse/HIVE-9528
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Yongzhi Chen

 When running the following query:
 {code}
 SELECT if( COUNT(*) = 0, 'true', 'false' ) as RESULT FROM ( select  *  from 
 sim a join sim2 b on a.simstr=b.simstr) app
 Error: Error while compiling statement: FAILED: SemanticException [Error 
 10007]: Ambiguous column reference simstr in app (state=42000,code=10007)
 {code}
 This query works fine in hive 0.10
 In the apache trunk, following workaround will work:
 {code}
 SELECT if(COUNT(*) = 0, 'true', 'false') as RESULT FROM (select a.* from sim 
 a join sim2 b on a.simstr=b.simstr) app;
 {code}





[jira] [Assigned] (HIVE-9528) SemanticException: Ambiguous column reference

2015-02-01 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis reassigned HIVE-9528:
---

Assignee: Navis

 SemanticException: Ambiguous column reference
 -

 Key: HIVE-9528
 URL: https://issues.apache.org/jira/browse/HIVE-9528
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Yongzhi Chen
Assignee: Navis

 When running the following query:
 {code}
 SELECT if( COUNT(*) = 0, 'true', 'false' ) as RESULT FROM ( select  *  from 
 sim a join sim2 b on a.simstr=b.simstr) app
 Error: Error while compiling statement: FAILED: SemanticException [Error 
 10007]: Ambiguous column reference simstr in app (state=42000,code=10007)
 {code}
 This query works fine in hive 0.10
 In the apache trunk, following workaround will work:
 {code}
 SELECT if(COUNT(*) = 0, 'true', 'false') as RESULT FROM (select a.* from sim 
 a join sim2 b on a.simstr=b.simstr) app;
 {code}





[jira] [Commented] (HIVE-9416) Get rid of Extract Operator

2015-02-01 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14300801#comment-14300801
 ] 

Navis commented on HIVE-9416:
-

+1

 Get rid of Extract Operator
 ---

 Key: HIVE-9416
 URL: https://issues.apache.org/jira/browse/HIVE-9416
 Project: Hive
  Issue Type: Task
  Components: Query Processor
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-9416.1.patch, HIVE-9416.2.patch, HIVE-9416.3.patch, 
 HIVE-9416.4.patch, HIVE-9416.5.patch, HIVE-9416.6.patch, HIVE-9416.7.patch, 
 HIVE-9416.patch


 {{Extract Operator}} has been there for legacy reasons. But there is no 
 functionality it provides which can't be provided by {{Select Operator}}. 
 Instead of having two operators, one being a subset of the other, we should 
 just get rid of {{Extract}} and simplify our codebase.





[jira] [Updated] (HIVE-9507) Make LATERAL VIEW inline(expression) mytable tolerant to nulls

2015-01-29 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9507:

Assignee: Navis
  Status: Patch Available  (was: Open)

 Make LATERAL VIEW inline(expression) mytable tolerant to nulls
 

 Key: HIVE-9507
 URL: https://issues.apache.org/jira/browse/HIVE-9507
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, UDF
Affects Versions: 0.14.0
 Environment: hdp 2.2
 Windows server 2012 R2 64-bit
Reporter: Moustafa Aboul Atta
Assignee: Navis
 Attachments: HIVE-9507.1.patch.txt


 I have tweets stored with avro on hdfs with the default twitter status 
 (tweet) schema.
 There's an object called entities that contains arrays of structs.
 When I run
  
 {{SELECT mytable.*}}
 {{FROM tweets}}
 {{LATERAL VIEW INLINE(entities.media) mytable}}
 I get the exception found hereunder, however if I add
 {{WHERE entities.media IS NOT NULL}}
 it runs perfectly.
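
 The requested null tolerance can be sketched as below, under the assumption 
 that a null array should simply produce no rows (the same outcome as the 
 IS NOT NULL filter). This is a plain-Java model, not the Hive UDTF; the class 
 and method names are hypothetical, and skipping null struct elements is an 
 additional assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class NullTolerantInline {
    /** Expands an array of structs into rows; a null array yields no rows instead of failing. */
    static List<Object[]> inline(List<Object[]> structs) {
        if (structs == null) {
            return Collections.emptyList();
        }
        List<Object[]> rows = new ArrayList<>();
        for (Object[] struct : structs) {
            if (struct != null) {  // assumption: null elements are skipped as well
                rows.add(struct);
            }
        }
        return rows;
    }
}
```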
 Here's the partial log:
 2015-01-29 10:15:00,879 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Status: Running (Executing on YARN cluster with App id 
 application_1422267635031_0618)
 2015-01-29 10:15:00,879 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: -/-
 2015-01-29 10:15:02,526 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0/13   
 2015-01-29 10:15:05,551 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0/13   
 2015-01-29 10:15:08,722 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0/13   
 2015-01-29 10:15:12,095 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0/13   
 2015-01-29 10:15:12,354 INFO  log.PerfLogger 
 (PerfLogger.java:PerfLogBegin(108)) - PERFLOG method=TezRunVertex.Map 1 
 from=org.apache.hadoop.hive.ql.exec.tez.TezJobMonitor
 2015-01-29 10:15:12,354 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+5)/13   
 2015-01-29 10:15:12,557 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6)/13   
 2015-01-29 10:15:15,691 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6)/13   
 2015-01-29 10:15:18,892 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-1)/13
 2015-01-29 10:15:19,094 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-3)/13
 2015-01-29 10:15:19,304 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-5)/13
 2015-01-29 10:15:19,507 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-6)/13
 2015-01-29 10:15:22,641 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-6)/13
 2015-01-29 10:15:24,704 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-8)/13
 2015-01-29 10:15:27,735 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-8)/13
 2015-01-29 10:15:30,957 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-8)/13
 2015-01-29 10:15:34,095 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-8)/13
 2015-01-29 10:15:35,138 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-9)/13
 2015-01-29 10:15:36,503 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-10)/13   
 2015-01-29 10:15:36,710 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-11)/13   
 2015-01-29 10:15:37,971 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-12)/13   
 2015-01-29 10:15:39,800 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-13)/13   
 2015-01-29 10:15:41,175 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-14)/13   
 2015-01-29 10:15:44,414 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-14)/13   
 2015-01-29 10:15:45,447 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-15)/13   
 2015-01-29 10:15:47,413 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-16)/13   
 2015-01-29 10:15:47,618 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-17)/13   
 2015-01-29 10:15:49,568 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-18)/13   
 2015-01-29 10:15:51,099 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+0,-19)/13   
 2015-01-29 10:15:51,331 ERROR SessionState 
 (SessionState.java:printError(833)) - Status: Failed
 2015-01-29 10:15:51,417 ERROR SessionState 
 (SessionState.java:printError(833)) - Vertex failed, vertexName=Map 1, 
 vertexId=vertex_1422267635031_0618_1_00, diagnostics=[Task failed, 
 taskId=task_1422267635031_0618_1_00_00, diagnostics=[TaskAttempt 0 
 failed, info=[Error: Failure while running task:java.lang.RuntimeException: 
 

[jira] [Updated] (HIVE-9507) Make LATERAL VIEW inline(expression) mytable tolerant to nulls

2015-01-29 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9507:

Attachment: HIVE-9507.1.patch.txt

 Make LATERAL VIEW inline(expression) mytable tolerant to nulls
 

 Key: HIVE-9507
 URL: https://issues.apache.org/jira/browse/HIVE-9507
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, UDF
Affects Versions: 0.14.0
 Environment: hdp 2.2
 Windows server 2012 R2 64-bit
Reporter: Moustafa Aboul Atta
 Attachments: HIVE-9507.1.patch.txt


 I have tweets stored with avro on hdfs with the default twitter status 
 (tweet) schema.
 There's an object called entities that contains arrays of structs.
 When I run
  
 {{SELECT mytable.*}}
 {{FROM tweets}}
 {{LATERAL VIEW INLINE(entities.media) mytable}}
 I get the exception found hereunder, however if I add
 {{WHERE entities.media IS NOT NULL}}
 it runs perfectly.
 Here's the partial log:
 2015-01-29 10:15:00,879 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Status: Running (Executing on YARN cluster with App id 
 application_1422267635031_0618)
 2015-01-29 10:15:00,879 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: -/-
 2015-01-29 10:15:02,526 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0/13   
 2015-01-29 10:15:05,551 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0/13   
 2015-01-29 10:15:08,722 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0/13   
 2015-01-29 10:15:12,095 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0/13   
 2015-01-29 10:15:12,354 INFO  log.PerfLogger 
 (PerfLogger.java:PerfLogBegin(108)) - PERFLOG method=TezRunVertex.Map 1 
 from=org.apache.hadoop.hive.ql.exec.tez.TezJobMonitor
 2015-01-29 10:15:12,354 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+5)/13   
 2015-01-29 10:15:12,557 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6)/13   
 2015-01-29 10:15:15,691 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6)/13   
 2015-01-29 10:15:18,892 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-1)/13
 2015-01-29 10:15:19,094 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-3)/13
 2015-01-29 10:15:19,304 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-5)/13
 2015-01-29 10:15:19,507 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-6)/13
 2015-01-29 10:15:22,641 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-6)/13
 2015-01-29 10:15:24,704 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-8)/13
 2015-01-29 10:15:27,735 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-8)/13
 2015-01-29 10:15:30,957 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-8)/13
 2015-01-29 10:15:34,095 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-8)/13
 2015-01-29 10:15:35,138 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-9)/13
 2015-01-29 10:15:36,503 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-10)/13   
 2015-01-29 10:15:36,710 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-11)/13   
 2015-01-29 10:15:37,971 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-12)/13   
 2015-01-29 10:15:39,800 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-13)/13   
 2015-01-29 10:15:41,175 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-14)/13   
 2015-01-29 10:15:44,414 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-14)/13   
 2015-01-29 10:15:45,447 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-15)/13   
 2015-01-29 10:15:47,413 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-16)/13   
 2015-01-29 10:15:47,618 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-17)/13   
 2015-01-29 10:15:49,568 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-18)/13   
 2015-01-29 10:15:51,099 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+0,-19)/13   
 2015-01-29 10:15:51,331 ERROR SessionState 
 (SessionState.java:printError(833)) - Status: Failed
 2015-01-29 10:15:51,417 ERROR SessionState 
 (SessionState.java:printError(833)) - Vertex failed, vertexName=Map 1, 
 vertexId=vertex_1422267635031_0618_1_00, diagnostics=[Task failed, 
 taskId=task_1422267635031_0618_1_00_00, diagnostics=[TaskAttempt 0 
 failed, info=[Error: Failure while running task:java.lang.RuntimeException: 
 java.lang.RuntimeException: 

[jira] [Updated] (HIVE-9499) hive.limit.query.max.table.partition makes queries fail on non-partitioned tables

2015-01-29 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9499:

Description: 
If you use hive.limit.query.max.table.partition to limit the amount of 
partitions that can be queried it makes queries on non-partitioned tables fail.

Example:
{noformat}
CREATE TABLE tmp(test INT);
SELECT COUNT(*) FROM TMP; -- works fine
SET hive.limit.query.max.table.partition=20;
SELECT COUNT(*) FROM TMP; -- generates NPE (FAILED: NullPointerException null)
SET hive.limit.query.max.table.partition=-1;
SELECT COUNT(*) FROM TMP; -- works fine again
{noformat}

  was:
If you use hive.limit.query.max.table.partition to limit the amount of 
partitions that can be queried it makes queries on non-partitioned tables fail.

Example:
CREATE TABLE tmp(test INT);
SELECT COUNT(*) FROM TMP; -- works fine
SET hive.limit.query.max.table.partition=20;
SELECT COUNT(*) FROM TMP; -- generates NPE (FAILED: NullPointerException null)
SET hive.limit.query.max.table.partition=-1;
SELECT COUNT(*) FROM TMP; -- works fine again


 hive.limit.query.max.table.partition makes queries fail on non-partitioned 
 tables
 -

 Key: HIVE-9499
 URL: https://issues.apache.org/jira/browse/HIVE-9499
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Alexander Kasper

 If you use hive.limit.query.max.table.partition to limit the amount of 
 partitions that can be queried it makes queries on non-partitioned tables 
 fail.
 Example:
 {noformat}
 CREATE TABLE tmp(test INT);
 SELECT COUNT(*) FROM TMP; -- works fine
 SET hive.limit.query.max.table.partition=20;
 SELECT COUNT(*) FROM TMP; -- generates NPE (FAILED: NullPointerException null)
 SET hive.limit.query.max.table.partition=-1;
 SELECT COUNT(*) FROM TMP; -- works fine again
 {noformat}
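
 The kind of guard that avoids this NPE can be sketched as follows. It assumes 
 that a non-partitioned table has no partition list at all (modeled here as 
 null) and that any negative limit disables the check; the report itself only 
 shows -1. The class and method names are hypothetical and not taken from the 
 patch.

```java
import java.util.List;

final class PartitionLimitCheck {
    /**
     * Returns true when the query stays within the configured partition limit.
     * A negative limit disables the check; a null list (non-partitioned table)
     * counts as zero partitions instead of being dereferenced.
     */
    static boolean withinLimit(List<String> prunedPartitions, int maxPartitions) {
        if (maxPartitions < 0) {
            return true;
        }
        int count = (prunedPartitions == null) ? 0 : prunedPartitions.size();
        return count <= maxPartitions;
    }
}
```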





[jira] [Updated] (HIVE-9499) hive.limit.query.max.table.partition makes queries fail on non-partitioned tables

2015-01-29 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9499:

Assignee: Navis
  Status: Patch Available  (was: Open)

 hive.limit.query.max.table.partition makes queries fail on non-partitioned 
 tables
 -

 Key: HIVE-9499
 URL: https://issues.apache.org/jira/browse/HIVE-9499
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Alexander Kasper
Assignee: Navis
 Attachments: HIVE-9499.1.patch.txt


 If you use hive.limit.query.max.table.partition to limit the amount of 
 partitions that can be queried it makes queries on non-partitioned tables 
 fail.
 Example:
 {noformat}
 CREATE TABLE tmp(test INT);
 SELECT COUNT(*) FROM TMP; -- works fine
 SET hive.limit.query.max.table.partition=20;
 SELECT COUNT(*) FROM TMP; -- generates NPE (FAILED: NullPointerException null)
 SET hive.limit.query.max.table.partition=-1;
 SELECT COUNT(*) FROM TMP; -- works fine again
 {noformat}





[jira] [Updated] (HIVE-9499) hive.limit.query.max.table.partition makes queries fail on non-partitioned tables

2015-01-29 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9499:

Attachment: HIVE-9499.1.patch.txt

 hive.limit.query.max.table.partition makes queries fail on non-partitioned 
 tables
 -

 Key: HIVE-9499
 URL: https://issues.apache.org/jira/browse/HIVE-9499
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Alexander Kasper
 Attachments: HIVE-9499.1.patch.txt


 If you use hive.limit.query.max.table.partition to limit the amount of 
 partitions that can be queried it makes queries on non-partitioned tables 
 fail.
 Example:
 {noformat}
 CREATE TABLE tmp(test INT);
 SELECT COUNT(*) FROM TMP; -- works fine
 SET hive.limit.query.max.table.partition=20;
 SELECT COUNT(*) FROM TMP; -- generates NPE (FAILED: NullPointerException null)
 SET hive.limit.query.max.table.partition=-1;
 SELECT COUNT(*) FROM TMP; -- works fine again
 {noformat}





[jira] [Updated] (HIVE-9499) hive.limit.query.max.table.partition makes queries fail on non-partitioned tables

2015-01-29 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9499:

Attachment: HIVE-9499.2.patch.txt

fixed trivial bug in TableScanStatsRule

 hive.limit.query.max.table.partition makes queries fail on non-partitioned 
 tables
 -

 Key: HIVE-9499
 URL: https://issues.apache.org/jira/browse/HIVE-9499
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Alexander Kasper
Assignee: Navis
 Attachments: HIVE-9499.1.patch.txt, HIVE-9499.2.patch.txt


 If you use hive.limit.query.max.table.partition to limit the amount of 
 partitions that can be queried it makes queries on non-partitioned tables 
 fail.
 Example:
 {noformat}
 CREATE TABLE tmp(test INT);
 SELECT COUNT(*) FROM TMP; -- works fine
 SET hive.limit.query.max.table.partition=20;
 SELECT COUNT(*) FROM TMP; -- generates NPE (FAILED: NullPointerException null)
 SET hive.limit.query.max.table.partition=-1;
 SELECT COUNT(*) FROM TMP; -- works fine again
 {noformat}





[jira] [Updated] (HIVE-9513) NULL POINTER EXCEPTION

2015-01-29 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9513:

Assignee: Navis
  Status: Patch Available  (was: Open)

 NULL POINTER EXCEPTION
 --

 Key: HIVE-9513
 URL: https://issues.apache.org/jira/browse/HIVE-9513
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.13.1
Reporter: ErwanMAS
Assignee: Navis
 Attachments: HIVE-9513.1.patch.txt


 NPE during parsing of:
 {noformat}
 select * from (
  select * from ( select 1 as id , foo as str_1 from staging.dual ) f
   union   all
  select * from ( select 2 as id , bar as str_2 from staging.dual ) g
 ) e ;
 {noformat}





[jira] [Updated] (HIVE-9513) NULL POINTER EXCEPTION

2015-01-29 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9513:

Attachment: HIVE-9513.1.patch.txt

 NULL POINTER EXCEPTION
 --

 Key: HIVE-9513
 URL: https://issues.apache.org/jira/browse/HIVE-9513
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.13.1
Reporter: ErwanMAS
 Attachments: HIVE-9513.1.patch.txt


 NPE during parsing of:
 {noformat}
 select * from (
  select * from ( select 1 as id , foo as str_1 from staging.dual ) f
   union   all
  select * from ( select 2 as id , bar as str_2 from staging.dual ) g
 ) e ;
 {noformat}





[jira] [Updated] (HIVE-9486) Use session classloader instead of application loader

2015-01-28 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9486:

Attachment: HIVE-9486.2.patch.txt

 Use session classloader instead of application loader
 -

 Key: HIVE-9486
 URL: https://issues.apache.org/jira/browse/HIVE-9486
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-9486.1.patch.txt, HIVE-9486.2.patch.txt


 From http://www.mail-archive.com/dev@hive.apache.org/msg107615.html
 Looks reasonable





[jira] [Updated] (HIVE-9228) Problem with subquery using windowing functions

2015-01-27 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9228:

Attachment: HIVE-9228.2.patch.txt

Rerunning tests

 Problem with subquery using windowing functions
 ---

 Key: HIVE-9228
 URL: https://issues.apache.org/jira/browse/HIVE-9228
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Affects Versions: 0.13.1
Reporter: Aihua Xu
Assignee: Aihua Xu
 Attachments: HIVE-9228.1.patch.txt, HIVE-9228.2.patch.txt, 
 create_table_tab1.sql, tab1.csv

   Original Estimate: 96h
  Remaining Estimate: 96h

 The following query with window functions fails, although the inner query 
 works fine on its own:
 select col1, col2, col3 from (select col1,col2, col3, count(case when col4=1 
 then 1 end ) over (partition by col1, col2) as col5, row_number() over 
 (partition by col1, col2 order by col4) as col6 from tab1) t;
 Hive generates an execution plan with 2 jobs:
 1. The first job calculates the window function for col5.
 2. The second job calculates the window function for col6 and writes the output.
 The plan says the first job writes the columns (col1, col2, col3, col4) to a 
 tmp file, since only these columns are used in the later stage. However, the 
 PTF operator for the first job outputs (_wcol0, col1, col2, col3, col4), with 
 _wcol0 as the result of the window function, even though it is not used. 
 In the second job, the map operator still reads the 4 columns (col1, col2, 
 col3, col4) from the tmp file according to the plan. That causes the exception.
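
 The mismatch described above can be illustrated with a small Python
 simulation. It is not Hive code, and read_row and the sample row are
 hypothetical. The writer emits five values per row, with the window result
 first, while the reader binds values by position to the four column names
 from the plan. The report does not say which exception Hive raises, so the
 ValueError here only stands in for it.

```python
# Columns the plan says the first job writes and the second job reads.
PLANNED_COLUMNS = ["col1", "col2", "col3", "col4"]
# Columns the PTF operator actually emits: the window result comes first.
ACTUAL_COLUMNS = ["_wcol0", "col1", "col2", "col3", "col4"]


def read_row(values, schema):
    """Bind values to column names by position, as a schema-driven reader would."""
    if len(values) != len(schema):
        raise ValueError(f"expected {len(schema)} columns, got {len(values)}")
    return dict(zip(schema, values))


row = [2, "a", "b", "c", 1]  # one row as written by the first job

# Reading with the planned 4-column schema fails on the 5-value row.
try:
    read_row(row, PLANNED_COLUMNS)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True

# Reading with the schema that was really written succeeds.
decoded = read_row(row, ACTUAL_COLUMNS)
```

 Either pruning the unused window column before writing or making the reader
 use the schema that was actually written would bring the two sides back into
 agreement.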





[jira] [Updated] (HIVE-9486) Use session classloader instead of application loader

2015-01-27 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9486:

Attachment: HIVE-9486.1.patch.txt

 Use session classloader instead of application loader
 -

 Key: HIVE-9486
 URL: https://issues.apache.org/jira/browse/HIVE-9486
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-9486.1.patch.txt


 From http://www.mail-archive.com/dev@hive.apache.org/msg107615.html
 Looks reasonable





[jira] [Commented] (HIVE-9228) Problem with subquery using windowing functions

2015-01-27 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294595#comment-14294595
 ] 

Navis commented on HIVE-9228:
-

Yes, when a PTF column is not selected, we should prune the function itself in 
the PTF operator. But I thought it was a trivial case for a query not to select 
a column that was calculated at heavy cost. Also, the select operator would be 
removed by IdentityProjectRemover if it's not needed. 
By the way, could you review HIVE-9138 first? It's hard to debug anything in 
PTF without any explain output.

 Problem with subquery using windowing functions
 ---

 Key: HIVE-9228
 URL: https://issues.apache.org/jira/browse/HIVE-9228
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Affects Versions: 0.13.1
Reporter: Aihua Xu
Assignee: Aihua Xu
 Attachments: HIVE-9228.1.patch.txt, HIVE-9228.2.patch.txt, 
 create_table_tab1.sql, tab1.csv

   Original Estimate: 96h
  Remaining Estimate: 96h

 The following query with window functions fails, although the inner query 
 works fine on its own:
 select col1, col2, col3 from (select col1,col2, col3, count(case when col4=1 
 then 1 end ) over (partition by col1, col2) as col5, row_number() over 
 (partition by col1, col2 order by col4) as col6 from tab1) t;
 Hive generates an execution plan with 2 jobs:
 1. The first job calculates the window function for col5.
 2. The second job calculates the window function for col6 and writes the output.
 The plan says the first job writes the columns (col1, col2, col3, col4) to a 
 tmp file, since only these columns are used in the later stage. However, the 
 PTF operator for the first job outputs (_wcol0, col1, col2, col3, col4), with 
 _wcol0 as the result of the window function, even though it is not used. 
 In the second job, the map operator still reads the 4 columns (col1, col2, 
 col3, col4) from the tmp file according to the plan. That causes the exception.





[jira] [Updated] (HIVE-9138) Add some explain to PTF operator

2015-01-27 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9138:

Attachment: HIVE-9138.3.patch.txt

Updated test results

 Add some explain to PTF operator
 

 Key: HIVE-9138
 URL: https://issues.apache.org/jira/browse/HIVE-9138
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-9138.1.patch.txt, HIVE-9138.2.patch.txt, 
 HIVE-9138.3.patch.txt


 PTFOperator shows nothing in the output of the EXPLAIN statement, making it 
 hard to understand its internal workings. 





[jira] [Updated] (HIVE-9486) Use session classloader instead of application loader

2015-01-27 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9486:

Status: Patch Available  (was: Open)

 Use session classloader instead of application loader
 -

 Key: HIVE-9486
 URL: https://issues.apache.org/jira/browse/HIVE-9486
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-9486.1.patch.txt


 From http://www.mail-archive.com/dev@hive.apache.org/msg107615.html
 Looks reasonable




