[jira] [Created] (HIVE-12388) getTables cannot get external tables

2015-11-11 Thread Navis (JIRA)
Navis created HIVE-12388:


 Summary: getTables cannot get external tables
 Key: HIVE-12388
 URL: https://issues.apache.org/jira/browse/HIVE-12388
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Reporter: Navis
Assignee: Navis
Priority: Critical


Due to a regression introduced by HIVE-7575, external tables are not returned 
when the "TABLE" type is specified as an argument. I'm working on this. Sorry.
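
For illustration only, here is a minimal sketch of one way such a regression 
can arise, assuming the JDBC-facing "TABLE" type is derived from Hive's internal 
table types (MANAGED_TABLE, EXTERNAL_TABLE, VIRTUAL_VIEW). The class and method 
names are hypothetical and are not taken from the Hive code base or from the 
HIVE-7575 patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these names come from the Hive code base.
final class TableTypeFilterSketch {

    // A subset of Hive's internal table types.
    enum HiveTableType { MANAGED_TABLE, EXTERNAL_TABLE, VIRTUAL_VIEW }

    // Maps an internal type to the type name a JDBC client asks for.
    static String toJdbcType(HiveTableType type) {
        switch (type) {
            case MANAGED_TABLE:
            case EXTERNAL_TABLE: // dropping this case would hide external tables
                return "TABLE";
            default:
                return "VIEW";
        }
    }

    // Keeps only the tables whose JDBC type was requested.
    static List<String> filter(List<String> names, List<HiveTableType> types,
                               String... requested) {
        List<String> wanted = Arrays.asList(requested);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < names.size(); i++) {
            if (wanted.contains(toJdbcType(types.get(i)))) {
                result.add(names.get(i));
            }
        }
        return result;
    }
}
```

With this mapping, a request for "TABLE" returns both managed and external 
tables, which is the behavior a caller of DatabaseMetaData.getTables would 
expect.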



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12373) Interner should return identical map or list

2015-11-09 Thread Navis (JIRA)
Navis created HIVE-12373:


 Summary: Interner should return identical map or list
 Key: HIVE-12373
 URL: https://issues.apache.org/jira/browse/HIVE-12373
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor


Currently, HiveStringUtils.intern(map/list) returns a new instance of the map 
or list. But that breaks code written in the style below (it's Spark code in 
HiveMetastoreCatalog):

{code}
val serdeParameters = new java.util.HashMap[String, String]()
serdeInfo.setParameters(serdeParameters)
// these properties will be gone
table.serdeProperties.foreach { case (k, v) => serdeParameters.put(k, v) }
p.storage.serdeProperties.foreach { case (k, v) => serdeParameters.put(k, v) }
{code}

Luckily for Spark, the interner was, by mistake, not applied in the released 
versions of Hive (1.2.0, 1.2.1). But it would cause problems someday.
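
The aliasing problem described above can be reproduced without Hive or Spark. 
The sketch below is an assumption-laden stand-in: SerdeHolder is a hypothetical 
class playing the role of a bean such as SerDeInfo, and the copyOnSet flag 
mimics a setter that interns its argument into a new map instance.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: SerdeHolder stands in for a bean such as SerDeInfo.
final class InternAliasingSketch {

    static final class SerdeHolder {
        private Map<String, String> parameters;
        private final boolean copyOnSet;

        SerdeHolder(boolean copyOnSet) {
            this.copyOnSet = copyOnSet;
        }

        void setParameters(Map<String, String> params) {
            // copyOnSet mimics an interner that returns a new map instance.
            this.parameters = copyOnSet ? new HashMap<>(params) : params;
        }

        Map<String, String> getParameters() {
            return parameters;
        }
    }

    // Follows the caller pattern from the report: set first, populate later.
    static int sizeAfterLatePut(boolean copyOnSet) {
        Map<String, String> serdeParameters = new HashMap<>();
        SerdeHolder holder = new SerdeHolder(copyOnSet);
        holder.setParameters(serdeParameters);
        serdeParameters.put("path", "/tmp/t"); // added after the setter call
        return holder.getParameters().size();
    }
}
```

When the setter keeps the caller's reference, the late put is visible through 
the holder. When it stores a copy, the property is silently lost, which is why 
returning the identical instance matters for callers written this way.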





[jira] [Created] (HIVE-12183) JsonParser/Generator should be closed for resycle

2015-10-14 Thread Navis (JIRA)
Navis created HIVE-12183:


 Summary: JsonParser/Generator should be closed for resycle
 Key: HIVE-12183
 URL: https://issues.apache.org/jira/browse/HIVE-12183
 Project: Hive
  Issue Type: Bug
Reporter: Navis
Priority: Trivial








[jira] [Created] (HIVE-11774) Show macro definition for desc function

2015-09-09 Thread Navis (JIRA)
Navis created HIVE-11774:


 Summary: Show macro definition for desc function 
 Key: HIVE-11774
 URL: https://issues.apache.org/jira/browse/HIVE-11774
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-11774.1.patch.txt

Currently, desc function shows nothing for a macro. It would be helpful if it 
showed the macro's definition.





[jira] [Created] (HIVE-11756) Avoid redundant key serialization in RS for distinct query

2015-09-08 Thread Navis (JIRA)
Navis created HIVE-11756:


 Summary: Avoid redundant key serialization in RS for distinct query
 Key: HIVE-11756
 URL: https://issues.apache.org/jira/browse/HIVE-11756
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial


Currently, Hive serializes the key twice to determine the length of the 
distribution key for distinct queries. This patch introduces IndexedSerializer 
to avoid that.





[jira] [Created] (HIVE-11754) Not reachable code parts in StatsUtils

2015-09-07 Thread Navis (JIRA)
Navis created HIVE-11754:


 Summary: Not reachable code parts in StatsUtils
 Key: HIVE-11754
 URL: https://issues.apache.org/jira/browse/HIVE-11754
 Project: Hive
  Issue Type: Task
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-11754.1.patch.txt

No need to check "oi instanceof WritableConstantHiveCharObjectInspector" after 
"oi instanceof ConstantObjectInspector".





[jira] [Created] (HIVE-11752) Pre-materializing complex CTE queries

2015-09-07 Thread Navis (JIRA)
Navis created HIVE-11752:


 Summary: Pre-materializing complex CTE queries
 Key: HIVE-11752
 URL: https://issues.apache.org/jira/browse/HIVE-11752
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor


Currently, Hive treats a CTE clause as a simple alias for the query block, which 
causes redundant work if the CTE is used multiple times in a query. This 
introduces a reference threshold for pre-materializing the CTE clause as a 
volatile table (which does not exist in any form in the metastore and is 
accessible only from the QB).
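
A minimal sketch of the reference-threshold idea, under stated assumptions: the 
class and method names are hypothetical, and whether the real comparison is 
"at least" or "more than" the threshold is not stated in this report, so the 
sketch picks "at least".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the Hive implementation.
final class CteThresholdSketch {

    // Returns the CTE names referenced at least `threshold` times,
    // in order of first reference.
    static List<String> toMaterialize(List<String> references, int threshold) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String name : references) {
            counts.merge(name, 1, Integer::sum);
        }
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() >= threshold) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```

A CTE that falls below the threshold would keep being expanded inline as an 
alias, as it is today.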





[jira] [Created] (HIVE-11707) Implement "dump metastore"

2015-08-31 Thread Navis (JIRA)
Navis created HIVE-11707:


 Summary: Implement "dump metastore"
 Key: HIVE-11707
 URL: https://issues.apache.org/jira/browse/HIVE-11707
 Project: Hive
  Issue Type: New Feature
  Components: Metastore
Reporter: Navis
Assignee: Navis
Priority: Minor


In projects, we have frequently needed to copy an existing metastore to another 
database (for another version of Hive or for other engines like Impala, Tajo, 
Spark, etc.). RDBs support dumping the metastore data into a series of SQL 
statements, but these need to be translated before they can be applied if we 
use a different RDB, which is time-consuming, error-prone work.





[jira] [Created] (HIVE-11706) Implement "show create database"

2015-08-31 Thread Navis (JIRA)
Navis created HIVE-11706:


 Summary: Implement "show create database"
 Key: HIVE-11706
 URL: https://issues.apache.org/jira/browse/HIVE-11706
 Project: Hive
  Issue Type: New Feature
  Components: Metastore
Reporter: Navis
Assignee: Navis
Priority: Trivial


HIVE-967 introduced "show create table". How about "show create database"?





[jira] [Created] (HIVE-11662) DP cannot be applied to external table which contains part-spec like directory

2015-08-26 Thread Navis (JIRA)
Navis created HIVE-11662:


 Summary: DP cannot be applied to external table which contains 
part-spec like directory
 Key: HIVE-11662
 URL: https://issues.apache.org/jira/browse/HIVE-11662
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial


Some users want to use a directory name that looks like a part-spec in their 
partitioned table locations, for example:
{noformat}
/something/warehouse/some_key=some_value
{noformat}

DP calculates additional partitions from the full path and throws an exception 
like:
{noformat}
Failed with exception Partition spec {some_key=some_value, part_key=part_value} 
contains non-partition columns
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.MoveTask
{noformat}
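
The failure mode above can be sketched as follows. This is a hypothetical 
illustration, not the Hive implementation: parse() collects every key=value 
directory in a path, while parseRelative() looks only at the part below the 
table location, which is one possible way to avoid picking up the extra 
component.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; not the Hive implementation.
final class PartSpecSketch {

    // Collects key=value directory names found anywhere in the given path.
    static Map<String, String> parse(String path) {
        Map<String, String> spec = new LinkedHashMap<>();
        for (String dir : path.split("/")) {
            int eq = dir.indexOf('=');
            if (eq > 0) {
                spec.put(dir.substring(0, eq), dir.substring(eq + 1));
            }
        }
        return spec;
    }

    // Parses only the part of the partition path below the table location.
    static Map<String, String> parseRelative(String tableLocation,
                                             String partitionPath) {
        if (!partitionPath.startsWith(tableLocation)) {
            throw new IllegalArgumentException(
                "partition is not under the table location");
        }
        return parse(partitionPath.substring(tableLocation.length()));
    }
}
```

For a table located at /something/warehouse/some_key=some_value with a 
partition directory part_key=part_value, parse() on the full path yields both 
keys, whereas parseRelative() yields only part_key.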





[jira] [Created] (HIVE-11518) Provide interface to adjust required resource for tez tasks

2015-08-11 Thread Navis (JIRA)
Navis created HIVE-11518:


 Summary: Provide interface to adjust required resource for tez 
tasks
 Key: HIVE-11518
 URL: https://issues.apache.org/jira/browse/HIVE-11518
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor


Resource requirements vary from task to task, but currently they are fixed to 
one value (via hive.tez.container.size). It would be good to be able to 
customize resource requirements to suit the expected work.

Suggested interface is quite simple.
{code}
public interface ResourceCalculator {

  Resource adjust(Resource resource, MapWork mapWork);

  Resource adjust(Resource resource, ReduceWork reduceWork);
}
{code}





[jira] [Created] (HIVE-11515) Still some possible race condition in DynamicPartitionPruner

2015-08-10 Thread Navis (JIRA)
Navis created HIVE-11515:


 Summary: Still some possible race condition in 
DynamicPartitionPruner
 Key: HIVE-11515
 URL: https://issues.apache.org/jira/browse/HIVE-11515
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor


Even after HIVE-9976, I could sometimes see a race condition in DPP. It is hard 
to reproduce, but it seemed related to the fact that init() is called by a 
thread pool. With some delay in the queue, events from fast tasks arrive 
before init() is called.
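
One possible way to tolerate events that arrive before init() is to buffer them 
and replay them once initialization runs. The sketch below only illustrates 
that idea; the class and its methods are hypothetical and do not come from 
DynamicPartitionPruner, and this report does not say which fix was chosen.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch; the names do not come from DynamicPartitionPruner.
final class LateInitPrunerSketch {
    private final Queue<String> pending = new ArrayDeque<>();
    private final List<String> processed = new ArrayList<>();
    private boolean initialized;

    // May be called from another thread before init() has run.
    synchronized void addEvent(String event) {
        if (initialized) {
            processed.add(event);
        } else {
            pending.add(event); // keep early events instead of dropping them
        }
    }

    // Replays everything that arrived early, in arrival order.
    synchronized void init() {
        initialized = true;
        while (!pending.isEmpty()) {
            processed.add(pending.poll());
        }
    }

    // Returns a snapshot of the events handled so far.
    synchronized List<String> processed() {
        return new ArrayList<>(processed);
    }
}
```

Because both methods synchronize on the same object, an event is either queued 
before init() drains the queue or handled directly afterwards.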





[jira] [Created] (HIVE-11506) Casting varchar/char type to string cannot be vectorized

2015-08-10 Thread Navis (JIRA)
Navis created HIVE-11506:


 Summary: Casting varchar/char type to string cannot be vectorized
 Key: HIVE-11506
 URL: https://issues.apache.org/jira/browse/HIVE-11506
 Project: Hive
  Issue Type: Improvement
  Components: Vectorization
Reporter: Navis
Assignee: Navis
Priority: Trivial


It's not defined in vectorization context.
{code}
explain 
select cast(cast(cstring1 as varchar(10)) as string) x from alltypesorc order 
by x;
{code}

Mapper 
{noformat}
2015-08-10 17:02:08,003 INFO  [main]: physical.Vectorizer 
(Vectorizer.java:validateExprNodeDesc(1299)) - Failed to vectorize
org.apache.hadoop.hive.ql.metadata.HiveException: Unhandled cast input type: 
varchar(10)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getCastToString(VectorizationContext.java:1543)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUDFBridgeVectorExpression(VectorizationContext.java:1379)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1177)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:440)
at 
org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateExprNodeDesc(Vectorizer.java:1293)
at 
org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateExprNodeDesc(Vectorizer.java:1284)
at 
org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateSelectOperator(Vectorizer.java:1116)
at 
org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateMapWorkOperator(Vectorizer.java:906)
{noformat}





[jira] [Created] (HIVE-11002) Memory leakage on unsafe aggregation path with empty input

2015-06-13 Thread Navis (JIRA)
Navis created HIVE-11002:


 Summary: Memory leakage on unsafe aggregation path with empty input
 Key: HIVE-11002
 URL: https://issues.apache.org/jira/browse/HIVE-11002
 Project: Hive
  Issue Type: Bug
  Components: SQL
Reporter: Navis
Assignee: Navis
Priority: Minor


Currently, the unsafe-based hash is released on the 'next' call, but if the 
input is empty, that call never happens, so the hash is never released.





[jira] [Created] (HIVE-10890) Provide implementable engine selector

2015-06-02 Thread Navis (JIRA)
Navis created HIVE-10890:


 Summary: Provide implementable engine selector
 Key: HIVE-10890
 URL: https://issues.apache.org/jira/browse/HIVE-10890
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial


Hive now supports three kinds of engines. It would be good to have an automatic 
engine selector, so that the execution engine does not have to be set 
explicitly.





[jira] [Created] (HIVE-9806) Support partition locator for custom directory hierarchy

2015-02-26 Thread Navis (JIRA)
Navis created HIVE-9806:
---

 Summary: Support partition locator for custom directory hierarchy
 Key: HIVE-9806
 URL: https://issues.apache.org/jira/browse/HIVE-9806
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor


Currently, the relative partition directory must be the same as the partition 
name, which is not always applicable.





[jira] [Updated] (HIVE-9699) Extend PTFs to provide referenced columns for CP

2015-02-15 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9699:

Attachment: HIVE-9699.2.patch.txt

> Extend PTFs to provide referenced columns for CP
> 
>
> Key: HIVE-9699
> URL: https://issues.apache.org/jira/browse/HIVE-9699
> Project: Hive
>  Issue Type: Improvement
>  Components: PTF-Windowing
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-9699.1.patch.txt, HIVE-9699.2.patch.txt
>
>
> As described in HIVE-9341, if PTFs can provide referenced column names, the 
> column pruner can use them.





[jira] [Updated] (HIVE-9699) Extend PTFs to provide referenced columns for CP

2015-02-15 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9699:

Status: Patch Available  (was: Open)

> Extend PTFs to provide referenced columns for CP
> 
>
> Key: HIVE-9699
> URL: https://issues.apache.org/jira/browse/HIVE-9699
> Project: Hive
>  Issue Type: Improvement
>  Components: PTF-Windowing
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-9699.1.patch.txt, HIVE-9699.2.patch.txt
>
>
> As described in HIVE-9341, if PTFs can provide referenced column names, the 
> column pruner can use them.





[jira] [Updated] (HIVE-9699) Extend PTFs to provide referenced columns for CP

2015-02-15 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9699:

Status: Open  (was: Patch Available)

> Extend PTFs to provide referenced columns for CP
> 
>
> Key: HIVE-9699
> URL: https://issues.apache.org/jira/browse/HIVE-9699
> Project: Hive
>  Issue Type: Improvement
>  Components: PTF-Windowing
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-9699.1.patch.txt
>
>
> As described in HIVE-9341, if PTFs can provide referenced column names, the 
> column pruner can use them.





[jira] [Updated] (HIVE-9699) Extend PTFs to provide referenced columns for CP

2015-02-15 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9699:

Status: Patch Available  (was: Open)

> Extend PTFs to provide referenced columns for CP
> 
>
> Key: HIVE-9699
> URL: https://issues.apache.org/jira/browse/HIVE-9699
> Project: Hive
>  Issue Type: Improvement
>  Components: PTF-Windowing
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-9699.1.patch.txt
>
>
> As described in HIVE-9341, if PTFs can provide referenced column names, the 
> column pruner can use them.





[jira] [Updated] (HIVE-9699) Extend PTFs to provide referenced columns for CP

2015-02-15 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9699:

Attachment: HIVE-9699.1.patch.txt

> Extend PTFs to provide referenced columns for CP
> 
>
> Key: HIVE-9699
> URL: https://issues.apache.org/jira/browse/HIVE-9699
> Project: Hive
>  Issue Type: Improvement
>  Components: PTF-Windowing
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-9699.1.patch.txt
>
>
> As described in HIVE-9341, if PTFs can provide referenced column names, the 
> column pruner can use them.





[jira] [Created] (HIVE-9699) Extend PTFs to provide referenced columns for CP

2015-02-15 Thread Navis (JIRA)
Navis created HIVE-9699:
---

 Summary: Extend PTFs to provide referenced columns for CP
 Key: HIVE-9699
 URL: https://issues.apache.org/jira/browse/HIVE-9699
 Project: Hive
  Issue Type: Improvement
  Components: PTF-Windowing
Reporter: Navis
Assignee: Navis
Priority: Trivial


As described in HIVE-9341, if PTFs can provide referenced column names, the 
column pruner can use them.





[jira] [Updated] (HIVE-2573) Create per-session function registry

2015-02-12 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-2573:

   Resolution: Fixed
Fix Version/s: 1.2.0
   Status: Resolved  (was: Patch Available)

Committed to trunk, at last. Thanks Jason!

> Create per-session function registry 
> -
>
> Key: HIVE-2573
> URL: https://issues.apache.org/jira/browse/HIVE-2573
> Project: Hive
>  Issue Type: Improvement
>  Components: Server Infrastructure
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Fix For: 1.2.0
>
> Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2573.D3231.1.patch, 
> HIVE-2573.1.patch.txt, HIVE-2573.10.patch.txt, HIVE-2573.11.patch.txt, 
> HIVE-2573.12.patch.txt, HIVE-2573.13.patch.txt, HIVE-2573.14.patch.txt, 
> HIVE-2573.15.patch.txt, HIVE-2573.2.patch.txt, HIVE-2573.3.patch.txt, 
> HIVE-2573.4.patch.txt, HIVE-2573.5.patch, HIVE-2573.6.patch, 
> HIVE-2573.7.patch, HIVE-2573.8.patch.txt, HIVE-2573.9.patch.txt
>
>
> Currently the function registry is a shared resource and could be overridden 
> by other users when using HiveServer. If a per-session function registry is 
> provided, this situation could be prevented.





[jira] [Updated] (HIVE-9138) Add some explain to PTF operator

2015-02-12 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9138:

Attachment: HIVE-9138.5.patch.txt

> Add some explain to PTF operator
> 
>
> Key: HIVE-9138
> URL: https://issues.apache.org/jira/browse/HIVE-9138
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-9138.1.patch.txt, HIVE-9138.2.patch.txt, 
> HIVE-9138.3.patch.txt, HIVE-9138.4.patch.txt, HIVE-9138.5.patch.txt
>
>
> PTFOperator shows nothing in the output of the explain statement, making it 
> hard to understand its internal workings.





[jira] [Commented] (HIVE-9495) Map Side aggregation affecting map performance

2015-02-12 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319590#comment-14319590
 ] 

Navis commented on HIVE-9495:
-

I think I've broken something rebasing on trunk. 

> Map Side aggregation affecting map performance
> --
>
> Key: HIVE-9495
> URL: https://issues.apache.org/jira/browse/HIVE-9495
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.14.0
> Environment: RHEL 6.4
> Hortonworks Hadoop 2.2
>Reporter: Anand Sridharan
> Attachments: HIVE-9495.1.patch.txt, profiler_screenshot.PNG
>
>
> When running a simple aggregation query with hive.map.aggr=true, map 
> tasks take much longer in Hive 0.14 than with hive.map.aggr=false.
> e.g.
> Consider the query:
> {code}
> INSERT OVERWRITE TABLE lineitem_tgt_agg
> select alias.a0 as a0,
>  alias.a2 as a1,
>  alias.a1 as a2,
>  alias.a3 as a3,
>  alias.a4 as a4
> from (
>  select alias.a0 as a0,
>   SUM(alias.a1) as a1,
>   SUM(alias.a2) as a2,
>   SUM(alias.a3) as a3,
>   SUM(alias.a4) as a4
>  from (
>   select lineitem_sf500.l_orderkey as a0,
>CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * (1 - 
> lineitem_sf500.l_discount) * (1 + lineitem_sf500.l_tax) as double) as a1,
>lineitem_sf500.l_quantity as a2,
>CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * 
> lineitem_sf500.l_discount as double) as a3,
>CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * 
> lineitem_sf500.l_tax as double) as a4
>   from lineitem_sf500
>   ) alias
>  group by alias.a0
>  ) alias;
> {code}
> The above query was run with ~376GB of data / ~3billion records in the source.
> It takes ~10 minutes with hive.map.aggr=false.
> With map side aggregation set to true, the map tasks don't complete even 
> after an hour.





[jira] [Updated] (HIVE-9495) Map Side aggregation affecting map performance

2015-02-12 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9495:

Status: Open  (was: Patch Available)

> Map Side aggregation affecting map performance
> --
>
> Key: HIVE-9495
> URL: https://issues.apache.org/jira/browse/HIVE-9495
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.14.0
> Environment: RHEL 6.4
> Hortonworks Hadoop 2.2
>Reporter: Anand Sridharan
> Attachments: HIVE-9495.1.patch.txt, profiler_screenshot.PNG
>
>
> When running a simple aggregation query with hive.map.aggr=true, map 
> tasks take much longer in Hive 0.14 than with hive.map.aggr=false.
> e.g.
> Consider the query:
> {code}
> INSERT OVERWRITE TABLE lineitem_tgt_agg
> select alias.a0 as a0,
>  alias.a2 as a1,
>  alias.a1 as a2,
>  alias.a3 as a3,
>  alias.a4 as a4
> from (
>  select alias.a0 as a0,
>   SUM(alias.a1) as a1,
>   SUM(alias.a2) as a2,
>   SUM(alias.a3) as a3,
>   SUM(alias.a4) as a4
>  from (
>   select lineitem_sf500.l_orderkey as a0,
>CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * (1 - 
> lineitem_sf500.l_discount) * (1 + lineitem_sf500.l_tax) as double) as a1,
>lineitem_sf500.l_quantity as a2,
>CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * 
> lineitem_sf500.l_discount as double) as a3,
>CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * 
> lineitem_sf500.l_tax as double) as a4
>   from lineitem_sf500
>   ) alias
>  group by alias.a0
>  ) alias;
> {code}
> The above query was run with ~376GB of data / ~3billion records in the source.
> It takes ~10 minutes with hive.map.aggr=false.
> With map side aggregation set to true, the map tasks don't complete even 
> after an hour.





[jira] [Resolved] (HIVE-9597) substition variables stopping when a undefined variable occur

2015-02-12 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis resolved HIVE-9597.
-
   Resolution: Duplicate
Fix Version/s: 0.14.0

> substition variables stopping when a undefined variable occur
> -
>
> Key: HIVE-9597
> URL: https://issues.apache.org/jira/browse/HIVE-9597
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 0.13.0
> Environment: hortonworks 2.1
>Reporter: ErwanMAS
>Priority: Critical
> Fix For: 0.14.0
>
>
> {noformat}
> set hivevar:A_VALUE_1=A ;
> set hivevar:A_VALUE_3=C ;
> explain select "${A_VALUE_1}","${A_VALUE_2}","${A_VALUE_3}" from foobar ;
> set hivevar:A_VALUE_2=B ;
> explain select "${A_VALUE_1}","${A_VALUE_2}","${A_VALUE_3}" from foobar ;
> {noformat}
> In the first query, the variable A_VALUE_3 is not substituted because 
> A_VALUE_2 is not defined!





[jira] [Commented] (HIVE-9597) substition variables stopping when a undefined variable occur

2015-02-12 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319577#comment-14319577
 ] 

Navis commented on HIVE-9597:
-

This seems to have been fixed in HIVE-6037 (hive-0.14.0).
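
For reference, here is a minimal sketch of substitution that does not stop at 
an undefined variable. It is not the code from HIVE-6037; the class name is 
hypothetical, and leaving an undefined ${name} untouched is an assumption made 
for the example.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not the code from HIVE-6037.
final class VarSubstitutionSketch {
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replaces every defined ${name}; undefined names are left as written,
    // and scanning continues with the rest of the text.
    static String substitute(String text, Map<String, String> vars) {
        Matcher m = VAR.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = vars.get(m.group(1));
            String replacement = (value != null) ? value : m.group();
            // quoteReplacement keeps '$' in the replacement literal.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With A_VALUE_1 and A_VALUE_3 defined and A_VALUE_2 undefined, the first and 
third placeholders in the reporter's query are replaced and the second is kept 
verbatim.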

> substition variables stopping when a undefined variable occur
> -
>
> Key: HIVE-9597
> URL: https://issues.apache.org/jira/browse/HIVE-9597
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 0.13.0
> Environment: hortonworks 2.1
>Reporter: ErwanMAS
>Priority: Critical
> Fix For: 0.14.0
>
>
> {noformat}
> set hivevar:A_VALUE_1=A ;
> set hivevar:A_VALUE_3=C ;
> explain select "${A_VALUE_1}","${A_VALUE_2}","${A_VALUE_3}" from foobar ;
> set hivevar:A_VALUE_2=B ;
> explain select "${A_VALUE_1}","${A_VALUE_2}","${A_VALUE_3}" from foobar ;
> {noformat}
> In the first query, the variable A_VALUE_3 is not substituted because 
> A_VALUE_2 is not defined!





[jira] [Updated] (HIVE-9680) GlobalLimitOptimizer is not checking filters correctly

2015-02-12 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9680:

Attachment: HIVE-9680.1.patch.txt

> GlobalLimitOptimizer is not checking filters correctly 
> ---
>
> Key: HIVE-9680
> URL: https://issues.apache.org/jira/browse/HIVE-9680
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-9680.1.patch.txt
>
>
> Some predicates may not be included in opToPartPruner.





[jira] [Updated] (HIVE-9680) GlobalLimitOptimizer is not checking filters correctly

2015-02-12 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9680:

Status: Patch Available  (was: Open)

> GlobalLimitOptimizer is not checking filters correctly 
> ---
>
> Key: HIVE-9680
> URL: https://issues.apache.org/jira/browse/HIVE-9680
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-9680.1.patch.txt
>
>
> Some predicates may not be included in opToPartPruner.





[jira] [Created] (HIVE-9680) GlobalLimitOptimizer is not checking filters correctly

2015-02-12 Thread Navis (JIRA)
Navis created HIVE-9680:
---

 Summary: GlobalLimitOptimizer is not checking filters correctly 
 Key: HIVE-9680
 URL: https://issues.apache.org/jira/browse/HIVE-9680
 Project: Hive
  Issue Type: Bug
  Components: Query Planning
Reporter: Navis
Assignee: Navis
Priority: Trivial


Some predicates may not be included in opToPartPruner.





[jira] [Updated] (HIVE-9495) Map Side aggregation affecting map performance

2015-02-11 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9495:

Attachment: HIVE-9495.1.patch.txt

Replaced the get/put calls with a single putIfAbsent call, but I couldn't find 
any noticeable performance improvement. 
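
The two access patterns can be compared in a small sketch. It uses 
java.util.Map.putIfAbsent (Java 8 and later) purely for illustration; the map 
type and buffer class used in the patch are not shown in this message, so the 
names below are hypothetical.

```java
import java.util.Map;

// Hypothetical sketch; the patch's actual map and buffer types are not shown.
final class PutIfAbsentSketch {

    // Two map operations on a miss: one get, then one put.
    static long[] getThenPut(Map<String, long[]> aggs, String key) {
        long[] buf = aggs.get(key);
        if (buf == null) {
            buf = new long[1];
            aggs.put(key, buf);
        }
        return buf;
    }

    // One map operation; the candidate buffer is allocated even on a hit.
    static long[] putIfAbsent(Map<String, long[]> aggs, String key) {
        long[] candidate = new long[1];
        long[] existing = aggs.putIfAbsent(key, candidate);
        return existing != null ? existing : candidate;
    }
}
```

The single-call form saves a lookup on a miss but allocates a candidate buffer 
on every call, so a large gain is not guaranteed.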

> Map Side aggregation affecting map performance
> --
>
> Key: HIVE-9495
> URL: https://issues.apache.org/jira/browse/HIVE-9495
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.14.0
> Environment: RHEL 6.4
> Hortonworks Hadoop 2.2
>Reporter: Anand Sridharan
> Attachments: HIVE-9495.1.patch.txt, profiler_screenshot.PNG
>
>
> When running a simple aggregation query with hive.map.aggr=true, map 
> tasks take much longer in Hive 0.14 than with hive.map.aggr=false.
> e.g.
> Consider the query:
> {code}
> INSERT OVERWRITE TABLE lineitem_tgt_agg
> select alias.a0 as a0,
>  alias.a2 as a1,
>  alias.a1 as a2,
>  alias.a3 as a3,
>  alias.a4 as a4
> from (
>  select alias.a0 as a0,
>   SUM(alias.a1) as a1,
>   SUM(alias.a2) as a2,
>   SUM(alias.a3) as a3,
>   SUM(alias.a4) as a4
>  from (
>   select lineitem_sf500.l_orderkey as a0,
>CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * (1 - 
> lineitem_sf500.l_discount) * (1 + lineitem_sf500.l_tax) as double) as a1,
>lineitem_sf500.l_quantity as a2,
>CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * 
> lineitem_sf500.l_discount as double) as a3,
>CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * 
> lineitem_sf500.l_tax as double) as a4
>   from lineitem_sf500
>   ) alias
>  group by alias.a0
>  ) alias;
> {code}
> The above query was run with ~376GB of data / ~3billion records in the source.
> It takes ~10 minutes with hive.map.aggr=false.
> With map side aggregation set to true, the map tasks don't complete even 
> after an hour.





[jira] [Updated] (HIVE-9495) Map Side aggregation affecting map performance

2015-02-11 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9495:

Status: Patch Available  (was: Open)

> Map Side aggregation affecting map performance
> --
>
> Key: HIVE-9495
> URL: https://issues.apache.org/jira/browse/HIVE-9495
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.14.0
> Environment: RHEL 6.4
> Hortonworks Hadoop 2.2
>Reporter: Anand Sridharan
> Attachments: HIVE-9495.1.patch.txt, profiler_screenshot.PNG
>
>
> When running a simple aggregation query with hive.map.aggr=true, map 
> tasks take much longer in Hive 0.14 than with hive.map.aggr=false.
> e.g.
> Consider the query:
> {code}
> INSERT OVERWRITE TABLE lineitem_tgt_agg
> select alias.a0 as a0,
>  alias.a2 as a1,
>  alias.a1 as a2,
>  alias.a3 as a3,
>  alias.a4 as a4
> from (
>  select alias.a0 as a0,
>   SUM(alias.a1) as a1,
>   SUM(alias.a2) as a2,
>   SUM(alias.a3) as a3,
>   SUM(alias.a4) as a4
>  from (
>   select lineitem_sf500.l_orderkey as a0,
>CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * (1 - 
> lineitem_sf500.l_discount) * (1 + lineitem_sf500.l_tax) as double) as a1,
>lineitem_sf500.l_quantity as a2,
>CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * 
> lineitem_sf500.l_discount as double) as a3,
>CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * 
> lineitem_sf500.l_tax as double) as a4
>   from lineitem_sf500
>   ) alias
>  group by alias.a0
>  ) alias;
> {code}
> The above query was run with ~376GB of data / ~3billion records in the source.
> It takes ~10 minutes with hive.map.aggr=false.
> With map side aggregation set to true, the map tasks don't complete even 
> after an hour.





[jira] [Resolved] (HIVE-9598) java.lang.IllegalMonitorStateException/java.util.concurrent.locks.ReentrantLock$Sync.tryRelease if ResultSet.closed called after Statement.close called

2015-02-11 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis resolved HIVE-9598.
-
   Resolution: Duplicate
Fix Version/s: 0.14.0

> java.lang.IllegalMonitorStateException/java.util.concurrent.locks.ReentrantLock$Sync.tryRelease
>  if ResultSet.closed called after Statement.close called
> ---
>
> Key: HIVE-9598
> URL: https://issues.apache.org/jira/browse/HIVE-9598
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 0.13.0
>Reporter: N Campbell
> Fix For: 0.14.0
>
>
> http://docs.oracle.com/javase/7/docs/api/java/sql/ResultSet.html#close()
> http://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#close()
>   Statement stmt;
>   try {
>   stmt = dbConnection.createStatement();
>   stmt.executeQuery("select* from t");
>   ResultSet rs = stmt.getResultSet();
>   stmt.close();
>   if (rs != null) {
>   System.out.println("IS NOT NULL");
> // Hive does not implement isClosed()
> //if (!rs.isClosed()) {
> //System.out.println("IS NOT CLOSED");
> //}
>   rs.close();
>   }
>   } catch (SQLException e) {
>   // TODO Auto-generated catch block
>   e.printStackTrace();
>   }
> Exception in thread "main" java.lang.IllegalMonitorStateException
>   at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:166)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1271)
>   at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:471)
>   at org.apache.hive.jdbc.HiveStatement.closeClientOperation(HiveStatement.java:175)
>   at org.apache.hive.jdbc.HiveQueryResultSet.close(HiveQueryResultSet.java:293)
> /D:/JDBC/Hortonworks_Hive13/commons-configuration-1.6.jar
> /D:/JDBC/Hortonworks_Hive13/commons-logging-1.1.3.jar
> /D:/JDBC/Hortonworks_Hive13/hadoop-common-2.4.0.2.1.1.0-385.jar
> /D:/JDBC/Hortonworks_Hive13/hive-exec-0.13.0.2.1.1.0-385.jar
> /D:/JDBC/Hortonworks_Hive13/hive-jdbc-0.13.0.2.1.1.0-385.jar
> /D:/JDBC/Hortonworks_Hive13/hive-service-0.13.0.2.1.1.0-385.jar
> /D:/JDBC/Hortonworks_Hive13/httpclient-4.2.5.jar
> /D:/JDBC/Hortonworks_Hive13/httpcore-4.2.5.jar
> /D:/JDBC/Hortonworks_Hive13/libfb303-0.9.0.jar
> /D:/JDBC/Hortonworks_Hive13/libthrift-0.9.0.jar
> /D:/JDBC/Hortonworks_Hive13/log4j-1.2.16.jar
> /D:/JDBC/Hortonworks_Hive13/slf4j-api-1.7.5.jar
> /D:/JDBC/Hortonworks_Hive13/slf4j-log4j12-1.7.5.jar
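
The trace above ends in ReentrantLock.unlock(), which throws IllegalMonitorStateException when the calling thread does not hold the lock. The following is a minimal sketch of that JDK behaviour only; SketchStatement and its methods are hypothetical names and do not reproduce HiveStatement's actual code.

```java
import java.util.concurrent.locks.ReentrantLock;

class CloseOrderSketch {
    // Hypothetical stand-in for a statement that guards its close path with a lock.
    static class SketchStatement {
        private final ReentrantLock lock = new ReentrantLock();
        private boolean closed = false;

        // Unsafe variant: releases a lock that this thread never acquired.
        void unsafeRelease() {
            lock.unlock(); // throws IllegalMonitorStateException when not held
        }

        // Safer variant: lock and unlock are paired, so repeated close() calls are harmless.
        void close() {
            lock.lock();
            try {
                closed = true;
            } finally {
                lock.unlock();
            }
        }

        boolean isClosed() {
            return closed;
        }
    }

    // Returns true if the unpaired unlock raised IllegalMonitorStateException.
    static boolean unsafeReleaseThrows() {
        try {
            new SketchStatement().unsafeRelease();
            return false;
        } catch (IllegalMonitorStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("unpaired unlock throws: " + unsafeReleaseThrows());
        SketchStatement s = new SketchStatement();
        s.close();
        s.close(); // second close is a no-op apart from re-setting the flag
        System.out.println("closed: " + s.isClosed());
    }
}
```

Pairing every unlock() with a lock() in the same method, as in close() above, is one common way to make close idempotent; whether the fix referenced by this issue takes that approach is not stated here.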





[jira] [Commented] (HIVE-9632) inconsistent results between year(), month(), day(), and the actual values in formulas

2015-02-11 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14317746#comment-14317746
 ] 

Navis commented on HIVE-9632:
-

Looks like HIVE-9278. Could you check this in hive-1.0?

> inconsistent results between year(), month(), day(), and the actual values in 
> formulas
> --
>
> Key: HIVE-9632
> URL: https://issues.apache.org/jira/browse/HIVE-9632
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.14.0
> Environment: CentOS 6.5, HDP 2.2
>Reporter: Robert Miller
>
> In wanting to create a date dimension value which would match our existing 
> database environment, I figured I would be able to do as I have done in the 
> past and use the following formula:
> (year(date)*1)+(month(date)*100)+day(date)
> Given the date of 2015-01-09, the above formula should result in a value of 
> 20150109.  Instead, the resulting value is 20353515.
> SELECT
>   > adjusted_activity_date_utc,
>   > year(adjusted_activity_date_utc),
>   > month(adjusted_activity_date_utc),
>   > day(adjusted_activity_date_utc),
>   > 
> (year(adjusted_activity_date_utc)*1)+(month(adjusted_activity_date_utc)*100)+day(adjusted_activity_date_utc),
>   > (year(adjusted_activity_date_utc)*1),
>   > (month(adjusted_activity_date_utc)*100),
>   > day(adjusted_activity_date_utc)
>   > from event_histories limit 5;
> OK
> adjusted_activity_date_utc  _c1   _c2  _c3  _c4       _c5   _c6  _c7
> 2015-01-09                  2015  1    9    20353515  2015  100  9
> 2015-01-09                  2015  1    9    20353515  2015  100  9
> 2015-01-09                  2015  1    9    20353515  2015  100  9
> 2015-01-09                  2015  1    9    20353515  2015  100  9
> 2015-01-09                  2015  1    9    20353515  2015  100  9
> Oddly enough, this works as expected when a specific date value is used for 
> the column.
> I have tried this with partition and non-partition columns and found the 
> result to be the same.
> SELECT
>   > adjusted_activity_date_utc,
>   > year(adjusted_activity_date_utc),
>   > month(adjusted_activity_date_utc),
>   > day(adjusted_activity_date_utc),
>   > 
> (year(adjusted_activity_date_utc)*1)+(month(adjusted_activity_date_utc)*100)+day(adjusted_activity_date_utc),
>   > (year(adjusted_activity_date_utc)*1),
>   > (month(adjusted_activity_date_utc)*100),
>   > day(adjusted_activity_date_utc)
>   > from event_histories
>   > where adjusted_activity_date_utc = '2015-01-09'
>   > limit 5;
> OK
> adjusted_activity_date_utc  _c1   _c2  _c3  _c4       _c5   _c6  _c7
> 2015-01-09                  2015  1    9    20150109  2015  100  9
> 2015-01-09                  2015  1    9    20150109  2015  100  9
> 2015-01-09                  2015  1    9    20150109  2015  100  9
> 2015-01-09                  2015  1    9    20150109  2015  100  9
> 2015-01-09                  2015  1    9    20150109  2015  100  9
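
For reference, the arithmetic the reporter expects can be checked outside Hive. This is a rough sketch under one assumption: the year multiplier is taken to be 10000, which is what the stated expected value 20150109 implies, although the formula as archived above reads `*1`. The class and method names are illustrative only.

```java
import java.time.LocalDate;

class DateKeySketch {
    // Builds an integer date key of the form yyyyMMdd.
    // Assumption: year is scaled by 10000 and month by 100.
    static int dateKey(LocalDate d) {
        return d.getYear() * 10000 + d.getMonthValue() * 100 + d.getDayOfMonth();
    }

    public static void main(String[] args) {
        System.out.println(dateKey(LocalDate.of(2015, 1, 9)));
    }
}
```

For 2015-01-09 this gives 20150000 + 100 + 9 = 20150109, matching the second result set above rather than the 20353515 reported in the first.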





[jira] [Updated] (HIVE-9138) Add some explain to PTF operator

2015-02-11 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9138:

Attachment: HIVE-9138.4.patch.txt

Missed one file

> Add some explain to PTF operator
> 
>
> Key: HIVE-9138
> URL: https://issues.apache.org/jira/browse/HIVE-9138
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-9138.1.patch.txt, HIVE-9138.2.patch.txt, 
> HIVE-9138.3.patch.txt, HIVE-9138.4.patch.txt
>
>
> PTFOperator does not explain anything in explain statement, making it hard to 
> understand the internal works. 





[jira] [Updated] (HIVE-9138) Add some explain to PTF operator

2015-02-11 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9138:

Attachment: (was: HIVE-9138.4.patch.txt)

> Add some explain to PTF operator
> 
>
> Key: HIVE-9138
> URL: https://issues.apache.org/jira/browse/HIVE-9138
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-9138.1.patch.txt, HIVE-9138.2.patch.txt, 
> HIVE-9138.3.patch.txt
>
>
> PTFOperator does not explain anything in explain statement, making it hard to 
> understand the internal works. 





[jira] [Commented] (HIVE-9138) Add some explain to PTF operator

2015-02-11 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14317645#comment-14317645
 ] 

Navis commented on HIVE-9138:
-

I wish HIVE-6470 would be applied to trunk some day. I hate bad indentation.

> Add some explain to PTF operator
> 
>
> Key: HIVE-9138
> URL: https://issues.apache.org/jira/browse/HIVE-9138
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-9138.1.patch.txt, HIVE-9138.2.patch.txt, 
> HIVE-9138.3.patch.txt, HIVE-9138.4.patch.txt
>
>
> PTFOperator does not explain anything in explain statement, making it hard to 
> understand the internal works. 





[jira] [Updated] (HIVE-9138) Add some explain to PTF operator

2015-02-11 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9138:

Attachment: HIVE-9138.4.patch.txt

Addressed comments

> Add some explain to PTF operator
> 
>
> Key: HIVE-9138
> URL: https://issues.apache.org/jira/browse/HIVE-9138
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-9138.1.patch.txt, HIVE-9138.2.patch.txt, 
> HIVE-9138.3.patch.txt, HIVE-9138.4.patch.txt
>
>
> PTFOperator does not explain anything in explain statement, making it hard to 
> understand the internal works. 





[jira] [Updated] (HIVE-9618) Deduplicate RS keys for ptf/windowing

2015-02-11 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9618:

Attachment: HIVE-9618.3.patch.txt

> Deduplicate RS keys for ptf/windowing
> -
>
> Key: HIVE-9618
> URL: https://issues.apache.org/jira/browse/HIVE-9618
> Project: Hive
>  Issue Type: Improvement
>  Components: PTF-Windowing
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-9618.1.patch.txt, HIVE-9618.2.patch.txt, 
> HIVE-9618.3.patch.txt
>
>
> Currently, partition spec containing same column for partition-by and 
> order-by makes duplicated key column for RS. For example, 
> {noformat}
> explain
> select p_mfgr, p_name, p_size, 
> rank() over (partition by p_mfgr order by p_name) as r, 
> dense_rank() over (partition by p_mfgr order by p_name) as dr, 
> sum(p_retailprice) over (partition by p_mfgr order by p_name rows between 
> unbounded preceding and current row)  as s1
> from noop(on noopwithmap(on noop(on part 
> partition by p_mfgr 
> order by p_mfgr, p_name
> )))
> {noformat}
> "partition by p_mfgr order by p_mfgr, p_name" makes duplicated key columns 
> like below
> {noformat}
> Reduce Output Operator
> key expressions: p_mfgr (type: string), p_mfgr (type: string), p_name 
> (type: string)
> sort order: +++
> Map-reduce partition columns: p_mfgr (type: string)
> value expressions: p_size (type: int), p_retailprice (type: double)
> {noformat}





[jira] [Updated] (HIVE-9618) Deduplicate RS keys for ptf/windowing

2015-02-11 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9618:

Status: Patch Available  (was: Open)

Rebased to trunk

> Deduplicate RS keys for ptf/windowing
> -
>
> Key: HIVE-9618
> URL: https://issues.apache.org/jira/browse/HIVE-9618
> Project: Hive
>  Issue Type: Improvement
>  Components: PTF-Windowing
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-9618.1.patch.txt, HIVE-9618.2.patch.txt, 
> HIVE-9618.3.patch.txt
>
>
> Currently, partition spec containing same column for partition-by and 
> order-by makes duplicated key column for RS. For example, 
> {noformat}
> explain
> select p_mfgr, p_name, p_size, 
> rank() over (partition by p_mfgr order by p_name) as r, 
> dense_rank() over (partition by p_mfgr order by p_name) as dr, 
> sum(p_retailprice) over (partition by p_mfgr order by p_name rows between 
> unbounded preceding and current row)  as s1
> from noop(on noopwithmap(on noop(on part 
> partition by p_mfgr 
> order by p_mfgr, p_name
> )))
> {noformat}
> "partition by p_mfgr order by p_mfgr, p_name" makes duplicated key columns 
> like below
> {noformat}
> Reduce Output Operator
> key expressions: p_mfgr (type: string), p_mfgr (type: string), p_name 
> (type: string)
> sort order: +++
> Map-reduce partition columns: p_mfgr (type: string)
> value expressions: p_size (type: int), p_retailprice (type: double)
> {noformat}





[jira] [Commented] (HIVE-9138) Add some explain to PTF operator

2015-02-11 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14317444#comment-14317444
 ] 

Navis commented on HIVE-9138:
-

Explainable was introduced to avoid implementing Serializable just for the explain 
result. I can remove it, but then PTFInputDef, etc. would have to be Serializable. 
The changes in ColumnPruner are basically for setting the output shape of the partition 
function for explain. They are transient fields used only when building the PTF for the 
first time, so they seemed safe to change.

> Add some explain to PTF operator
> 
>
> Key: HIVE-9138
> URL: https://issues.apache.org/jira/browse/HIVE-9138
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-9138.1.patch.txt, HIVE-9138.2.patch.txt, 
> HIVE-9138.3.patch.txt
>
>
> PTFOperator does not explain anything in explain statement, making it hard to 
> understand the internal works. 





[jira] [Updated] (HIVE-2573) Create per-session function registry

2015-02-11 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-2573:

Attachment: HIVE-2573.15.patch.txt

Addressed comments (except one). I cannot reproduce the failures in 
TestMacroSemanticAnalyzer.

> Create per-session function registry 
> -
>
> Key: HIVE-2573
> URL: https://issues.apache.org/jira/browse/HIVE-2573
> Project: Hive
>  Issue Type: Improvement
>  Components: Server Infrastructure
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2573.D3231.1.patch, 
> HIVE-2573.1.patch.txt, HIVE-2573.10.patch.txt, HIVE-2573.11.patch.txt, 
> HIVE-2573.12.patch.txt, HIVE-2573.13.patch.txt, HIVE-2573.14.patch.txt, 
> HIVE-2573.15.patch.txt, HIVE-2573.2.patch.txt, HIVE-2573.3.patch.txt, 
> HIVE-2573.4.patch.txt, HIVE-2573.5.patch, HIVE-2573.6.patch, 
> HIVE-2573.7.patch, HIVE-2573.8.patch.txt, HIVE-2573.9.patch.txt
>
>
> Currently the function registry is a shared resource and could be overridden by 
> other users when using HiveServer. If a per-session function registry is 
> provided, this situation could be prevented.
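
The idea in the description can be illustrated with a small lookup that checks a session-local map before a shared one. This is only a sketch under assumed names (SessionRegistrySketch, register, lookup, and the "builtin.Upper" entry are all hypothetical); it is not the design used in the attached patches.

```java
import java.util.HashMap;
import java.util.Map;

class SessionRegistrySketch {
    // Shared, system-wide functions (hypothetical stand-in for built-ins).
    private static final Map<String, String> SYSTEM = new HashMap<>();
    static {
        SYSTEM.put("upper", "builtin.Upper");
    }

    // Functions registered by this session only; other sessions never see them.
    private final Map<String, String> session = new HashMap<>();

    void register(String name, String impl) {
        session.put(name.toLowerCase(), impl);
    }

    // Session entries shadow system entries; the shared map is never modified.
    String lookup(String name) {
        String key = name.toLowerCase();
        String impl = session.get(key);
        return impl != null ? impl : SYSTEM.get(key);
    }

    public static void main(String[] args) {
        SessionRegistrySketch a = new SessionRegistrySketch();
        SessionRegistrySketch b = new SessionRegistrySketch();
        a.register("upper", "user.MyUpper");
        System.out.println(a.lookup("upper")); // session A sees its own override
        System.out.println(b.lookup("upper")); // session B still sees the shared entry
    }
}
```

Because overrides live in the per-session map, one user's registration cannot replace a function for another user, which is the situation the issue aims to prevent.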





[jira] [Commented] (HIVE-9507) Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls

2015-02-09 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313743#comment-14313743
 ] 

Navis commented on HIVE-9507:
-

Yes, just fixed NPE. 

> Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls
> 
>
> Key: HIVE-9507
> URL: https://issues.apache.org/jira/browse/HIVE-9507
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor, UDF
>Affects Versions: 0.14.0
> Environment: hdp 2.2
> Windows server 2012 R2 64-bit
>Reporter: Moustafa Aboul Atta
>Assignee: Navis
>Priority: Minor
> Fix For: 1.2.0
>
> Attachments: HIVE-9507.1.patch.txt, HIVE-9507.2.patch.txt, 
> HIVE-9507.3.patch.txt, parial_log.log
>
>
> I have tweets stored with avro on hdfs with the default twitter status 
> (tweet) schema.
> There's an object called "entities" that contains arrays of structs.
> When I run
>  
> {{SELECT mytable.*}}
> {{FROM tweets}}
> {{LATERAL VIEW INLINE(entities.media) mytable}}
> I get the exception attached as partial_log.log, however, if I add
> {{WHERE entities.media IS NOT NULL}}
> it runs perfectly.
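
A null-tolerant inline can be pictured as treating a null array as zero rows instead of dereferencing it. The sketch below is an assumption about the intended behaviour, with hypothetical names; it does not reproduce the Hive UDTF or the attached patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class NullTolerantInlineSketch {
    // Expands an array of structs into rows; a null array yields no rows
    // rather than a NullPointerException.
    static <T> List<T> inlineRows(List<T> structs) {
        if (structs == null) {
            return Collections.emptyList();
        }
        return new ArrayList<>(structs);
    }

    public static void main(String[] args) {
        System.out.println(inlineRows(null).size());
        List<String> media = new ArrayList<>();
        media.add("photo");
        System.out.println(inlineRows(media));
    }
}
```

Under that assumption, the query would behave as if the `WHERE entities.media IS NOT NULL` filter had been applied.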





[jira] [Updated] (HIVE-9507) Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls

2015-02-09 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9507:

   Resolution: Fixed
Fix Version/s: 1.2.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Ashutosh.

> Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls
> 
>
> Key: HIVE-9507
> URL: https://issues.apache.org/jira/browse/HIVE-9507
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor, UDF
>Affects Versions: 0.14.0
> Environment: hdp 2.2
> Windows server 2012 R2 64-bit
>Reporter: Moustafa Aboul Atta
>Assignee: Navis
>Priority: Minor
> Fix For: 1.2.0
>
> Attachments: HIVE-9507.1.patch.txt, HIVE-9507.2.patch.txt, 
> HIVE-9507.3.patch.txt, parial_log.log
>
>
> I have tweets stored with avro on hdfs with the default twitter status 
> (tweet) schema.
> There's an object called "entities" that contains arrays of structs.
> When I run
>  
> {{SELECT mytable.*}}
> {{FROM tweets}}
> {{LATERAL VIEW INLINE(entities.media) mytable}}
> I get the exception attached as partial_log.log, however, if I add
> {{WHERE entities.media IS NOT NULL}}
> it runs perfectly.





[jira] [Updated] (HIVE-9618) Deduplicate RS keys for ptf/windowing

2015-02-09 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9618:

Attachment: HIVE-9618.2.patch.txt

Addressed comment & updated gold file

> Deduplicate RS keys for ptf/windowing
> -
>
> Key: HIVE-9618
> URL: https://issues.apache.org/jira/browse/HIVE-9618
> Project: Hive
>  Issue Type: Improvement
>  Components: PTF-Windowing
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-9618.1.patch.txt, HIVE-9618.2.patch.txt
>
>
> Currently, partition spec containing same column for partition-by and 
> order-by makes duplicated key column for RS. For example, 
> {noformat}
> explain
> select p_mfgr, p_name, p_size, 
> rank() over (partition by p_mfgr order by p_name) as r, 
> dense_rank() over (partition by p_mfgr order by p_name) as dr, 
> sum(p_retailprice) over (partition by p_mfgr order by p_name rows between 
> unbounded preceding and current row)  as s1
> from noop(on noopwithmap(on noop(on part 
> partition by p_mfgr 
> order by p_mfgr, p_name
> )))
> {noformat}
> "partition by p_mfgr order by p_mfgr, p_name" makes duplicated key columns 
> like below
> {noformat}
> Reduce Output Operator
> key expressions: p_mfgr (type: string), p_mfgr (type: string), p_name 
> (type: string)
> sort order: +++
> Map-reduce partition columns: p_mfgr (type: string)
> value expressions: p_size (type: int), p_retailprice (type: double)
> {noformat}





[jira] [Commented] (HIVE-9486) Use session classloader instead of application loader

2015-02-09 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313515#comment-14313515
 ] 

Navis commented on HIVE-9486:
-

[~szehon] I considered using 'Utilities.getSessionSpecifiedClassLoader', but 
it seemed better to have one in the common module using JavaUtils.getClassLoader(), 
which is safe to call without hive-exec or other modules. We modify 
SessionState.HiveConf.ClassLoader and the thread context loader together (at 
least in Hive), so the result would be the same. Do you have a better idea?

> Use session classloader instead of application loader
> -
>
> Key: HIVE-9486
> URL: https://issues.apache.org/jira/browse/HIVE-9486
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-9486.1.patch.txt, HIVE-9486.2.patch.txt
>
>
> From http://www.mail-archive.com/dev@hive.apache.org/msg107615.html
> Looks reasonable
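
The pattern discussed in the comment, preferring the thread context loader and falling back to the loader of the calling class, can be sketched with plain JDK calls. The class and method names below are hypothetical; this is not the body of JavaUtils.getClassLoader().

```java
class ClassLoaderSketch {
    // Prefer the thread context class loader (which a session may have replaced),
    // falling back to the loader that loaded this class.
    static ClassLoader resolveLoader() {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        return loader != null ? loader : ClassLoaderSketch.class.getClassLoader();
    }

    // Loads a class by name through the resolved loader.
    static Class<?> load(String className) throws ClassNotFoundException {
        return Class.forName(className, true, resolveLoader());
    }

    public static void main(String[] args) throws ClassNotFoundException {
        System.out.println(load("java.util.ArrayList").getName());
    }
}
```

Classes added to a session's loader at runtime would be visible through this path as long as the session installs its loader as the thread context loader, which is the assumption the comment relies on.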





[jira] [Updated] (HIVE-9507) Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls

2015-02-09 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9507:

Attachment: HIVE-9507.3.patch.txt

> Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls
> 
>
> Key: HIVE-9507
> URL: https://issues.apache.org/jira/browse/HIVE-9507
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor, UDF
>Affects Versions: 0.14.0
> Environment: hdp 2.2
> Windows server 2012 R2 64-bit
>Reporter: Moustafa Aboul Atta
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-9507.1.patch.txt, HIVE-9507.2.patch.txt, 
> HIVE-9507.3.patch.txt, parial_log.log
>
>
> I have tweets stored with avro on hdfs with the default twitter status 
> (tweet) schema.
> There's an object called "entities" that contains arrays of structs.
> When I run
>  
> {{SELECT mytable.*}}
> {{FROM tweets}}
> {{LATERAL VIEW INLINE(entities.media) mytable}}
> I get the exception attached as partial_log.log, however, if I add
> {{WHERE entities.media IS NOT NULL}}
> it runs perfectly.





[jira] [Updated] (HIVE-9618) Deduplicate RS keys for ptf/windowing

2015-02-09 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9618:

Attachment: HIVE-9618.1.patch.txt

> Deduplicate RS keys for ptf/windowing
> -
>
> Key: HIVE-9618
> URL: https://issues.apache.org/jira/browse/HIVE-9618
> Project: Hive
>  Issue Type: Improvement
>  Components: PTF-Windowing
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-9618.1.patch.txt
>
>
> Currently, partition spec containing same column for partition-by and 
> order-by makes duplicated key column for RS. For example, 
> {noformat}
> explain
> select p_mfgr, p_name, p_size, 
> rank() over (partition by p_mfgr order by p_name) as r, 
> dense_rank() over (partition by p_mfgr order by p_name) as dr, 
> sum(p_retailprice) over (partition by p_mfgr order by p_name rows between 
> unbounded preceding and current row)  as s1
> from noop(on noopwithmap(on noop(on part 
> partition by p_mfgr 
> order by p_mfgr, p_name
> )))
> {noformat}
> "partition by p_mfgr order by p_mfgr, p_name" makes duplicated key columns 
> like below
> {noformat}
> Reduce Output Operator
> key expressions: p_mfgr (type: string), p_mfgr (type: string), p_name 
> (type: string)
> sort order: +++
> Map-reduce partition columns: p_mfgr (type: string)
> value expressions: p_size (type: int), p_retailprice (type: double)
> {noformat}





[jira] [Updated] (HIVE-9618) Deduplicate RS keys for ptf/windowing

2015-02-09 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9618:

Status: Patch Available  (was: Open)

> Deduplicate RS keys for ptf/windowing
> -
>
> Key: HIVE-9618
> URL: https://issues.apache.org/jira/browse/HIVE-9618
> Project: Hive
>  Issue Type: Improvement
>  Components: PTF-Windowing
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-9618.1.patch.txt
>
>
> Currently, partition spec containing same column for partition-by and 
> order-by makes duplicated key column for RS. For example, 
> {noformat}
> explain
> select p_mfgr, p_name, p_size, 
> rank() over (partition by p_mfgr order by p_name) as r, 
> dense_rank() over (partition by p_mfgr order by p_name) as dr, 
> sum(p_retailprice) over (partition by p_mfgr order by p_name rows between 
> unbounded preceding and current row)  as s1
> from noop(on noopwithmap(on noop(on part 
> partition by p_mfgr 
> order by p_mfgr, p_name
> )))
> {noformat}
> "partition by p_mfgr order by p_mfgr, p_name" makes duplicated key columns 
> like below
> {noformat}
> Reduce Output Operator
> key expressions: p_mfgr (type: string), p_mfgr (type: string), p_name 
> (type: string)
> sort order: +++
> Map-reduce partition columns: p_mfgr (type: string)
> value expressions: p_size (type: int), p_retailprice (type: double)
> {noformat}





[jira] [Created] (HIVE-9618) Deduplicate RS keys for ptf/windowing

2015-02-08 Thread Navis (JIRA)
Navis created HIVE-9618:
---

 Summary: Deduplicate RS keys for ptf/windowing
 Key: HIVE-9618
 URL: https://issues.apache.org/jira/browse/HIVE-9618
 Project: Hive
  Issue Type: Improvement
  Components: PTF-Windowing
Reporter: Navis
Assignee: Navis
Priority: Trivial


Currently, partition spec containing same column for partition-by and order-by 
makes duplicated key column for RS. For example, 
{noformat}
explain
select p_mfgr, p_name, p_size, 
rank() over (partition by p_mfgr order by p_name) as r, 
dense_rank() over (partition by p_mfgr order by p_name) as dr, 
sum(p_retailprice) over (partition by p_mfgr order by p_name rows between 
unbounded preceding and current row)  as s1
from noop(on noopwithmap(on noop(on part 
partition by p_mfgr 
order by p_mfgr, p_name
)))
{noformat}

"partition by p_mfgr order by p_mfgr, p_name" makes duplicated key columns like 
below
{noformat}
Reduce Output Operator
key expressions: p_mfgr (type: string), p_mfgr (type: string), p_name 
(type: string)
sort order: +++
Map-reduce partition columns: p_mfgr (type: string)
value expressions: p_size (type: int), p_retailprice (type: double)
{noformat}
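
The deduplication described above amounts to merging the partition-by and order-by columns while keeping the first occurrence of each. A minimal sketch, assuming columns are compared by name only and ignoring sort direction; the names here are hypothetical and this is not the patch's implementation:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class RsKeyDedupSketch {
    // Merges partition-by and order-by columns into one key list,
    // preserving order and dropping repeated columns.
    static List<String> dedupKeys(List<String> partitionBy, List<String> orderBy) {
        LinkedHashSet<String> keys = new LinkedHashSet<>(partitionBy);
        keys.addAll(orderBy);
        return new ArrayList<>(keys);
    }

    public static void main(String[] args) {
        List<String> keys = dedupKeys(
                Arrays.asList("p_mfgr"),
                Arrays.asList("p_mfgr", "p_name"));
        System.out.println(keys);
    }
}
```

For the example query this leaves two key expressions (p_mfgr, p_name) instead of three, so the sort order string would shrink accordingly.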





[jira] [Updated] (HIVE-9228) Problem with subquery using windowing functions

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9228:

Attachment: HIVE-9228.3.patch.txt

Updated gold file

> Problem with subquery using windowing functions
> ---
>
> Key: HIVE-9228
> URL: https://issues.apache.org/jira/browse/HIVE-9228
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Affects Versions: 0.13.1
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-9228.1.patch.txt, HIVE-9228.2.patch.txt, 
> HIVE-9228.3.patch.txt, create_table_tab1.sql, tab1.csv
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> The following query with window functions failed. The internal query works 
> fine.
> select col1, col2, col3 from (select col1,col2, col3, count(case when col4=1 
> then 1 end ) over (partition by col1, col2) as col5, row_number() over 
> (partition by col1, col2 order by col4) as col6 from tab1) t;
> HIVE generates an execution plan with 2 jobs. 
> 1. The first job is to basically calculate window function for col5.  
> 2. The second job is to calculate window function for col6 and output.
> The plan says the first job outputs the columns (col1, col2, col3, col4) to a 
> tmp file since only these columns are used in later stage. While, the PTF 
> operator for the first job outputs (_wcol0, col1, col2, col3, col4) with 
> _wcol0 as the result of the window function even it's not used. 
> In the second job, the map operator still reads the 4 columns (col1, col2, 
> col3, col4) from the temp file using the plan. That causes the exception.





[jira] [Commented] (HIVE-9228) Problem with subquery using windowing functions

2015-02-08 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311850#comment-14311850
 ] 

Navis commented on HIVE-9228:
-

[~aihuaxu] Sorry for breaking in on this issue. I've been working on the code 
around CP for other issues and didn't want others to waste time understanding the 
complicated PTF operation. I think the fix is almost done. Sorry again.

> Problem with subquery using windowing functions
> ---
>
> Key: HIVE-9228
> URL: https://issues.apache.org/jira/browse/HIVE-9228
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Affects Versions: 0.13.1
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-9228.1.patch.txt, HIVE-9228.2.patch.txt, 
> create_table_tab1.sql, tab1.csv
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> The following query with window functions failed. The internal query works 
> fine.
> select col1, col2, col3 from (select col1,col2, col3, count(case when col4=1 
> then 1 end ) over (partition by col1, col2) as col5, row_number() over 
> (partition by col1, col2 order by col4) as col6 from tab1) t;
> HIVE generates an execution plan with 2 jobs. 
> 1. The first job is to basically calculate window function for col5.  
> 2. The second job is to calculate window function for col6 and output.
> The plan says the first job outputs the columns (col1, col2, col3, col4) to a 
> tmp file since only these columns are used in later stage. While, the PTF 
> operator for the first job outputs (_wcol0, col1, col2, col3, col4) with 
> _wcol0 as the result of the window function even it's not used. 
> In the second job, the map operator still reads the 4 columns (col1, col2, 
> col3, col4) from the temp file using the plan. That causes the exception.





[jira] [Updated] (HIVE-9615) Provide limit context for storage handlers

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9615:

Description: Propagate limit context generated from GlobalLimitOptimizer to 
storage handlers.  (was: Propagate limit context generated from 
GlobalLimitOptimizer to strorage handlers.)

> Provide limit context for storage handlers
> --
>
> Key: HIVE-9615
> URL: https://issues.apache.org/jira/browse/HIVE-9615
> Project: Hive
>  Issue Type: Improvement
>  Components: StorageHandler
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-9615.1.patch.txt
>
>
> Propagate limit context generated from GlobalLimitOptimizer to storage 
> handlers.





[jira] [Updated] (HIVE-9615) Provide limit context for storage handlers

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9615:

Attachment: HIVE-9615.1.patch.txt

Old patch found from git stash

> Provide limit context for storage handlers
> --
>
> Key: HIVE-9615
> URL: https://issues.apache.org/jira/browse/HIVE-9615
> Project: Hive
>  Issue Type: Improvement
>  Components: StorageHandler
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-9615.1.patch.txt
>
>
> Propagate limit context generated from GlobalLimitOptimizer to storage 
> handlers.





[jira] [Updated] (HIVE-9615) Provide limit context for storage handlers

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9615:

Status: Patch Available  (was: Open)

> Provide limit context for storage handlers
> --
>
> Key: HIVE-9615
> URL: https://issues.apache.org/jira/browse/HIVE-9615
> Project: Hive
>  Issue Type: Improvement
>  Components: StorageHandler
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-9615.1.patch.txt
>
>
> Propagate limit context generated from GlobalLimitOptimizer to storage 
> handlers.





[jira] [Created] (HIVE-9615) Provide limit context for storage handlers

2015-02-08 Thread Navis (JIRA)
Navis created HIVE-9615:
---

 Summary: Provide limit context for storage handlers
 Key: HIVE-9615
 URL: https://issues.apache.org/jira/browse/HIVE-9615
 Project: Hive
  Issue Type: Improvement
  Components: StorageHandler
Reporter: Navis
Assignee: Navis
Priority: Trivial


Propagate limit context generated from GlobalLimitOptimizer to storage 
handlers.





[jira] [Updated] (HIVE-3050) JDBC should provide metadata for columns whether a column is a partition column or not

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3050:

Attachment: HIVE-3050.1.patch.txt

> JDBC should provide metadata for columns whether a column is a partition 
> column or not
> --
>
> Key: HIVE-3050
> URL: https://issues.apache.org/jira/browse/HIVE-3050
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Affects Versions: 0.10.0
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-3050.1.patch.txt
>
>
> Trivial request from UI developers. 
> {code}
> DatabaseMetaData databaseMetaData = connection.getMetaData();
> ResultSet rs = databaseMetaData.getColumns(null, null, "tableName", null);
> while (rs.next()) {  // the cursor must be advanced before a row can be read
>   boolean partitionKey = rs.getBoolean("IS_PARTITION_COLUMN");
> }
> {code}
> It's not a standard JDBC column, but it seems useful.





[jira] [Updated] (HIVE-9499) hive.limit.query.max.table.partition makes queries fail on non-partitioned tables

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9499:

Attachment: HIVE-9499.3.patch.txt

Rebased to trunk

> hive.limit.query.max.table.partition makes queries fail on non-partitioned 
> tables
> -
>
> Key: HIVE-9499
> URL: https://issues.apache.org/jira/browse/HIVE-9499
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Alexander Kasper
>Assignee: Navis
> Attachments: HIVE-9499.1.patch.txt, HIVE-9499.2.patch.txt, 
> HIVE-9499.3.patch.txt
>
>
> If you use hive.limit.query.max.table.partition to limit the number of 
> partitions that can be queried, queries on non-partitioned tables fail.
> Example:
> {noformat}
> CREATE TABLE tmp(test INT);
> SELECT COUNT(*) FROM TMP; -- works fine
> SET hive.limit.query.max.table.partition=20;
> SELECT COUNT(*) FROM TMP; -- generates NPE (FAILED: NullPointerException null)
> SET hive.limit.query.max.table.partition=-1;
> SELECT COUNT(*) FROM TMP; -- works fine again
> {noformat}





[jira] [Updated] (HIVE-9507) Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9507:

Attachment: HIVE-9507.2.patch.txt

Reattaching for test

> Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls
> 
>
> Key: HIVE-9507
> URL: https://issues.apache.org/jira/browse/HIVE-9507
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor, UDF
>Affects Versions: 0.14.0
> Environment: hdp 2.2
> Windows server 2012 R2 64-bit
>Reporter: Moustafa Aboul Atta
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-9507.1.patch.txt, HIVE-9507.2.patch.txt, 
> parial_log.log
>
>
> I have tweets stored with avro on hdfs with the default twitter status 
> (tweet) schema.
> There's an object called "entities" that contains arrays of structs.
> When I run
>  
> {{SELECT mytable.*}}
> {{FROM tweets}}
> {{LATERAL VIEW INLINE(entities.media) mytable}}
> I get the exception attached as partial_log.log, however, if I add
> {{WHERE entities.media IS NOT NULL}}
> it runs perfectly.
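
The requested behaviour can be sketched in plain Java, assuming that a NULL
array should simply produce no rows, which is what the WHERE workaround
achieves. The class NullTolerantInline and its row model (Object[] per
struct) are hypothetical illustrations, not Hive's UDTF API, and skipping
null elements is an additional assumption.

```java
import java.util.ArrayList;
import java.util.List;

final class NullTolerantInline {
    /**
     * Expands an array of structs into one output row per struct.
     * A null array yields zero rows instead of failing.
     */
    static List<Object[]> inline(List<Object[]> structs) {
        List<Object[]> rows = new ArrayList<>();
        if (structs == null) {
            return rows; // same effect as filtering with IS NOT NULL
        }
        for (Object[] struct : structs) {
            if (struct != null) { // assumption: null elements are skipped too
                rows.add(struct);
            }
        }
        return rows;
    }
}
```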





[jira] [Updated] (HIVE-9513) NULL POINTER EXCEPTION

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9513:

Attachment: HIVE-9513.2.patch.txt

> NULL POINTER EXCEPTION
> --
>
> Key: HIVE-9513
> URL: https://issues.apache.org/jira/browse/HIVE-9513
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.13.1
>Reporter: ErwanMAS
>Assignee: Navis
> Attachments: HIVE-9513.1.patch.txt, HIVE-9513.2.patch.txt
>
>
> NPE during parsing of:
> {noformat}
> select * from (
>  select * from ( select 1 as id , "foo" as str_1 from staging.dual ) f
>   union   all
>  select * from ( select 2 as id , "bar" as str_2 from staging.dual ) g
> ) e ;
> {noformat}





[jira] [Updated] (HIVE-2573) Create per-session function registry

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-2573:

Attachment: (was: HIVE-2573.14.patch.txt)

> Create per-session function registry 
> -
>
> Key: HIVE-2573
> URL: https://issues.apache.org/jira/browse/HIVE-2573
> Project: Hive
>  Issue Type: Improvement
>  Components: Server Infrastructure
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2573.D3231.1.patch, 
> HIVE-2573.1.patch.txt, HIVE-2573.10.patch.txt, HIVE-2573.11.patch.txt, 
> HIVE-2573.12.patch.txt, HIVE-2573.13.patch.txt, HIVE-2573.14.patch.txt, 
> HIVE-2573.2.patch.txt, HIVE-2573.3.patch.txt, HIVE-2573.4.patch.txt, 
> HIVE-2573.5.patch, HIVE-2573.6.patch, HIVE-2573.7.patch, 
> HIVE-2573.8.patch.txt, HIVE-2573.9.patch.txt
>
>
> Currently the function registry is a shared resource and can be overridden by 
> other users when using HiveServer. A per-session function registry would 
> prevent this.
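
One way to picture the proposal is a two-level lookup: each session keeps
its own registrations and falls back to a shared system registry. This is a
sketch under assumptions, not the patch itself: the class
SessionFunctionRegistry and its methods are hypothetical, function names are
assumed to be case-insensitive, and functions are represented by class-name
strings for brevity.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class SessionFunctionRegistry {
    // Shared by all sessions; holds built-in functions only.
    private static final Map<String, String> SYSTEM = new ConcurrentHashMap<>();
    // Private to one session; other sessions cannot overwrite these entries.
    private final Map<String, String> session = new ConcurrentHashMap<>();

    static void registerSystem(String name, String className) {
        SYSTEM.put(name.toLowerCase(Locale.ROOT), className);
    }

    void register(String name, String className) {
        session.put(name.toLowerCase(Locale.ROOT), className);
    }

    /** Session entries win; otherwise fall back to the system registry. */
    String lookup(String name) {
        String key = name.toLowerCase(Locale.ROOT);
        String own = session.get(key);
        return own != null ? own : SYSTEM.get(key);
    }
}
```

With this layout, two users registering a function under the same name no
longer affect each other, while built-ins stay visible to everyone.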





[jira] [Updated] (HIVE-2573) Create per-session function registry

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-2573:

Attachment: HIVE-2573.14.patch.txt

> Create per-session function registry 
> -
>
> Key: HIVE-2573
> URL: https://issues.apache.org/jira/browse/HIVE-2573
> Project: Hive
>  Issue Type: Improvement
>  Components: Server Infrastructure
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2573.D3231.1.patch, 
> HIVE-2573.1.patch.txt, HIVE-2573.10.patch.txt, HIVE-2573.11.patch.txt, 
> HIVE-2573.12.patch.txt, HIVE-2573.13.patch.txt, HIVE-2573.14.patch.txt, 
> HIVE-2573.2.patch.txt, HIVE-2573.3.patch.txt, HIVE-2573.4.patch.txt, 
> HIVE-2573.5.patch, HIVE-2573.6.patch, HIVE-2573.7.patch, 
> HIVE-2573.8.patch.txt, HIVE-2573.9.patch.txt
>
>
> Currently the function registry is a shared resource and can be overridden by 
> other users when using HiveServer. A per-session function registry would 
> prevent this.





[jira] [Updated] (HIVE-2573) Create per-session function registry

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-2573:

Attachment: HIVE-2573.14.patch.txt

Forgot this for a long time. Rebased to trunk.

> Create per-session function registry 
> -
>
> Key: HIVE-2573
> URL: https://issues.apache.org/jira/browse/HIVE-2573
> Project: Hive
>  Issue Type: Improvement
>  Components: Server Infrastructure
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2573.D3231.1.patch, 
> HIVE-2573.1.patch.txt, HIVE-2573.10.patch.txt, HIVE-2573.11.patch.txt, 
> HIVE-2573.12.patch.txt, HIVE-2573.13.patch.txt, HIVE-2573.14.patch.txt, 
> HIVE-2573.2.patch.txt, HIVE-2573.3.patch.txt, HIVE-2573.4.patch.txt, 
> HIVE-2573.5.patch, HIVE-2573.6.patch, HIVE-2573.7.patch, 
> HIVE-2573.8.patch.txt, HIVE-2573.9.patch.txt
>
>
> Currently the function registry is a shared resource and can be overridden by 
> other users when using HiveServer. A per-session function registry would 
> prevent this.





[jira] [Commented] (HIVE-6099) Multi insert does not work properly with distinct count

2015-02-05 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308401#comment-14308401
 ] 

Navis commented on HIVE-6099:
-

+1

> Multi insert does not work properly with distinct count
> ---
>
> Key: HIVE-6099
> URL: https://issues.apache.org/jira/browse/HIVE-6099
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0, 0.14.0, 1.0.0
>Reporter: Pavan Gadam Manohar
>Assignee: Ashutosh Chauhan
>  Labels: count, distinct, insert, multi-insert
> Attachments: HIVE-6099.1.patch, HIVE-6099.2.patch, HIVE-6099.3.patch, 
> HIVE-6099.4.patch, HIVE-6099.patch, explain_hive_0.10.0.txt, 
> with_disabled.txt, with_enabled.txt
>
>
> Need 2 rows to reproduce this Bug. Here are the steps.
> Step 1) Create a table Table_A
> CREATE EXTERNAL TABLE Table_A
> (
> user string
> , type int
> )
> PARTITIONED BY (dt string)
> ROW FORMAT DELIMITED 
> FIELDS TERMINATED BY '|' 
>  STORED AS RCFILE
> LOCATION '/hive//Table_A';
> Step 2) Scenario: Suppose user tommy belongs to both user types 111 and 
> 123. Insert 2 records into the table created above.
> select * from  Table_A;
> hive>  select * from table_a;
> OK
> tommy   123 2013-12-02
> tommy   111 2013-12-02
> Step 3) Create 2 destination tables to simulate multi-insert.
> CREATE EXTERNAL TABLE dest_Table_A
> (
> p_date string
> , Distinct_Users int
> , Type111Users int
> , Type123Users int
> )
> PARTITIONED BY (dt string)
> ROW FORMAT DELIMITED 
> FIELDS TERMINATED BY '|' 
>  STORED AS RCFILE
> LOCATION '/hive//dest_Table_A';
>  
> CREATE EXTERNAL TABLE dest_Table_B
> (
> p_date string
> , Distinct_Users int
> , Type111Users int
> , Type123Users int
> )
> PARTITIONED BY (dt string)
> ROW FORMAT DELIMITED 
> FIELDS TERMINATED BY '|' 
>  STORED AS RCFILE
> LOCATION '/hive//dest_Table_B';
> Step 4) Multi insert statement
> from Table_A a
> INSERT OVERWRITE TABLE dest_Table_A PARTITION(dt='2013-12-02')
> select a.dt
> ,count(distinct a.user) as AllDist
> ,count(distinct case when a.type = 111 then a.user else null end) as 
> Type111User
> ,count(distinct case when a.type != 111 then a.user else null end) as 
> Type123User
> group by a.dt
>  
> INSERT OVERWRITE TABLE dest_Table_B PARTITION(dt='2013-12-02')
> select a.dt
> ,count(distinct a.user) as AllDist
> ,count(distinct case when a.type = 111 then a.user else null end) as 
> Type111User
> ,count(distinct case when a.type != 111 then a.user else null end) as 
> Type123User
> group by a.dt
> ;
>  
> Step 5) Verify results.
> hive>  select * from dest_table_a;
> OK
> 2013-12-02  2   1   1   2013-12-02
> Time taken: 0.116 seconds
> hive>  select * from dest_table_b;
> OK
> 2013-12-02  2   1   1   2013-12-02
> Time taken: 0.13 seconds
> Conclusion: Hive gives a count of 2 for distinct users although there is 
> only one distinct user. After trying many datasets, I observed that Hive 
> computes DistinctUsers as Type111Users + Type123Users, which is wrong.
> hive> select count(distinct a.user) from table_a a;
> Gives:
> Total MapReduce CPU Time Spent: 4 seconds 350 msec
> OK
> 1
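
As a cross-check of the expected result, the three aggregates from Step 4
can be computed by hand in plain Java. This sketch only models the SQL
semantics (COUNT(DISTINCT ...) ignores NULLs, so the CASE expressions count
users per branch); the class DistinctCountCheck and its Row type are
hypothetical helpers, not Hive code.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class DistinctCountCheck {
    static final class Row {
        final String user;
        final int type;

        Row(String user, int type) {
            this.user = user;
            this.type = type;
        }
    }

    /** Returns {AllDist, Type111User, Type123User} for one dt group. */
    static int[] expectedCounts(List<Row> rows) {
        Set<String> all = new HashSet<>();
        Set<String> type111 = new HashSet<>();
        Set<String> notType111 = new HashSet<>();
        for (Row r : rows) {
            all.add(r.user);
            if (r.type == 111) {
                type111.add(r.user);    // case when a.type = 111
            } else {
                notType111.add(r.user); // case when a.type != 111
            }
        }
        return new int[] {all.size(), type111.size(), notType111.size()};
    }
}
```

For the two rows from Step 2 this yields 1, 1, 1, whereas the multi-insert
in the report returned 2, 1, 1.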





[jira] [Commented] (HIVE-9545) Build FAILURE with IBM JVM

2015-02-04 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14306641#comment-14306641
 ] 

Navis commented on HIVE-9545:
-

[~ashutoshc] Could you review this? It is a simple change from direct method 
invocation to reflection.
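
The general technique looks roughly like the sketch below. It is an
illustration under assumptions, not the attached patch: the class name
ReflectiveHeapDump is hypothetical, and the choice of
com.sun.management.HotSpotDiagnosticMXBean as the vendor-specific type is an
assumption based on the failing import. Because the vendor class is named
only in a string, the file compiles on JVMs that lack the package.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.PlatformManagedObject;
import java.lang.reflect.Method;

final class ReflectiveHeapDump {
    // Referenced by name only, so there is no compile-time dependency.
    private static final String HOTSPOT_BEAN =
        "com.sun.management.HotSpotDiagnosticMXBean";

    /** Returns true if the vendor-specific bean class can be loaded. */
    static boolean isSupported() {
        try {
            Class.forName(HOTSPOT_BEAN);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Tries to dump the heap; returns false instead of failing the caller. */
    static boolean dumpHeap(String fileName, boolean live) {
        try {
            Class<?> beanClass = Class.forName(HOTSPOT_BEAN);
            Object bean = ManagementFactory.getPlatformMXBean(
                beanClass.asSubclass(PlatformManagedObject.class));
            Method dump = beanClass.getMethod("dumpHeap", String.class, boolean.class);
            dump.invoke(bean, fileName, live);
            return true;
        } catch (Exception | LinkageError e) {
            // Missing class, missing bean, or a failed dump all end up here.
            return false;
        }
    }
}
```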

> Build FAILURE with IBM JVM 
> ---
>
> Key: HIVE-9545
> URL: https://issues.apache.org/jira/browse/HIVE-9545
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
> Environment:  mvn -version
> Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; 
> 2014-08-11T22:58:10+02:00)
> Maven home: /opt/apache-maven-3.2.3
> Java version: 1.7.0, vendor: IBM Corporation
> Java home: /usr/lib/jvm/ibm-java-x86_64-71/jre
> Default locale: en_US, platform encoding: ISO-8859-1
> OS name: "linux", version: "3.10.0-123.4.4.el7.x86_64", arch: "amd64", 
> family: "unix"
>Reporter: pascal oliva
>Assignee: Navis
> Attachments: HIVE-9545.1.patch.txt
>
>
>  NO PRECOMMIT TESTS 
> With the use of IBM JVM environment :
> [root@dorado-vm2 hive]# java -version
> java version "1.7.0"
> Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
> IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
> 20141017_217728 (JIT enabled, AOT enabled).
> The build failed on
>  [INFO] Hive Query Language  FAILURE [ 50.053 
> s]
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
> on project hive-exec: Compilation failure: Compilation failure:
> [ERROR] 
> /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
>  package com.sun.management does not exist.
> HOWTO : 
> #git clone -b branch-0.14 https://github.com/apache/hive.git
> #cd hive
> #mvn  install -DskipTests -Phadoop-2





[jira] [Commented] (HIVE-6099) Multi insert does not work properly with distinct count

2015-02-03 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14304657#comment-14304657
 ] 

Navis commented on HIVE-6099:
-

It was introduced to generate distinct keys for this optimization and does not 
seem to be used by other code. The optimization seems to work with a single 
common distinct column, but I think its overhead outweighs the benefit (and it 
is hard to read). But let's see the test results.

> Multi insert does not work properly with distinct count
> ---
>
> Key: HIVE-6099
> URL: https://issues.apache.org/jira/browse/HIVE-6099
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0, 0.14.0, 1.0.0
>Reporter: Pavan Gadam Manohar
>Assignee: Ashutosh Chauhan
>  Labels: count, distinct, insert, multi-insert
> Attachments: HIVE-6099.1.patch, HIVE-6099.2.patch, HIVE-6099.patch, 
> explain_hive_0.10.0.txt, with_disabled.txt, with_enabled.txt
>
>
> Need 2 rows to reproduce this Bug. Here are the steps.
> Step 1) Create a table Table_A
> CREATE EXTERNAL TABLE Table_A
> (
> user string
> , type int
> )
> PARTITIONED BY (dt string)
> ROW FORMAT DELIMITED 
> FIELDS TERMINATED BY '|' 
>  STORED AS RCFILE
> LOCATION '/hive//Table_A';
> Step 2) Scenario: Suppose user tommy belongs to both user types 111 and 
> 123. Insert 2 records into the table created above.
> select * from  Table_A;
> hive>  select * from table_a;
> OK
> tommy   123 2013-12-02
> tommy   111 2013-12-02
> Step 3) Create 2 destination tables to simulate multi-insert.
> CREATE EXTERNAL TABLE dest_Table_A
> (
> p_date string
> , Distinct_Users int
> , Type111Users int
> , Type123Users int
> )
> PARTITIONED BY (dt string)
> ROW FORMAT DELIMITED 
> FIELDS TERMINATED BY '|' 
>  STORED AS RCFILE
> LOCATION '/hive//dest_Table_A';
>  
> CREATE EXTERNAL TABLE dest_Table_B
> (
> p_date string
> , Distinct_Users int
> , Type111Users int
> , Type123Users int
> )
> PARTITIONED BY (dt string)
> ROW FORMAT DELIMITED 
> FIELDS TERMINATED BY '|' 
>  STORED AS RCFILE
> LOCATION '/hive//dest_Table_B';
> Step 4) Multi insert statement
> from Table_A a
> INSERT OVERWRITE TABLE dest_Table_A PARTITION(dt='2013-12-02')
> select a.dt
> ,count(distinct a.user) as AllDist
> ,count(distinct case when a.type = 111 then a.user else null end) as 
> Type111User
> ,count(distinct case when a.type != 111 then a.user else null end) as 
> Type123User
> group by a.dt
>  
> INSERT OVERWRITE TABLE dest_Table_B PARTITION(dt='2013-12-02')
> select a.dt
> ,count(distinct a.user) as AllDist
> ,count(distinct case when a.type = 111 then a.user else null end) as 
> Type111User
> ,count(distinct case when a.type != 111 then a.user else null end) as 
> Type123User
> group by a.dt
> ;
>  
> Step 5) Verify results.
> hive>  select * from dest_table_a;
> OK
> 2013-12-02  2   1   1   2013-12-02
> Time taken: 0.116 seconds
> hive>  select * from dest_table_b;
> OK
> 2013-12-02  2   1   1   2013-12-02
> Time taken: 0.13 seconds
> Conclusion: Hive gives a count of 2 for distinct users although there is 
> only one distinct user. After trying many datasets, I observed that Hive 
> computes DistinctUsers as Type111Users + Type123Users, which is wrong.
> hive> select count(distinct a.user) from table_a a;
> Gives:
> Total MapReduce CPU Time Spent: 4 seconds 350 msec
> OK
> 1





[jira] [Commented] (HIVE-6099) Multi insert does not work properly with distinct count

2015-02-03 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14304503#comment-14304503
 ] 

Navis commented on HIVE-6099:
-

[~ashutoshc] Good! I've left some comments in RB. I think we are purging the 
most complicated parts in GroupByOperator.

> Multi insert does not work properly with distinct count
> ---
>
> Key: HIVE-6099
> URL: https://issues.apache.org/jira/browse/HIVE-6099
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0, 0.14.0, 1.0.0
>Reporter: Pavan Gadam Manohar
>Assignee: Ashutosh Chauhan
>  Labels: count, distinct, insert, multi-insert
> Attachments: HIVE-6099.1.patch, HIVE-6099.patch, 
> explain_hive_0.10.0.txt, with_disabled.txt, with_enabled.txt
>
>
> Need 2 rows to reproduce this Bug. Here are the steps.
> Step 1) Create a table Table_A
> CREATE EXTERNAL TABLE Table_A
> (
> user string
> , type int
> )
> PARTITIONED BY (dt string)
> ROW FORMAT DELIMITED 
> FIELDS TERMINATED BY '|' 
>  STORED AS RCFILE
> LOCATION '/hive//Table_A';
> Step 2) Scenario: Suppose user tommy belongs to both user types 111 and 
> 123. Insert 2 records into the table created above.
> select * from  Table_A;
> hive>  select * from table_a;
> OK
> tommy   123 2013-12-02
> tommy   111 2013-12-02
> Step 3) Create 2 destination tables to simulate multi-insert.
> CREATE EXTERNAL TABLE dest_Table_A
> (
> p_date string
> , Distinct_Users int
> , Type111Users int
> , Type123Users int
> )
> PARTITIONED BY (dt string)
> ROW FORMAT DELIMITED 
> FIELDS TERMINATED BY '|' 
>  STORED AS RCFILE
> LOCATION '/hive//dest_Table_A';
>  
> CREATE EXTERNAL TABLE dest_Table_B
> (
> p_date string
> , Distinct_Users int
> , Type111Users int
> , Type123Users int
> )
> PARTITIONED BY (dt string)
> ROW FORMAT DELIMITED 
> FIELDS TERMINATED BY '|' 
>  STORED AS RCFILE
> LOCATION '/hive//dest_Table_B';
> Step 4) Multi insert statement
> from Table_A a
> INSERT OVERWRITE TABLE dest_Table_A PARTITION(dt='2013-12-02')
> select a.dt
> ,count(distinct a.user) as AllDist
> ,count(distinct case when a.type = 111 then a.user else null end) as 
> Type111User
> ,count(distinct case when a.type != 111 then a.user else null end) as 
> Type123User
> group by a.dt
>  
> INSERT OVERWRITE TABLE dest_Table_B PARTITION(dt='2013-12-02')
> select a.dt
> ,count(distinct a.user) as AllDist
> ,count(distinct case when a.type = 111 then a.user else null end) as 
> Type111User
> ,count(distinct case when a.type != 111 then a.user else null end) as 
> Type123User
> group by a.dt
> ;
>  
> Step 5) Verify results.
> hive>  select * from dest_table_a;
> OK
> 2013-12-02  2   1   1   2013-12-02
> Time taken: 0.116 seconds
> hive>  select * from dest_table_b;
> OK
> 2013-12-02  2   1   1   2013-12-02
> Time taken: 0.13 seconds
> Conclusion: Hive gives a count of 2 for distinct users although there is 
> only one distinct user. After trying many datasets, I observed that Hive 
> computes DistinctUsers as Type111Users + Type123Users, which is wrong.
> hive> select count(distinct a.user) from table_a a;
> Gives:
> Total MapReduce CPU Time Spent: 4 seconds 350 msec
> OK
> 1





[jira] [Updated] (HIVE-9397) SELECT max(bar) FROM foo is broken after ANALYZE ... FOR COLUMNS

2015-02-03 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9397:

Attachment: HIVE-9397.3.patch.txt

Updated the results and fixed one more issue (distinct_stats was falling back 
to the normal plan because of an exception while creating the struct OI).

> SELECT max(bar) FROM foo is broken after ANALYZE ... FOR COLUMNS
> 
>
> Key: HIVE-9397
> URL: https://issues.apache.org/jira/browse/HIVE-9397
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 0.14.0, 0.15.0
>Reporter: Damien Carol
>Assignee: Navis
> Attachments: HIVE-9397.1.patch.txt, HIVE-9397.2.patch.txt, 
> HIVE-9397.3.patch.txt
>
>
> These queries produce an error :
> {code:sql}
> DROP TABLE IF EXISTS foo;
> CREATE TABLE foo (id int) STORED AS ORC;
> INSERT INTO TABLE foo VALUES (1);
> INSERT INTO TABLE foo VALUES (2);
> INSERT INTO TABLE foo VALUES (3);
> INSERT INTO TABLE foo VALUES (4);
> INSERT INTO TABLE foo VALUES (5);
> SELECT max(id) FROM foo;
> ANALYZE TABLE foo COMPUTE STATISTICS FOR COLUMNS id;
> SELECT max(id) FROM foo;
> {code}
> The last query throws {{org.apache.hive.service.cli.HiveSQLException}}
> {noformat}
> 0: jdbc:hive2://nc-h04:1/casino> SELECT max(id) FROM foo;
> +-+--+
> | _c0 |
> +-+--+
> org.apache.hive.service.cli.HiveSQLException: java.lang.ClassCastException
> 0: jdbc:hive2://nc-h04:1/casino>
> {noformat}





[jira] [Commented] (HIVE-9566) HiveServer2 fails to start with NullPointerException

2015-02-03 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14304392#comment-14304392
 ] 

Navis commented on HIVE-9566:
-

+1

> HiveServer2 fails to start with NullPointerException
> 
>
> Key: HIVE-9566
> URL: https://issues.apache.org/jira/browse/HIVE-9566
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.13.0, 0.14.0, 0.13.1
>Reporter: Na Yang
>Assignee: Na Yang
> Attachments: HIVE-9566.patch
>
>
> hiveserver2 uses embedded metastore with default hive-site.xml configuration. 
> I use "hive --stop --service hiveserver2" command to stop the running 
> hiveserver2 process and then use "hive --start --service hiveserver2" command 
> to start the hiveserver2 service. I see the following exception in the 
> hive.log file
> {noformat}
> java.lang.NullPointerException
> at 
> org.apache.hive.service.server.HiveServer2.stop(HiveServer2.java:104)
> at 
> org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:138)
> at 
> org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:171)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212) 
> {noformat}
>  





[jira] [Updated] (HIVE-9553) Fix log-line in Partition Pruner

2015-02-03 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9553:

   Resolution: Fixed
Fix Version/s: 1.2.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Mithun Radhakrishnan.

> Fix log-line in Partition Pruner
> 
>
> Key: HIVE-9553
> URL: https://issues.apache.org/jira/browse/HIVE-9553
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 0.14.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
>Priority: Trivial
> Fix For: 1.2.0
>
> Attachments: HIVE-9553.1.patch
>
>
> Minor issue in logging the prune-expression in the PartitionPruner:
> {code:title=PartitionPruner.java|titleBGColor=#F7D6C1|bgColor=#CE}
> LOG.trace("prune Expression = " + prunerExpr == null ? "" : prunerExpr);
> {code}
> Given the operator precedence order, this should read:
> {code:title=PartitionPruner.java|titleBGColor=#F7D6C1|bgColor=#CE}
> LOG.trace("prune Expression = " + (prunerExpr == null ? "" : prunerExpr));
> {code}
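
The effect of the precedence rule can be reproduced in isolation. In Java,
binary + binds tighter than ==, which in turn binds tighter than ?:, so the
unparenthesized form compares the concatenated string with null. The class
PrunerLogDemo below is a hypothetical stand-in that uses Object in place of
the real expression type.

```java
final class PrunerLogDemo {
    /** Original form: parsed as (("..." + prunerExpr) == null) ? "" : prunerExpr. */
    static Object buggy(Object prunerExpr) {
        // The concatenation is never null, so this always yields prunerExpr
        // and the "prune Expression = " prefix is lost.
        return "prune Expression = " + prunerExpr == null ? "" : prunerExpr;
    }

    /** Corrected form: the null check applies to prunerExpr only. */
    static Object fixed(Object prunerExpr) {
        return "prune Expression = " + (prunerExpr == null ? "" : prunerExpr);
    }

    public static void main(String[] args) {
        System.out.println(buggy("ds = 1")); // ds = 1
        System.out.println(fixed("ds = 1")); // prune Expression = ds = 1
    }
}
```

With a null expression the original form passes null to the logger, while
the corrected form logs the prefix followed by an empty string.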





[jira] [Updated] (HIVE-9545) Build FAILURE with IBM JVM

2015-02-02 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9545:

Attachment: HIVE-9545.1.patch.txt

> Build FAILURE with IBM JVM 
> ---
>
> Key: HIVE-9545
> URL: https://issues.apache.org/jira/browse/HIVE-9545
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
> Environment:  mvn -version
> Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; 
> 2014-08-11T22:58:10+02:00)
> Maven home: /opt/apache-maven-3.2.3
> Java version: 1.7.0, vendor: IBM Corporation
> Java home: /usr/lib/jvm/ibm-java-x86_64-71/jre
> Default locale: en_US, platform encoding: ISO-8859-1
> OS name: "linux", version: "3.10.0-123.4.4.el7.x86_64", arch: "amd64", 
> family: "unix"
>Reporter: pascal oliva
>Assignee: Navis
> Attachments: HIVE-9545.1.patch.txt
>
>
> With the use of IBM JVM environment :
> [root@dorado-vm2 hive]# java -version
> java version "1.7.0"
> Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
> IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
> 20141017_217728 (JIT enabled, AOT enabled).
> The build failed on
>  [INFO] Hive Query Language  FAILURE [ 50.053 
> s]
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
> on project hive-exec: Compilation failure: Compilation failure:
> [ERROR] 
> /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
>  package com.sun.management does not exist.
> HOWTO : 
> #git clone -b branch-0.14 https://github.com/apache/hive.git
> #cd hive
> #mvn  install -DskipTests -Phadoop-2





[jira] [Updated] (HIVE-9545) Build FAILURE with IBM JVM

2015-02-02 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9545:

Description: 

With the use of IBM JVM environment :
[root@dorado-vm2 hive]# java -version
java version "1.7.0"
Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
20141017_217728 (JIT enabled, AOT enabled).

The build failed on
 [INFO] Hive Query Language  FAILURE [ 50.053 s]
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on 
project hive-exec: Compilation failure: Compilation failure:
[ERROR] 
/home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
 package com.sun.management does not exist.

HOWTO : 
#git clone -b branch-0.14 https://github.com/apache/hive.git
#cd hive
#mvn  install -DskipTests -Phadoop-2

  was:
With the use of IBM JVM environment :
[root@dorado-vm2 hive]# java -version
java version "1.7.0"
Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
20141017_217728 (JIT enabled, AOT enabled).

The build failed on
 [INFO] Hive Query Language  FAILURE [ 50.053 s]
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on 
project hive-exec: Compilation failure: Compilation failure:
[ERROR] 
/home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
 package com.sun.management does not exist.

HOWTO : 
#git clone -b branch-0.14 https://github.com/apache/hive.git
#cd hive
#mvn  install -DskipTests -Phadoop-2

> Build FAILURE with IBM JVM 
> ---
>
> Key: HIVE-9545
> URL: https://issues.apache.org/jira/browse/HIVE-9545
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
> Environment:  mvn -version
> Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; 
> 2014-08-11T22:58:10+02:00)
> Maven home: /opt/apache-maven-3.2.3
> Java version: 1.7.0, vendor: IBM Corporation
> Java home: /usr/lib/jvm/ibm-java-x86_64-71/jre
> Default locale: en_US, platform encoding: ISO-8859-1
> OS name: "linux", version: "3.10.0-123.4.4.el7.x86_64", arch: "amd64", 
> family: "unix"
>Reporter: pascal oliva
>Assignee: Navis
> Attachments: HIVE-9545.1.patch.txt
>
>
> With the use of IBM JVM environment :
> [root@dorado-vm2 hive]# java -version
> java version "1.7.0"
> Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
> IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
> 20141017_217728 (JIT enabled, AOT enabled).
> The build failed on
>  [INFO] Hive Query Language  FAILURE [ 50.053 
> s]
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
> on project hive-exec: Compilation failure: Compilation failure:
> [ERROR] 
> /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
>  package com.sun.management does not exist.
> HOWTO : 
> #git clone -b branch-0.14 https://github.com/apache/hive.git
> #cd hive
> #mvn  install -DskipTests -Phadoop-2





[jira] [Updated] (HIVE-9545) Build FAILURE with IBM JVM

2015-02-02 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9545:

Description: 
 NO PRECOMMIT TESTS 

With the use of IBM JVM environment :
[root@dorado-vm2 hive]# java -version
java version "1.7.0"
Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
20141017_217728 (JIT enabled, AOT enabled).

The build failed on
 [INFO] Hive Query Language  FAILURE [ 50.053 s]
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on 
project hive-exec: Compilation failure: Compilation failure:
[ERROR] 
/home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
 package com.sun.management does not exist.

HOWTO : 
#git clone -b branch-0.14 https://github.com/apache/hive.git
#cd hive
#mvn  install -DskipTests -Phadoop-2

  was:

With the use of IBM JVM environment :
[root@dorado-vm2 hive]# java -version
java version "1.7.0"
Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
20141017_217728 (JIT enabled, AOT enabled).

The build failed on
 [INFO] Hive Query Language  FAILURE [ 50.053 s]
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on 
project hive-exec: Compilation failure: Compilation failure:
[ERROR] 
/home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
 package com.sun.management does not exist.

HOWTO : 
#git clone -b branch-0.14 https://github.com/apache/hive.git
#cd hive
#mvn  install -DskipTests -Phadoop-2

> Build FAILURE with IBM JVM 
> ---
>
> Key: HIVE-9545
> URL: https://issues.apache.org/jira/browse/HIVE-9545
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
> Environment:  mvn -version
> Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; 
> 2014-08-11T22:58:10+02:00)
> Maven home: /opt/apache-maven-3.2.3
> Java version: 1.7.0, vendor: IBM Corporation
> Java home: /usr/lib/jvm/ibm-java-x86_64-71/jre
> Default locale: en_US, platform encoding: ISO-8859-1
> OS name: "linux", version: "3.10.0-123.4.4.el7.x86_64", arch: "amd64", 
> family: "unix"
>Reporter: pascal oliva
>Assignee: Navis
> Attachments: HIVE-9545.1.patch.txt
>
>
>  NO PRECOMMIT TESTS 
> With the use of IBM JVM environment :
> [root@dorado-vm2 hive]# java -version
> java version "1.7.0"
> Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
> IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
> 20141017_217728 (JIT enabled, AOT enabled).
> The build failed on
>  [INFO] Hive Query Language  FAILURE [ 50.053 
> s]
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
> on project hive-exec: Compilation failure: Compilation failure:
> [ERROR] 
> /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
>  package com.sun.management does not exist.
> HOWTO : 
> #git clone -b branch-0.14 https://github.com/apache/hive.git
> #cd hive
> #mvn  install -DskipTests -Phadoop-2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9545) Build FAILURE with IBM JVM

2015-02-02 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9545:

Attachment: (was: HIVE-9495.1.patch.txt)

> Build FAILURE with IBM JVM 
> ---
>
> Key: HIVE-9545
> URL: https://issues.apache.org/jira/browse/HIVE-9545
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
> Environment:  mvn -version
> Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; 
> 2014-08-11T22:58:10+02:00)
> Maven home: /opt/apache-maven-3.2.3
> Java version: 1.7.0, vendor: IBM Corporation
> Java home: /usr/lib/jvm/ibm-java-x86_64-71/jre
> Default locale: en_US, platform encoding: ISO-8859-1
> OS name: "linux", version: "3.10.0-123.4.4.el7.x86_64", arch: "amd64", 
> family: "unix"
>Reporter: pascal oliva
>Assignee: Navis
> Attachments: HIVE-9545.1.patch.txt
>
>
> With the use of IBM JVM environment :
> [root@dorado-vm2 hive]# java -version
> java version "1.7.0"
> Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
> IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
> 20141017_217728 (JIT enabled, AOT enabled).
> The build failed on
>  [INFO] Hive Query Language  FAILURE [ 50.053 
> s]
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
> on project hive-exec: Compilation failure: Compilation failure:
> [ERROR] 
> /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
>  package com.sun.management does not exist.
> HOWTO : 
> #git clone -b branch-0.14 https://github.com/apache/hive.git
> #cd hive
> #mvn  install -DskipTests -Phadoop-2





[jira] [Updated] (HIVE-9545) Build FAILURE with IBM JVM

2015-02-02 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9545:

Assignee: Navis
  Status: Patch Available  (was: Open)

> Build FAILURE with IBM JVM 
> ---
>
> Key: HIVE-9545
> URL: https://issues.apache.org/jira/browse/HIVE-9545
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
> Environment:  mvn -version
> Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; 
> 2014-08-11T22:58:10+02:00)
> Maven home: /opt/apache-maven-3.2.3
> Java version: 1.7.0, vendor: IBM Corporation
> Java home: /usr/lib/jvm/ibm-java-x86_64-71/jre
> Default locale: en_US, platform encoding: ISO-8859-1
> OS name: "linux", version: "3.10.0-123.4.4.el7.x86_64", arch: "amd64", 
> family: "unix"
>Reporter: pascal oliva
>Assignee: Navis
> Attachments: HIVE-9495.1.patch.txt
>
>
> With the use of IBM JVM environment :
> [root@dorado-vm2 hive]# java -version
> java version "1.7.0"
> Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
> IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
> 20141017_217728 (JIT enabled, AOT enabled).
> The build failed on
>  [INFO] Hive Query Language  FAILURE [ 50.053 
> s]
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
> on project hive-exec: Compilation failure: Compilation failure:
> [ERROR] 
> /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
>  package com.sun.management does not exist.
> HOWTO : 
> #git clone -b branch-0.14 https://github.com/apache/hive.git
> #cd hive
> #mvn  install -DskipTests -Phadoop-2





[jira] [Updated] (HIVE-9545) Build FAILURE with IBM JVM

2015-02-02 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9545:

Attachment: HIVE-9495.1.patch.txt

> Build FAILURE with IBM JVM 
> ---
>
> Key: HIVE-9545
> URL: https://issues.apache.org/jira/browse/HIVE-9545
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
> Environment:  mvn -version
> Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; 
> 2014-08-11T22:58:10+02:00)
> Maven home: /opt/apache-maven-3.2.3
> Java version: 1.7.0, vendor: IBM Corporation
> Java home: /usr/lib/jvm/ibm-java-x86_64-71/jre
> Default locale: en_US, platform encoding: ISO-8859-1
> OS name: "linux", version: "3.10.0-123.4.4.el7.x86_64", arch: "amd64", 
> family: "unix"
>Reporter: pascal oliva
> Attachments: HIVE-9495.1.patch.txt
>
>
> With the use of IBM JVM environment :
> [root@dorado-vm2 hive]# java -version
> java version "1.7.0"
> Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2))
> IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 
> 20141017_217728 (JIT enabled, AOT enabled).
> The build failed on
>  [INFO] Hive Query Language  FAILURE [ 50.053 
> s]
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
> on project hive-exec: Compilation failure: Compilation failure:
> [ERROR] 
> /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26]
>  package com.sun.management does not exist.
> HOWTO : 
> #git clone -b branch-0.14 https://github.com/apache/hive.git
> #cd hive
> #mvn  install -DskipTests -Phadoop-2





[jira] [Commented] (HIVE-9397) SELECT max(bar) FROM foo is broken after ANALYZE ... FOR COLUMNS

2015-02-02 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302848#comment-14302848
 ] 

Navis commented on HIVE-9397:
-

OIs are now acquired directly from the row schema of the final GBY operator. 
I've also fixed the double-to-float type casting, so that results are identical 
with and without the stats optimization. 
It would be possible to extend StatsOptimizer to accept queries like "select 
min(x)+max(x) from tbl", but that seems better done in a follow-up issue. 

> SELECT max(bar) FROM foo is broken after ANALYZE ... FOR COLUMNS
> 
>
> Key: HIVE-9397
> URL: https://issues.apache.org/jira/browse/HIVE-9397
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 0.14.0, 0.15.0
>Reporter: Damien Carol
>Assignee: Navis
> Attachments: HIVE-9397.1.patch.txt, HIVE-9397.2.patch.txt
>
>
> These queries produce an error :
> {code:sql}
> DROP TABLE IF EXISTS foo;
> CREATE TABLE foo (id int) STORED AS ORC;
> INSERT INTO TABLE foo VALUES (1);
> INSERT INTO TABLE foo VALUES (2);
> INSERT INTO TABLE foo VALUES (3);
> INSERT INTO TABLE foo VALUES (4);
> INSERT INTO TABLE foo VALUES (5);
> SELECT max(id) FROM foo;
> ANALYZE TABLE foo COMPUTE STATISTICS FOR COLUMNS id;
> SELECT max(id) FROM foo;
> {code}
> The last query throws {{org.apache.hive.service.cli.HiveSQLException}}
> {noformat}
> 0: jdbc:hive2://nc-h04:1/casino> SELECT max(id) FROM foo;
> +-+--+
> | _c0 |
> +-+--+
> org.apache.hive.service.cli.HiveSQLException: java.lang.ClassCastException
> 0: jdbc:hive2://nc-h04:1/casino>
> {noformat}





[jira] [Updated] (HIVE-9397) SELECT max(bar) FROM foo is broken after ANALYZE ... FOR COLUMNS

2015-02-02 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9397:

Attachment: HIVE-9397.2.patch.txt

Addressed comments & fixed double sub-type

> SELECT max(bar) FROM foo is broken after ANALYZE ... FOR COLUMNS
> 
>
> Key: HIVE-9397
> URL: https://issues.apache.org/jira/browse/HIVE-9397
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 0.14.0, 0.15.0
>Reporter: Damien Carol
>Assignee: Navis
> Attachments: HIVE-9397.1.patch.txt, HIVE-9397.2.patch.txt
>
>
> These queries produce an error :
> {code:sql}
> DROP TABLE IF EXISTS foo;
> CREATE TABLE foo (id int) STORED AS ORC;
> INSERT INTO TABLE foo VALUES (1);
> INSERT INTO TABLE foo VALUES (2);
> INSERT INTO TABLE foo VALUES (3);
> INSERT INTO TABLE foo VALUES (4);
> INSERT INTO TABLE foo VALUES (5);
> SELECT max(id) FROM foo;
> ANALYZE TABLE foo COMPUTE STATISTICS FOR COLUMNS id;
> SELECT max(id) FROM foo;
> {code}
> The last query throws {{org.apache.hive.service.cli.HiveSQLException}}
> {noformat}
> 0: jdbc:hive2://nc-h04:1/casino> SELECT max(id) FROM foo;
> +-+--+
> | _c0 |
> +-+--+
> org.apache.hive.service.cli.HiveSQLException: java.lang.ClassCastException
> 0: jdbc:hive2://nc-h04:1/casino>
> {noformat}





[jira] [Resolved] (HIVE-9528) SemanticException: Ambiguous column reference

2015-02-02 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis resolved HIVE-9528.
-
Resolution: Not a Problem

> SemanticException: Ambiguous column reference
> -
>
> Key: HIVE-9528
> URL: https://issues.apache.org/jira/browse/HIVE-9528
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Yongzhi Chen
>Assignee: Navis
>
> When running the following query:
> {code}
> SELECT if( COUNT(*) = 0, 'true', 'false' ) as RESULT FROM ( select  *  from 
> sim a join sim2 b on a.simstr=b.simstr) app
> Error: Error while compiling statement: FAILED: SemanticException [Error 
> 10007]: Ambiguous column reference simstr in app (state=42000,code=10007)
> {code}
> This query works fine in hive 0.10
> In the apache trunk, following workaround will work:
> {code}
> SELECT if(COUNT(*) = 0, 'true', 'false') as RESULT FROM (select a.* from sim 
> a join sim2 b on a.simstr=b.simstr) app;
> {code}





[jira] [Commented] (HIVE-9528) SemanticException: Ambiguous column reference

2015-02-02 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302768#comment-14302768
 ] 

Navis commented on HIVE-9528:
-

No, it's HIVE-7733. I've almost forgotten the context, but it was probably 
about enforcing unique column names in the final stage of a subquery, which was 
checked when generating the select operator before it.

> SemanticException: Ambiguous column reference
> -
>
> Key: HIVE-9528
> URL: https://issues.apache.org/jira/browse/HIVE-9528
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Yongzhi Chen
>Assignee: Navis
>
> When running the following query:
> {code}
> SELECT if( COUNT(*) = 0, 'true', 'false' ) as RESULT FROM ( select  *  from 
> sim a join sim2 b on a.simstr=b.simstr) app
> Error: Error while compiling statement: FAILED: SemanticException [Error 
> 10007]: Ambiguous column reference simstr in app (state=42000,code=10007)
> {code}
> This query works fine in hive 0.10
> In the apache trunk, following workaround will work:
> {code}
> SELECT if(COUNT(*) = 0, 'true', 'false') as RESULT FROM (select a.* from sim 
> a join sim2 b on a.simstr=b.simstr) app;
> {code}





[jira] [Commented] (HIVE-9553) Fix log-line in Partition Pruner

2015-02-02 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302756#comment-14302756
 ] 

Navis commented on HIVE-9553:
-

+1

> Fix log-line in Partition Pruner
> 
>
> Key: HIVE-9553
> URL: https://issues.apache.org/jira/browse/HIVE-9553
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 0.14.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
>Priority: Trivial
> Attachments: HIVE-9553.1.patch
>
>
> Minor issue in logging the prune-expression in the PartitionPruner:
> {code:title=PartitionPruner.java|titleBGColor=#F7D6C1|bgColor=#CE}
> LOG.trace("prune Expression = " + prunerExpr == null ? "" : prunerExpr);
> {code}
> Given the operator precedence order, this should read:
> {code:title=PartitionPruner.java|titleBGColor=#F7D6C1|bgColor=#CE}
> LOG.trace("prune Expression = " + (prunerExpr == null ? "" : prunerExpr));
> {code}
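
To make the precedence point in the description concrete, here is a small self-contained sketch. The class PruneLogDemo and its method names are made up for illustration; only the two expressions are taken from the issue text. Because '+' binds tighter than '==', and '==' tighter than '?:', the original line compares the concatenated string with null, which is never true, so the log prefix is dropped.

```java
// Hypothetical demo class; the two return expressions mirror the lines quoted in the issue.
class PruneLogDemo {
    // Parsed as (("prune Expression = " + prunerExpr) == null) ? "" : prunerExpr.
    // The concatenation is never null, so this always yields prunerExpr itself.
    static Object buggy(Object prunerExpr) {
        return "prune Expression = " + prunerExpr == null ? "" : prunerExpr;
    }

    // Parentheses make the null check apply to prunerExpr only.
    static Object fixed(Object prunerExpr) {
        return "prune Expression = " + (prunerExpr == null ? "" : prunerExpr);
    }

    public static void main(String[] args) {
        System.out.println(buggy("ds = 1"));  // prints "ds = 1" (prefix lost)
        System.out.println(fixed("ds = 1"));  // prints "prune Expression = ds = 1"
        System.out.println(buggy(null));      // prints "null"
        System.out.println(fixed(null));      // prints "prune Expression = "
    }
}
```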





[jira] [Commented] (HIVE-6099) Multi insert does not work properly with distinct count

2015-02-02 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302613#comment-14302613
 ] 

Navis commented on HIVE-6099:
-

[~ashutoshc] Could we remove this optimization? I'm sure it has not been valid 
from the start.

> Multi insert does not work properly with distinct count
> ---
>
> Key: HIVE-6099
> URL: https://issues.apache.org/jira/browse/HIVE-6099
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0
>Reporter: Pavan Gadam Manohar
>Assignee: Navis
>  Labels: count, distinct, insert, multi-insert
> Attachments: explain_hive_0.10.0.txt, with_disabled.txt, 
> with_enabled.txt
>
>
> Need 2 rows to reproduce this Bug. Here are the steps.
> Step 1) Create a table Table_A
> CREATE EXTERNAL TABLE Table_A
> (
> user string
> , type int
> )
> PARTITIONED BY (dt string)
> ROW FORMAT DELIMITED 
> FIELDS TERMINATED BY '|' 
>  STORED AS RCFILE
> LOCATION '/hive//Table_A';
> Step 2) Scenario: Suppose user tommy belongs to both user types 
> 111 and 123. Insert 2 records into the table created above.
> select * from  Table_A;
> hive>  select * from table_a;
> OK
> tommy   123 2013-12-02
> tommy   111 2013-12-02
> Step 3) Create 2 destination tables to simulate multi-insert.
> CREATE EXTERNAL TABLE dest_Table_A
> (
> p_date string
> , Distinct_Users int
> , Type111Users int
> , Type123Users int
> )
> PARTITIONED BY (dt string)
> ROW FORMAT DELIMITED 
> FIELDS TERMINATED BY '|' 
>  STORED AS RCFILE
> LOCATION '/hive//dest_Table_A';
>  
> CREATE EXTERNAL TABLE dest_Table_B
> (
> p_date string
> , Distinct_Users int
> , Type111Users int
> , Type123Users int
> )
> PARTITIONED BY (dt string)
> ROW FORMAT DELIMITED 
> FIELDS TERMINATED BY '|' 
>  STORED AS RCFILE
> LOCATION '/hive//dest_Table_B';
> Step 4) Multi insert statement
> from Table_A a
> INSERT OVERWRITE TABLE dest_Table_A PARTITION(dt='2013-12-02')
> select a.dt
> ,count(distinct a.user) as AllDist
> ,count(distinct case when a.type = 111 then a.user else null end) as 
> Type111User
> ,count(distinct case when a.type != 111 then a.user else null end) as 
> Type123User
> group by a.dt
>  
> INSERT OVERWRITE TABLE dest_Table_B PARTITION(dt='2013-12-02')
> select a.dt
> ,count(distinct a.user) as AllDist
> ,count(distinct case when a.type = 111 then a.user else null end) as 
> Type111User
> ,count(distinct case when a.type != 111 then a.user else null end) as 
> Type123User
> group by a.dt
> ;
>  
> Step 5) Verify results.
> hive>  select * from dest_table_a;
> OK
> 2013-12-02  2   1   1   2013-12-02
> Time taken: 0.116 seconds
> hive>  select * from dest_table_b;
> OK
> 2013-12-02  2   1   1   2013-12-02
> Time taken: 0.13 seconds
> Conclusion: Hive gives a count of 2 for distinct users although there is 
> only one distinct user. After trying many datasets, I observed that Hive is 
> computing Type111Users + Type123Users = DistinctUsers, which is wrong.
> hive> select count(distinct a.user) from table_a a;
> Gives:
> Total MapReduce CPU Time Spent: 4 seconds 350 msec
> OK
> 1
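
As a cross-check of the expected values in the report, the three distinct counts can be computed by hand over the two sample rows. This is a minimal sketch in plain Java, independent of Hive; the class and method names are hypothetical and the rows are the two from Step 2.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical reference computation for the multi-insert query in the report.
class DistinctCountCheck {
    /**
     * rows: each entry is {user, type}.
     * Returns {distinct users, distinct users with type 111, distinct users with any other type}.
     */
    static int[] expectedCounts(List<String[]> rows) {
        Set<String> all = new HashSet<>();
        Set<String> type111 = new HashSet<>();
        Set<String> other = new HashSet<>();
        for (String[] row : rows) {
            all.add(row[0]);
            if ("111".equals(row[1])) {
                type111.add(row[0]);
            } else {
                other.add(row[0]);
            }
        }
        return new int[] {all.size(), type111.size(), other.size()};
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] {"tommy", "123"},
            new String[] {"tommy", "111"});
        System.out.println(Arrays.toString(expectedCounts(rows)));  // prints [1, 1, 1]
    }
}
```

The first value is 1, not the 2 that appears in dest_table_a and dest_table_b, which matches the reporter's observation that the result looks like the sum of the two conditional counts.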





[jira] [Commented] (HIVE-9416) Get rid of Extract Operator

2015-02-01 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14300801#comment-14300801
 ] 

Navis commented on HIVE-9416:
-

+1

> Get rid of Extract Operator
> ---
>
> Key: HIVE-9416
> URL: https://issues.apache.org/jira/browse/HIVE-9416
> Project: Hive
>  Issue Type: Task
>  Components: Query Processor
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-9416.1.patch, HIVE-9416.2.patch, HIVE-9416.3.patch, 
> HIVE-9416.4.patch, HIVE-9416.5.patch, HIVE-9416.6.patch, HIVE-9416.7.patch, 
> HIVE-9416.patch
>
>
> {{Extract Operator}} has been there for legacy reasons. But there is no 
> functionality it provides which can't be provided by {{Select Operator}}. 
> Instead of having two operators, one being a subset of the other, we should just 
> get rid of {{Extract}} and simplify our codebase.





[jira] [Assigned] (HIVE-9528) SemanticException: Ambiguous column reference

2015-02-01 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis reassigned HIVE-9528:
---

Assignee: Navis

> SemanticException: Ambiguous column reference
> -
>
> Key: HIVE-9528
> URL: https://issues.apache.org/jira/browse/HIVE-9528
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Yongzhi Chen
>Assignee: Navis
>
> When running the following query:
> {code}
> SELECT if( COUNT(*) = 0, 'true', 'false' ) as RESULT FROM ( select  *  from 
> sim a join sim2 b on a.simstr=b.simstr) app
> Error: Error while compiling statement: FAILED: SemanticException [Error 
> 10007]: Ambiguous column reference simstr in app (state=42000,code=10007)
> {code}
> This query works fine in hive 0.10
> In the apache trunk, following workaround will work:
> {code}
> SELECT if(COUNT(*) = 0, 'true', 'false') as RESULT FROM (select a.* from sim 
> a join sim2 b on a.simstr=b.simstr) app;
> {code}





[jira] [Commented] (HIVE-9528) SemanticException: Ambiguous column reference

2015-02-01 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14300800#comment-14300800
 ] 

Navis commented on HIVE-9528:
-

[~ychena], before HIVE-7733, column information was overwritten by the last 
column with the same name, which could produce invalid results. Anyway, the 
query you mentioned does not work in mysql either (it works in psql, though). 
Can we resolve this as not a problem?
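
The difference between the two behaviours described in this comment can be sketched as follows. This is not Hive's code; the class AmbiguousColumnCheck and its methods are hypothetical, and the column list is just what "select *" over the join in the report would expose (simstr from a and simstr from b).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; each column is given as {tableAlias, columnName}.
class AmbiguousColumnCheck {
    /** Overwriting behaviour: a later column silently replaces an earlier one of the same name. */
    static Map<String, String> overwriteByName(List<String[]> columns) {
        Map<String, String> ownerByName = new LinkedHashMap<>();
        for (String[] c : columns) {
            ownerByName.put(c[1], c[0]);
        }
        return ownerByName;
    }

    /** Strict behaviour: refuse a subquery whose output has two columns with one name. */
    static void requireUnique(List<String[]> columns, String subqueryAlias) {
        Set<String> seen = new HashSet<>();
        for (String[] c : columns) {
            if (!seen.add(c[1])) {
                throw new IllegalStateException(
                    "Ambiguous column reference " + c[1] + " in " + subqueryAlias);
            }
        }
    }

    public static void main(String[] args) {
        List<String[]> columns = Arrays.asList(
            new String[] {"a", "simstr"},
            new String[] {"b", "simstr"});
        System.out.println(overwriteByName(columns));  // prints {simstr=b}
        requireUnique(columns, "app");                 // throws IllegalStateException
    }
}
```

Selecting a.* in the subquery, as in the workaround quoted below, leaves only one simstr column, so the strict check passes.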

> SemanticException: Ambiguous column reference
> -
>
> Key: HIVE-9528
> URL: https://issues.apache.org/jira/browse/HIVE-9528
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Yongzhi Chen
>
> When running the following query:
> {code}
> SELECT if( COUNT(*) = 0, 'true', 'false' ) as RESULT FROM ( select  *  from 
> sim a join sim2 b on a.simstr=b.simstr) app
> Error: Error while compiling statement: FAILED: SemanticException [Error 
> 10007]: Ambiguous column reference simstr in app (state=42000,code=10007)
> {code}
> This query works fine in hive 0.10
> In the apache trunk, following workaround will work:
> {code}
> SELECT if(COUNT(*) = 0, 'true', 'false') as RESULT FROM (select a.* from sim 
> a join sim2 b on a.simstr=b.simstr) app;
> {code}





[jira] [Updated] (HIVE-9499) hive.limit.query.max.table.partition makes queries fail on non-partitioned tables

2015-01-29 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9499:

Attachment: HIVE-9499.2.patch.txt

fixed trivial bug in TableScanStatsRule

> hive.limit.query.max.table.partition makes queries fail on non-partitioned 
> tables
> -
>
> Key: HIVE-9499
> URL: https://issues.apache.org/jira/browse/HIVE-9499
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Alexander Kasper
>Assignee: Navis
> Attachments: HIVE-9499.1.patch.txt, HIVE-9499.2.patch.txt
>
>
> If you use hive.limit.query.max.table.partition to limit the amount of 
> partitions that can be queried it makes queries on non-partitioned tables 
> fail.
> Example:
> {noformat}
> CREATE TABLE tmp(test INT);
> SELECT COUNT(*) FROM TMP; -- works fine
> SET hive.limit.query.max.table.partition=20;
> SELECT COUNT(*) FROM TMP; -- generates NPE (FAILED: NullPointerException null)
> SET hive.limit.query.max.table.partition=-1;
> SELECT COUNT(*) FROM TMP; -- works fine again
> {noformat}
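
The failure pattern in the example above (NPE only when the limit is set and the table has no partitions) is consistent with a limit check that dereferences a partition list that does not exist for a non-partitioned table. That cause is an assumption, not something stated in this thread. Under that assumption, a null-safe check could look like the sketch below; PartitionLimitCheck and its parameters are hypothetical names, not Hive's API.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of a partition-limit check that tolerates non-partitioned tables.
class PartitionLimitCheck {
    /**
     * prunedPartitions is assumed to be null for a non-partitioned table.
     * A negative limit (such as -1 in the example) disables the check.
     */
    static void enforce(String table, List<String> prunedPartitions, int limit) {
        if (limit < 0 || prunedPartitions == null) {
            return;  // nothing to enforce
        }
        if (prunedPartitions.size() > limit) {
            throw new IllegalStateException(
                "Query on " + table + " scans " + prunedPartitions.size()
                    + " partitions, limit is " + limit);
        }
    }

    public static void main(String[] args) {
        enforce("tmp", null, 20);                                  // non-partitioned: passes
        enforce("logs", Arrays.asList("dt=1", "dt=2"), -1);        // limit disabled: passes
        enforce("logs", Arrays.asList("dt=1", "dt=2", "dt=3"), 2); // throws IllegalStateException
    }
}
```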





[jira] [Updated] (HIVE-9499) hive.limit.query.max.table.partition makes queries fail on non-partitioned tables

2015-01-29 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9499:

Assignee: Navis
  Status: Patch Available  (was: Open)

> hive.limit.query.max.table.partition makes queries fail on non-partitioned 
> tables
> -
>
> Key: HIVE-9499
> URL: https://issues.apache.org/jira/browse/HIVE-9499
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Alexander Kasper
>Assignee: Navis
> Attachments: HIVE-9499.1.patch.txt
>
>
> If you use hive.limit.query.max.table.partition to limit the amount of 
> partitions that can be queried it makes queries on non-partitioned tables 
> fail.
> Example:
> {noformat}
> CREATE TABLE tmp(test INT);
> SELECT COUNT(*) FROM TMP; -- works fine
> SET hive.limit.query.max.table.partition=20;
> SELECT COUNT(*) FROM TMP; -- generates NPE (FAILED: NullPointerException null)
> SET hive.limit.query.max.table.partition=-1;
> SELECT COUNT(*) FROM TMP; -- works fine again
> {noformat}





[jira] [Updated] (HIVE-9499) hive.limit.query.max.table.partition makes queries fail on non-partitioned tables

2015-01-29 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9499:

Attachment: HIVE-9499.1.patch.txt

> hive.limit.query.max.table.partition makes queries fail on non-partitioned 
> tables
> -
>
> Key: HIVE-9499
> URL: https://issues.apache.org/jira/browse/HIVE-9499
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Alexander Kasper
> Attachments: HIVE-9499.1.patch.txt
>
>
> If you use hive.limit.query.max.table.partition to limit the amount of 
> partitions that can be queried it makes queries on non-partitioned tables 
> fail.
> Example:
> {noformat}
> CREATE TABLE tmp(test INT);
> SELECT COUNT(*) FROM TMP; -- works fine
> SET hive.limit.query.max.table.partition=20;
> SELECT COUNT(*) FROM TMP; -- generates NPE (FAILED: NullPointerException null)
> SET hive.limit.query.max.table.partition=-1;
> SELECT COUNT(*) FROM TMP; -- works fine again
> {noformat}





[jira] [Updated] (HIVE-9499) hive.limit.query.max.table.partition makes queries fail on non-partitioned tables

2015-01-29 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9499:

Description: 
If you use hive.limit.query.max.table.partition to limit the amount of 
partitions that can be queried it makes queries on non-partitioned tables fail.

Example:
{noformat}
CREATE TABLE tmp(test INT);
SELECT COUNT(*) FROM TMP; -- works fine
SET hive.limit.query.max.table.partition=20;
SELECT COUNT(*) FROM TMP; -- generates NPE (FAILED: NullPointerException null)
SET hive.limit.query.max.table.partition=-1;
SELECT COUNT(*) FROM TMP; -- works fine again
{noformat}

  was:
If you use hive.limit.query.max.table.partition to limit the amount of 
partitions that can be queried it makes queries on non-partitioned tables fail.

Example:
CREATE TABLE tmp(test INT);
SELECT COUNT(*) FROM TMP; -- works fine
SET hive.limit.query.max.table.partition=20;
SELECT COUNT(*) FROM TMP; -- generates NPE (FAILED: NullPointerException null)
SET hive.limit.query.max.table.partition=-1;
SELECT COUNT(*) FROM TMP; -- works fine again


> hive.limit.query.max.table.partition makes queries fail on non-partitioned 
> tables
> -
>
> Key: HIVE-9499
> URL: https://issues.apache.org/jira/browse/HIVE-9499
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Alexander Kasper
>
> If you use hive.limit.query.max.table.partition to limit the amount of 
> partitions that can be queried it makes queries on non-partitioned tables 
> fail.
> Example:
> {noformat}
> CREATE TABLE tmp(test INT);
> SELECT COUNT(*) FROM TMP; -- works fine
> SET hive.limit.query.max.table.partition=20;
> SELECT COUNT(*) FROM TMP; -- generates NPE (FAILED: NullPointerException null)
> SET hive.limit.query.max.table.partition=-1;
> SELECT COUNT(*) FROM TMP; -- works fine again
> {noformat}





[jira] [Updated] (HIVE-9513) NULL POINTER EXCEPTION

2015-01-29 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9513:

Attachment: HIVE-9513.1.patch.txt

> NULL POINTER EXCEPTION
> --
>
> Key: HIVE-9513
> URL: https://issues.apache.org/jira/browse/HIVE-9513
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.13.1
>Reporter: ErwanMAS
> Attachments: HIVE-9513.1.patch.txt
>
>
> NPE during parsing of:
> {noformat}
> select * from (
>  select * from ( select 1 as id , "foo" as str_1 from staging.dual ) f
>   union   all
>  select * from ( select 2 as id , "bar" as str_2 from staging.dual ) g
> ) e ;
> {noformat}





[jira] [Updated] (HIVE-9513) NULL POINTER EXCEPTION

2015-01-29 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9513:

Assignee: Navis
  Status: Patch Available  (was: Open)

> NULL POINTER EXCEPTION
> --
>
> Key: HIVE-9513
> URL: https://issues.apache.org/jira/browse/HIVE-9513
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.13.1
>Reporter: ErwanMAS
>Assignee: Navis
> Attachments: HIVE-9513.1.patch.txt
>
>
> NPE during parsing of:
> {noformat}
> select * from (
>  select * from ( select 1 as id , "foo" as str_1 from staging.dual ) f
>   union   all
>  select * from ( select 2 as id , "bar" as str_2 from staging.dual ) g
> ) e ;
> {noformat}





[jira] [Updated] (HIVE-9507) Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls

2015-01-29 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9507:

Assignee: Navis
  Status: Patch Available  (was: Open)

> Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls
> 
>
> Key: HIVE-9507
> URL: https://issues.apache.org/jira/browse/HIVE-9507
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor, UDF
>Affects Versions: 0.14.0
> Environment: hdp 2.2
> Windows server 2012 R2 64-bit
>Reporter: Moustafa Aboul Atta
>Assignee: Navis
> Attachments: HIVE-9507.1.patch.txt
>
>
> I have tweets stored with avro on hdfs with the default twitter status 
> (tweet) schema.
> There's an object called "entities" that contains arrays of structs.
> When I run
>  
> {{SELECT mytable.*}}
> {{FROM tweets}}
> {{LATERAL VIEW INLINE(entities.media) mytable}}
> I get the exception found hereunder, however if I add
> {{WHERE entities.media IS NOT NULL}}
> it runs perfectly.
> Here's the partial log:
> 2015-01-29 10:15:00,879 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Status: Running (Executing on YARN cluster with App id 
> application_1422267635031_0618)
> 2015-01-29 10:15:00,879 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: -/-
> 2015-01-29 10:15:02,526 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0/13   
> 2015-01-29 10:15:05,551 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0/13   
> 2015-01-29 10:15:08,722 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0/13   
> 2015-01-29 10:15:12,095 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0/13   
> 2015-01-29 10:15:12,354 INFO  log.PerfLogger 
> (PerfLogger.java:PerfLogBegin(108)) -  from=org.apache.hadoop.hive.ql.exec.tez.TezJobMonitor>
> 2015-01-29 10:15:12,354 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+5)/13   
> 2015-01-29 10:15:12,557 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6)/13   
> 2015-01-29 10:15:15,691 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6)/13   
> 2015-01-29 10:15:18,892 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-1)/13
> 2015-01-29 10:15:19,094 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-3)/13
> 2015-01-29 10:15:19,304 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-5)/13
> 2015-01-29 10:15:19,507 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-6)/13
> 2015-01-29 10:15:22,641 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-6)/13
> 2015-01-29 10:15:24,704 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-8)/13
> 2015-01-29 10:15:27,735 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-8)/13
> 2015-01-29 10:15:30,957 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-8)/13
> 2015-01-29 10:15:34,095 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-8)/13
> 2015-01-29 10:15:35,138 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-9)/13
> 2015-01-29 10:15:36,503 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-10)/13   
> 2015-01-29 10:15:36,710 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-11)/13   
> 2015-01-29 10:15:37,971 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-12)/13   
> 2015-01-29 10:15:39,800 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-13)/13   
> 2015-01-29 10:15:41,175 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-14)/13   
> 2015-01-29 10:15:44,414 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-14)/13   
> 2015-01-29 10:15:45,447 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-15)/13   
> 2015-01-29 10:15:47,413 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-16)/13   
> 2015-01-29 10:15:47,618 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-17)/13   
> 2015-01-29 10:15:49,568 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-18)/13   
> 2015-01-29 10:15:51,099 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+0,-19)/13   
> 2015-01-29 10:15:51,331 ERROR SessionState 
> (SessionState.java:printError(833)) - Status: Failed
> 2015-01-29 10:15:51,417 ERROR SessionState 
> (SessionState.java:printError(833)) - Vertex failed, vertexName=Map 1, 
> vertexId=vertex_1422267635031_0618_1_00, diagnostics=[Task failed, 
> taskId=task_1422267635031_0618_1_00_00, diagnostics=[TaskAttempt 0 
> failed, info=[
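
On the behaviour requested in the issue title: the report says the query only succeeds when rows with a null entities.media are filtered out first. A null-tolerant expansion would instead emit no rows for a null array. The sketch below shows that idea in plain Java; it is not the attached patch or Hive's UDTF code, the class NullTolerantInline is a hypothetical name, and skipping null elements inside the array is an additional assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: expands an array of structs into rows, one row per struct.
class NullTolerantInline {
    /** Each struct is modelled as an Object[] of field values. */
    static List<Object[]> inline(List<Object[]> structs) {
        List<Object[]> rows = new ArrayList<>();
        if (structs == null) {
            return rows;  // null array: produce no rows instead of failing
        }
        for (Object[] struct : structs) {
            if (struct != null) {  // assumption: null elements are skipped too
                rows.add(struct);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(inline(null).size());  // prints 0
        List<Object[]> media = Arrays.asList(new Object[] {1L, "photo"}, null);
        System.out.println(inline(media).size()); // prints 1
    }
}
```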

[jira] [Updated] (HIVE-9507) Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls

2015-01-29 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9507:

Attachment: HIVE-9507.1.patch.txt

> Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls
> 
>
> Key: HIVE-9507
> URL: https://issues.apache.org/jira/browse/HIVE-9507
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor, UDF
>Affects Versions: 0.14.0
> Environment: hdp 2.2
> Windows server 2012 R2 64-bit
>Reporter: Moustafa Aboul Atta
> Attachments: HIVE-9507.1.patch.txt
>
>
> I have tweets stored with avro on hdfs with the default twitter status 
> (tweet) schema.
> There's an object called "entities" that contains arrays of structs.
> When I run
>  
> {{SELECT mytable.*}}
> {{FROM tweets}}
> {{LATERAL VIEW INLINE(entities.media) mytable}}
> I get the exception shown below. However, if I add
> {{WHERE entities.media IS NOT NULL}}
> it runs perfectly.
> Here's the partial log:
> 2015-01-29 10:15:00,879 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Status: Running (Executing on YARN cluster with App id 
> application_1422267635031_0618)
> 2015-01-29 10:15:00,879 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: -/-
> 2015-01-29 10:15:02,526 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0/13   
> 2015-01-29 10:15:05,551 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0/13   
> 2015-01-29 10:15:08,722 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0/13   
> 2015-01-29 10:15:12,095 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0/13   
> 2015-01-29 10:15:12,354 INFO  log.PerfLogger 
> (PerfLogger.java:PerfLogBegin(108)) -  from=org.apache.hadoop.hive.ql.exec.tez.TezJobMonitor>
> 2015-01-29 10:15:12,354 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+5)/13   
> 2015-01-29 10:15:12,557 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6)/13   
> 2015-01-29 10:15:15,691 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6)/13   
> 2015-01-29 10:15:18,892 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-1)/13
> 2015-01-29 10:15:19,094 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-3)/13
> 2015-01-29 10:15:19,304 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-5)/13
> 2015-01-29 10:15:19,507 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-6)/13
> 2015-01-29 10:15:22,641 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-6)/13
> 2015-01-29 10:15:24,704 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-8)/13
> 2015-01-29 10:15:27,735 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-8)/13
> 2015-01-29 10:15:30,957 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-8)/13
> 2015-01-29 10:15:34,095 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-8)/13
> 2015-01-29 10:15:35,138 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-9)/13
> 2015-01-29 10:15:36,503 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-10)/13   
> 2015-01-29 10:15:36,710 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-11)/13   
> 2015-01-29 10:15:37,971 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-12)/13   
> 2015-01-29 10:15:39,800 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-13)/13   
> 2015-01-29 10:15:41,175 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-14)/13   
> 2015-01-29 10:15:44,414 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-14)/13   
> 2015-01-29 10:15:45,447 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-15)/13   
> 2015-01-29 10:15:47,413 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-16)/13   
> 2015-01-29 10:15:47,618 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-17)/13   
> 2015-01-29 10:15:49,568 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+6,-18)/13   
> 2015-01-29 10:15:51,099 INFO  SessionState (SessionState.java:printInfo(824)) 
> - Map 1: 0(+0,-19)/13   
> 2015-01-29 10:15:51,331 ERROR SessionState 
> (SessionState.java:printError(833)) - Status: Failed
> 2015-01-29 10:15:51,417 ERROR SessionState 
> (SessionState.java:printError(833)) - Vertex failed, vertexName=Map 1, 
> vertexId=vertex_1422267635031_0618_1_00, diagnostics=[Task failed, 
> taskId=task_1422267635031_0618_1_00_00, diagnostics=[TaskAttempt 0 
> failed, info=[Error: Failure while running task:java.lang.RuntimeExc

[jira] [Updated] (HIVE-9486) Use session classloader instead of application loader

2015-01-28 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9486:

Attachment: HIVE-9486.2.patch.txt

> Use session classloader instead of application loader
> -
>
> Key: HIVE-9486
> URL: https://issues.apache.org/jira/browse/HIVE-9486
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-9486.1.patch.txt, HIVE-9486.2.patch.txt
>
>
> From http://www.mail-archive.com/dev@hive.apache.org/msg107615.html
> Looks reasonable



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9228) Problem with subquery using windowing functions

2015-01-27 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294595#comment-14294595
 ] 

Navis commented on HIVE-9228:
-

Yes, when a PTF column is not selected, we should prune the function itself in 
the PTF operator. But I considered it a minor case, since it is unusual not to 
select a column that was expensive to compute. Also, the select operator would 
be removed by IdentityProjectRemover if it is not needed. 
By the way, could you review HIVE-9138 first? It is hard to debug anything in 
PTF without explain output.

> Problem with subquery using windowing functions
> ---
>
> Key: HIVE-9228
> URL: https://issues.apache.org/jira/browse/HIVE-9228
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Affects Versions: 0.13.1
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-9228.1.patch.txt, HIVE-9228.2.patch.txt, 
> create_table_tab1.sql, tab1.csv
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> The following query with window functions fails, although the inner query 
> works fine on its own.
> select col1, col2, col3 from (select col1, col2, col3, count(case when col4=1 
> then 1 end) over (partition by col1, col2) as col5, row_number() over 
> (partition by col1, col2 order by col4) as col6 from tab1) t;
> Hive generates an execution plan with 2 jobs. 
> 1. The first job calculates the window function for col5.  
> 2. The second job calculates the window function for col6 and writes the output.
> According to the plan, the first job writes the columns (col1, col2, col3, col4) 
> to a temp file, since only these columns are used in the later stage. However, 
> the PTF operator in the first job outputs (_wcol0, col1, col2, col3, col4), 
> where _wcol0 is the result of the window function, even though it is not used. 
> In the second job, the map operator still reads the 4 columns (col1, col2, 
> col3, col4) from the temp file, as the plan specifies. That mismatch causes the exception.
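
The mismatch described above can be pictured with a small toy model. This is only a sketch under stated assumptions: `read_row`, `prune`, the sample row values and the `ValueError` are hypothetical stand-ins, not Hive code, and the report does not say which exception Hive actually throws.

```python
# Toy model of the column mismatch in the report (not Hive's implementation).
PTF_OUTPUT_COLUMNS = ["_wcol0", "col1", "col2", "col3", "col4"]  # what the first job emits
PLAN_COLUMNS = ["col1", "col2", "col3", "col4"]                  # what the plan says was written


def read_row(plan_columns, fields):
    """Map positional fields onto the columns the plan expects (hypothetical reader)."""
    if len(fields) != len(plan_columns):
        raise ValueError(
            "expected %d fields, got %d" % (len(plan_columns), len(fields))
        )
    return dict(zip(plan_columns, fields))


def prune(row_columns, fields, needed):
    """Keep only the fields whose column name a later stage needs."""
    return [value for name, value in zip(row_columns, fields) if name in needed]


# Made-up sample row: _wcol0 is the unused window-function result.
PTF_ROW = [2, "a", "b", "c", 1]

try:
    read_row(PLAN_COLUMNS, PTF_ROW)
except ValueError as err:
    print("unpruned row fails:", err)

pruned = prune(PTF_OUTPUT_COLUMNS, PTF_ROW, PLAN_COLUMNS)
print("pruned row reads as:", read_row(PLAN_COLUMNS, pruned))
```

In this model, dropping the unused `_wcol0` before the row is written makes the writer and the reader agree on four columns again.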





[jira] [Updated] (HIVE-9278) Cached expression feature broken in one case

2015-01-27 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9278:

Fix Version/s: 0.14.1

> Cached expression feature broken in one case
> 
>
> Key: HIVE-9278
> URL: https://issues.apache.org/jira/browse/HIVE-9278
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.14.0
>Reporter: Matt McCline
>Assignee: Navis
>Priority: Blocker
> Fix For: 0.15.0, 0.14.1, 1.0.0
>
> Attachments: HIVE-9278.1.patch.txt
>
>
> Query results differ depending on whether hive.cache.expr.evaluation is 
> true or false.  When true, no query results are produced (this is wrong).
> The q file:
> {noformat}
> set hive.cache.expr.evaluation=true;
> CREATE TABLE cache_expr_repro (date_str STRING);
> LOAD DATA LOCAL INPATH '../../data/files/cache_expr_repro.txt' INTO TABLE 
> cache_expr_repro;
> SELECT MONTH(date_str) AS `mon`, CAST((MONTH(date_str) - 1) / 3 + 1 AS int) 
> AS `quarter`,   YEAR(date_str) AS `year` FROM cache_expr_repro WHERE 
> ((CAST((MONTH(date_str) - 1) / 3 + 1 AS int) = 1) AND (YEAR(date_str) = 
> 2015)) GROUP BY MONTH(date_str), CAST((MONTH(date_str) - 1) / 3 + 1 AS int),  
>  YEAR(date_str) ;
> {noformat}
> cache_expr_repro.txt
> {noformat}
> 2015-01-01 00:00:00
> 2015-02-01 00:00:00
> 2015-01-01 00:00:00
> 2015-02-01 00:00:00
> 2015-01-01 00:00:00
> 2015-01-01 00:00:00
> 2015-02-01 00:00:00
> 2015-02-01 00:00:00
> 2015-01-01 00:00:00
> 2015-01-01 00:00:00
> {noformat}
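
For reference, the rows the query should return can be worked out by hand from the sample file. The sketch below does that arithmetic in Python; it assumes that `/` in the query is floating-point division and that `CAST(... AS int)` truncates, and the helper names `quarter` and `expected_groups` are made up for this illustration.

```python
from datetime import datetime

# Sample rows copied from cache_expr_repro.txt in the report.
ROWS = [
    "2015-01-01 00:00:00",
    "2015-02-01 00:00:00",
    "2015-01-01 00:00:00",
    "2015-02-01 00:00:00",
    "2015-01-01 00:00:00",
    "2015-01-01 00:00:00",
    "2015-02-01 00:00:00",
    "2015-02-01 00:00:00",
    "2015-01-01 00:00:00",
    "2015-01-01 00:00:00",
]


def quarter(month):
    # Mirrors CAST((MONTH(date_str) - 1) / 3 + 1 AS int):
    # floating-point division, then truncation to an integer.
    return int((month - 1) / 3 + 1)


def expected_groups(rows):
    groups = set()
    for text in rows:
        ts = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
        q = quarter(ts.month)
        if q == 1 and ts.year == 2015:          # the WHERE clause
            groups.add((ts.month, q, ts.year))  # the GROUP BY keys
    return sorted(groups)


print(expected_groups(ROWS))  # [(1, 1, 2015), (2, 1, 2015)]
```

Both January and February fall in quarter 1 of 2015, so two groups are expected regardless of the hive.cache.expr.evaluation setting.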





[jira] [Commented] (HIVE-9459) Concat plus date functions appear to be broken in 0.14

2015-01-27 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294584#comment-14294584
 ] 

Navis commented on HIVE-9459:
-

[~jdere] Yes, looks like same issue.

> Concat plus date functions appear to be broken in 0.14
> --
>
> Key: HIVE-9459
> URL: https://issues.apache.org/jira/browse/HIVE-9459
> Project: Hive
>  Issue Type: Bug
>Reporter: Nathan Lande
>
> In the example below I create year_month and month_year vars. These should 
> be yyyymm and mmyyyy integer strings, but it appears as if Hive is 
> calling the first function twice, so that it returns yyyyyyyy and mmmm.
> hive> select
> > month(a.joined) month,
> > year(a.joined) year,
> > concat(cast(year(a.joined) as string),cast(month(a.joined) as string)) 
> year_month,
> > concat(cast(month(a.joined) as string),cast(year(a.joined) as string)) 
> month_year
> > from a limit 20;
> OK
> month  year  year_month  month_year
> 7      2014  20142014    77
> 7      2014  20142014    77
> 7      2014  20142014    77
> 7      2014  20142014    77
> 7      2014  20142014    77
> 7      2014  20142014    77
> 7      2014  20142014    77
> 7      2014  20142014    77
> 7      2014  20142014    77
> 7      2014  20142014    77
> 7      2014  20142014    77
> 7      2014  20142014    77
> 7      2014  20142014    77
> 7      2014  20142014    77
> 7      2014  20142014    77
> 7      2014  20142014    77
> 7      2014  20142014    77
> 7      2014  20142014    77
> 7      2014  20142014    77
> 7      2014  20142014    77
> Time taken: 0.109 seconds, Fetched: 20 row(s)
> Other users appear to have hit similar issues in this Stack Overflow question: 
> http://stackoverflow.com/questions/27740866/convert-date-to-decimal-format-in-hive
>  .
> I tested this in 0.13 and 0.14 and it does not appear to be an issue in 0.13.
> I looked around and could not find a similar issue so hopefully this is not a 
> duplicate.
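
One way to picture the reported symptom is a result cache whose key is too coarse. The sketch below is purely illustrative: `CoarseCache` and the other names are hypothetical, this is not Hive's code, and the real root cause may differ. It only shows that reusing the first cast result for the second argument reproduces the values in the output above.

```python
def cast_string(value):
    """Stand-in for cast(... as string)."""
    return str(value)


def concat_correct(first, second):
    """What concat(cast(first as string), cast(second as string)) should give."""
    return cast_string(first) + cast_string(second)


class CoarseCache:
    """Hypothetical cache keyed only by function name, ignoring the argument."""

    def __init__(self):
        self._results = {}

    def call(self, name, func, arg):
        # The second call with the same name returns the first call's result,
        # even though the argument is different.
        if name not in self._results:
            self._results[name] = func(arg)
        return self._results[name]


def concat_with_coarse_cache(first, second):
    cache = CoarseCache()  # one cache per concat expression
    left = cache.call("cast_string", cast_string, first)
    right = cache.call("cast_string", cast_string, second)
    return left + right


year, month = 2014, 7
print(concat_correct(year, month), concat_correct(month, year))
# 20147 72014
print(concat_with_coarse_cache(year, month), concat_with_coarse_cache(month, year))
# 20142014 77
```

Because the month in the sample output is a single digit, the correct strings in this model are 20147 and 72014 rather than six-digit values.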





[jira] [Updated] (HIVE-9486) Use session classloader instead of application loader

2015-01-27 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9486:

Status: Patch Available  (was: Open)

> Use session classloader instead of application loader
> -
>
> Key: HIVE-9486
> URL: https://issues.apache.org/jira/browse/HIVE-9486
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-9486.1.patch.txt
>
>
> From http://www.mail-archive.com/dev@hive.apache.org/msg107615.html
> Looks reasonable





[jira] [Updated] (HIVE-9486) Use session classloader instead of application loader

2015-01-27 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9486:

Attachment: HIVE-9486.1.patch.txt

> Use session classloader instead of application loader
> -
>
> Key: HIVE-9486
> URL: https://issues.apache.org/jira/browse/HIVE-9486
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-9486.1.patch.txt
>
>
> From http://www.mail-archive.com/dev@hive.apache.org/msg107615.html
> Looks reasonable




