[jira] [Created] (HIVE-12388) getTables cannot get external tables
Navis created HIVE-12388: Summary: getTables cannot get external tables Key: HIVE-12388 URL: https://issues.apache.org/jira/browse/HIVE-12388 Project: Hive Issue Type: Bug Components: JDBC Reporter: Navis Assignee: Navis Priority: Critical Because of a regression introduced by HIVE-7575, external tables are not returned when the "TABLE" type is specified as an argument. I'm working on this. Sorry. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-12373) Interner should return identical map or list
Navis created HIVE-12373: Summary: Interner should return identical map or list Key: HIVE-12373 URL: https://issues.apache.org/jira/browse/HIVE-12373 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor
Currently, HiveStringUtils.intern(map/list) returns a new instance of the map or list. That breaks code written in the following style (this is Spark code in HiveMetastoreCatalog):
{code}
val serdeParameters = new java.util.HashMap[String, String]()
serdeInfo.setParameters(serdeParameters)
// these properties will be gone
table.serdeProperties.foreach { case (k, v) => serdeParameters.put(k, v) }
p.storage.serdeProperties.foreach { case (k, v) => serdeParameters.put(k, v) }
{code}
Luckily for Spark, the interner was, by mistake, not applied in the released versions of Hive (1.2.0, 1.2.1). But it will cause problems someday.
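For illustration, a minimal Java sketch of the aliasing problem described in HIVE-12373. InternDemo, Holder, internCopy and internInPlace are hypothetical stand-ins, not Hive's HiveStringUtils or the Thrift SerDeInfo class; the sketch only assumes a setter that passes its argument through an interner.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins; these are not Hive's HiveStringUtils or SerDeInfo.
final class InternDemo {
    /** Holder whose setter passes its argument through an interner. */
    static final class Holder {
        private Map<String, String> params;
        private final boolean copyOnIntern;

        Holder(boolean copyOnIntern) { this.copyOnIntern = copyOnIntern; }

        void setParameters(Map<String, String> p) {
            this.params = copyOnIntern ? internCopy(p) : internInPlace(p);
        }

        Map<String, String> getParameters() { return params; }
    }

    /** Returns a NEW map with interned entries; later writes to the argument are not seen. */
    static Map<String, String> internCopy(Map<String, String> in) {
        Map<String, String> out = new HashMap<>();
        for (Map.Entry<String, String> e : in.entrySet()) {
            out.put(e.getKey().intern(), e.getValue().intern());
        }
        return out;
    }

    /** Interns the entries but returns the SAME instance, so the caller's reference stays live. */
    static Map<String, String> internInPlace(Map<String, String> in) {
        Map<String, String> interned = internCopy(in);
        in.clear();
        in.putAll(interned);
        return in;
    }

    /** Sets an empty map first and fills it afterwards; returns how many entries the holder sees. */
    static int visibleEntries(boolean copyOnIntern) {
        Holder holder = new Holder(copyOnIntern);
        Map<String, String> serdeParameters = new HashMap<>();
        holder.setParameters(serdeParameters);
        serdeParameters.put("field.delim", ",");
        return holder.getParameters().size();
    }

    public static void main(String[] args) {
        System.out.println("copying interner:  " + visibleEntries(true));
        System.out.println("in-place interner: " + visibleEntries(false));
    }
}
```

With the copying interner the holder sees no entries, because the caller keeps writing to a map the holder no longer references; with the identity-preserving variant the later put is visible.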
[jira] [Created] (HIVE-12183) JsonParser/Generator should be closed for recycle
Navis created HIVE-12183: Summary: JsonParser/Generator should be closed for recycle Key: HIVE-12183 URL: https://issues.apache.org/jira/browse/HIVE-12183 Project: Hive Issue Type: Bug Reporter: Navis Priority: Trivial
[jira] [Created] (HIVE-11774) Show macro definition for desc function
Navis created HIVE-11774: Summary: Show macro definition for desc function Key: HIVE-11774 URL: https://issues.apache.org/jira/browse/HIVE-11774 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-11774.1.patch.txt Currently, desc function shows nothing for a macro. It would be helpful if it showed the macro's definition.
[jira] [Created] (HIVE-11756) Avoid redundant key serialization in RS for distinct query
Navis created HIVE-11756: Summary: Avoid redundant key serialization in RS for distinct query Key: HIVE-11756 URL: https://issues.apache.org/jira/browse/HIVE-11756 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Trivial Currently, Hive serializes the key twice to learn the length of the distribution key for distinct queries. This patch introduces IndexedSerializer to avoid that.
[jira] [Created] (HIVE-11754) Not reachable code parts in StatsUtils
Navis created HIVE-11754: Summary: Not reachable code parts in StatsUtils Key: HIVE-11754 URL: https://issues.apache.org/jira/browse/HIVE-11754 Project: Hive Issue Type: Task Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-11754.1.patch.txt No need to check "oi instanceof WritableConstantHiveCharObjectInspector" after "oi instanceof ConstantObjectInspector". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
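To illustrate why the second check is dead code: in an if/else-if chain, a test for a subtype placed after a test for its supertype can never match. The type names below are hypothetical stand-ins for the Hive object inspectors, assuming (as the issue implies) that the constant char inspector is itself a ConstantObjectInspector.

```java
// Hypothetical type hierarchy; not the real Hive ObjectInspector classes.
final class InstanceofOrderDemo {
    interface Inspector {}
    interface ConstantInspector extends Inspector {}
    static final class ConstantCharInspector implements ConstantInspector {}
    static final class PlainInspector implements Inspector {}

    static String classify(Inspector oi) {
        if (oi instanceof ConstantInspector) {
            return "constant";
        } else if (oi instanceof ConstantCharInspector) {
            // Never reached: every ConstantCharInspector already matched the branch above.
            return "constant-char";
        }
        return "other";
    }

    public static void main(String[] args) {
        System.out.println(classify(new ConstantCharInspector()));
        System.out.println(classify(new PlainInspector()));
    }
}
```

The compiler accepts this chain without complaint, so such branches are only found by inspection or static analysis.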
[jira] [Created] (HIVE-11752) Pre-materializing complex CTE queries
Navis created HIVE-11752: Summary: Pre-materializing complex CTE queries Key: HIVE-11752 URL: https://issues.apache.org/jira/browse/HIVE-11752 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Currently, Hive regards a CTE clause as a simple alias for its query block, which causes redundant work if the CTE is used multiple times in a query. This introduces a reference threshold for pre-materializing the CTE clause as a volatile table (one that does not exist in the metastore in any form and is accessible only from the QB).
[jira] [Created] (HIVE-11707) Implement "dump metastore"
Navis created HIVE-11707: Summary: Implement "dump metastore" Key: HIVE-11707 URL: https://issues.apache.org/jira/browse/HIVE-11707 Project: Hive Issue Type: New Feature Components: Metastore Reporter: Navis Assignee: Navis Priority: Minor In our projects, we have frequently needed to copy an existing metastore to another database (for another version of Hive or for other engines such as Impala, Tajo, Spark, etc.). RDBs support dumping the metastore data as a series of SQL statements, but these must be translated before being applied when the target is a different RDB, which is time-consuming, error-prone work.
[jira] [Created] (HIVE-11706) Implement "show create database"
Navis created HIVE-11706: Summary: Implement "show create database" Key: HIVE-11706 URL: https://issues.apache.org/jira/browse/HIVE-11706 Project: Hive Issue Type: New Feature Components: Metastore Reporter: Navis Assignee: Navis Priority: Trivial HIVE-967 introduced "show create table". How about "show create database"? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11662) DP cannot be applied to external table which contains part-spec like directory
Navis created HIVE-11662: Summary: DP cannot be applied to external table which contains part-spec like directory Key: HIVE-11662 URL: https://issues.apache.org/jira/browse/HIVE-11662 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Trivial
Some users want to use part-spec-like directory names in their partitioned table locations, for example:
{noformat}
/something/warehouse/some_key=some_value
{noformat}
DP calculates additional partitions from the full path and throws an exception like the following:
{noformat}
Failed with exception Partition spec {some_key=some_value, part_key=part_value} contains non-partition columns
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
{noformat}
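A rough Java sketch of how a spec derived from the full path can pick up the extra key, and how parsing relative to the table location avoids it. PartSpecDemo and its methods are hypothetical and are not how Hive itself derives partition specs; only the paths and key names come from the report above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; not Hive's partition-spec parsing code.
final class PartSpecDemo {
    /** Collects every key=value path segment, in order of appearance. */
    static Map<String, String> parse(String path) {
        Map<String, String> spec = new LinkedHashMap<>();
        for (String segment : path.split("/")) {
            int eq = segment.indexOf('=');
            if (eq > 0) {
                spec.put(segment.substring(0, eq), segment.substring(eq + 1));
            }
        }
        return spec;
    }

    /** Parses only the part of the path below the table location. */
    static Map<String, String> parseRelative(String tableLocation, String partitionPath) {
        if (!partitionPath.startsWith(tableLocation)) {
            throw new IllegalArgumentException("partition is not under the table location");
        }
        return parse(partitionPath.substring(tableLocation.length()));
    }

    public static void main(String[] args) {
        String table = "/something/warehouse/some_key=some_value";
        String partition = table + "/part_key=part_value";
        System.out.println(parse(partition));                // includes some_key from the table location
        System.out.println(parseRelative(table, partition)); // only the real partition column
    }
}
```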
[jira] [Created] (HIVE-11518) Provide interface to adjust required resource for tez tasks
Navis created HIVE-11518: Summary: Provide interface to adjust required resource for tez tasks Key: HIVE-11518 URL: https://issues.apache.org/jira/browse/HIVE-11518 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor
Resource requirements vary from task to task, but currently they are fixed to a single value (via hive.tez.container.size). It would be good to be able to tailor resource requirements to the expected work. The suggested interface is quite simple:
{code}
public interface ResourceCalculator {
  Resource adjust(Resource resource, MapWork mapWork);
  Resource adjust(Resource resource, ReduceWork reduceWork);
}
{code}
[jira] [Created] (HIVE-11515) Still some possible race condition in DynamicPartitionPruner
Navis created HIVE-11515: Summary: Still some possible race condition in DynamicPartitionPruner Key: HIVE-11515 URL: https://issues.apache.org/jira/browse/HIVE-11515 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Even after HIVE-9976, I can sometimes still see a race condition in DPP. It is hard to reproduce, but it seems related to the fact that init() is called from a thread pool. With some delay in the queue, events from fast tasks arrive before init() is called.
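One common way to tolerate events that arrive before initialization is to buffer them and replay them once init() runs. The Java sketch below shows that general pattern only; BufferingPruner is a hypothetical class, and this is not claimed to be the fix applied in HIVE-11515 or the structure of DynamicPartitionPruner.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class illustrating "buffer until initialized"; not Hive code.
final class BufferingPruner {
    private final List<String> pending = new ArrayList<>();
    private final List<String> processed = new ArrayList<>();
    private boolean initialized;

    /** May run late on a pool thread; replays whatever arrived in the meantime. */
    synchronized void init() {
        initialized = true;
        for (String event : pending) {
            process(event);
        }
        pending.clear();
    }

    /** Safe to call before init(): early events are queued instead of being lost. */
    synchronized void addEvent(String event) {
        if (!initialized) {
            pending.add(event);
            return;
        }
        process(event);
    }

    private void process(String event) {
        processed.add(event);
    }

    /** Returns a snapshot of the events handled so far. */
    synchronized List<String> processed() {
        return new ArrayList<>(processed);
    }

    public static void main(String[] args) {
        BufferingPruner pruner = new BufferingPruner();
        pruner.addEvent("from-fast-task"); // arrives before init()
        pruner.init();
        pruner.addEvent("from-slow-task");
        System.out.println(pruner.processed());
    }
}
```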
[jira] [Created] (HIVE-11506) Casting varchar/char type to string cannot be vectorized
Navis created HIVE-11506: Summary: Casting varchar/char type to string cannot be vectorized Key: HIVE-11506 URL: https://issues.apache.org/jira/browse/HIVE-11506 Project: Hive Issue Type: Improvement Components: Vectorization Reporter: Navis Assignee: Navis Priority: Trivial
The cast is not defined in the vectorization context.
{code}
explain select cast(cast(cstring1 as varchar(10)) as string) x from alltypesorc order by x;
{code}
Mapper
{noformat}
015-08-10 17:02:08,003 INFO [main]: physical.Vectorizer (Vectorizer.java:validateExprNodeDesc(1299)) - Failed to vectorize
org.apache.hadoop.hive.ql.metadata.HiveException: Unhandled cast input type: varchar(10)
	at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getCastToString(VectorizationContext.java:1543)
	at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUDFBridgeVectorExpression(VectorizationContext.java:1379)
	at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1177)
	at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:440)
	at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateExprNodeDesc(Vectorizer.java:1293)
	at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateExprNodeDesc(Vectorizer.java:1284)
	at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateSelectOperator(Vectorizer.java:1116)
	at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateMapWorkOperator(Vectorizer.java:906)
{noformat}
[jira] [Created] (HIVE-11002) Memory leakage on unsafe aggregation path with empty input
Navis created HIVE-11002: Summary: Memory leakage on unsafe aggregation path with empty input Key: HIVE-11002 URL: https://issues.apache.org/jira/browse/HIVE-11002 Project: Hive Issue Type: Bug Components: SQL Reporter: Navis Assignee: Navis Priority: Minor Currently, the unsafe-based hash is released on the 'next' call, but if the input is empty, 'next' is never called.
[jira] [Created] (HIVE-10890) Provide implementable engine selector
Navis created HIVE-10890: Summary: Provide implementable engine selector Key: HIVE-10890 URL: https://issues.apache.org/jira/browse/HIVE-10890 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Trivial Hive now supports three kinds of execution engines. It would be good to have an automatic engine selector, so that the execution engine need not be set explicitly.
[jira] [Created] (HIVE-9806) Support partition locator for custom directory hierarchy
Navis created HIVE-9806: Summary: Support partition locator for custom directory hierarchy Key: HIVE-9806 URL: https://issues.apache.org/jira/browse/HIVE-9806 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Currently, the relative partition directory must be the same as the partition name, which is not always applicable.
[jira] [Updated] (HIVE-9699) Extend PTFs to provide referenced columns for CP
[ https://issues.apache.org/jira/browse/HIVE-9699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9699: Attachment: HIVE-9699.2.patch.txt > Extend PTFs to provide referenced columns for CP > > > Key: HIVE-9699 > URL: https://issues.apache.org/jira/browse/HIVE-9699 > Project: Hive > Issue Type: Improvement > Components: PTF-Windowing >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-9699.1.patch.txt, HIVE-9699.2.patch.txt > > > As described in HIVE-9341, If PTFs can provide referenced column names, > column pruner can use that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9699) Extend PTFs to provide referenced columns for CP
[ https://issues.apache.org/jira/browse/HIVE-9699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9699: Status: Patch Available (was: Open) > Extend PTFs to provide referenced columns for CP > > > Key: HIVE-9699 > URL: https://issues.apache.org/jira/browse/HIVE-9699 > Project: Hive > Issue Type: Improvement > Components: PTF-Windowing >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-9699.1.patch.txt, HIVE-9699.2.patch.txt > > > As described in HIVE-9341, If PTFs can provide referenced column names, > column pruner can use that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9699) Extend PTFs to provide referenced columns for CP
[ https://issues.apache.org/jira/browse/HIVE-9699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9699: Status: Open (was: Patch Available) > Extend PTFs to provide referenced columns for CP > > > Key: HIVE-9699 > URL: https://issues.apache.org/jira/browse/HIVE-9699 > Project: Hive > Issue Type: Improvement > Components: PTF-Windowing >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-9699.1.patch.txt > > > As described in HIVE-9341, If PTFs can provide referenced column names, > column pruner can use that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9699) Extend PTFs to provide referenced columns for CP
[ https://issues.apache.org/jira/browse/HIVE-9699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9699: Status: Patch Available (was: Open) > Extend PTFs to provide referenced columns for CP > > > Key: HIVE-9699 > URL: https://issues.apache.org/jira/browse/HIVE-9699 > Project: Hive > Issue Type: Improvement > Components: PTF-Windowing >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-9699.1.patch.txt > > > As described in HIVE-9341, If PTFs can provide referenced column names, > column pruner can use that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9699) Extend PTFs to provide referenced columns for CP
[ https://issues.apache.org/jira/browse/HIVE-9699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9699: Attachment: HIVE-9699.1.patch.txt > Extend PTFs to provide referenced columns for CP > > > Key: HIVE-9699 > URL: https://issues.apache.org/jira/browse/HIVE-9699 > Project: Hive > Issue Type: Improvement > Components: PTF-Windowing >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-9699.1.patch.txt > > > As described in HIVE-9341, If PTFs can provide referenced column names, > column pruner can use that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9699) Extend PTFs to provide referenced columns for CP
Navis created HIVE-9699: Summary: Extend PTFs to provide referenced columns for CP Key: HIVE-9699 URL: https://issues.apache.org/jira/browse/HIVE-9699 Project: Hive Issue Type: Improvement Components: PTF-Windowing Reporter: Navis Assignee: Navis Priority: Trivial As described in HIVE-9341, if PTFs can provide referenced column names, the column pruner can use them.
[jira] [Updated] (HIVE-2573) Create per-session function registry
[ https://issues.apache.org/jira/browse/HIVE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-2573: Resolution: Fixed Fix Version/s: 1.2.0 Status: Resolved (was: Patch Available) Committed to trunk, at last. Thanks Jason! > Create per-session function registry > - > > Key: HIVE-2573 > URL: https://issues.apache.org/jira/browse/HIVE-2573 > Project: Hive > Issue Type: Improvement > Components: Server Infrastructure >Reporter: Navis >Assignee: Navis >Priority: Minor > Fix For: 1.2.0 > > Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2573.D3231.1.patch, > HIVE-2573.1.patch.txt, HIVE-2573.10.patch.txt, HIVE-2573.11.patch.txt, > HIVE-2573.12.patch.txt, HIVE-2573.13.patch.txt, HIVE-2573.14.patch.txt, > HIVE-2573.15.patch.txt, HIVE-2573.2.patch.txt, HIVE-2573.3.patch.txt, > HIVE-2573.4.patch.txt, HIVE-2573.5.patch, HIVE-2573.6.patch, > HIVE-2573.7.patch, HIVE-2573.8.patch.txt, HIVE-2573.9.patch.txt > > > Currently the function registry is a shared resource and can be overridden by > other users when using HiveServer. If a per-session function registry is > provided, this situation can be prevented.
[jira] [Updated] (HIVE-9138) Add some explain to PTF operator
[ https://issues.apache.org/jira/browse/HIVE-9138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9138: Attachment: HIVE-9138.5.patch.txt > Add some explain to PTF operator > > > Key: HIVE-9138 > URL: https://issues.apache.org/jira/browse/HIVE-9138 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-9138.1.patch.txt, HIVE-9138.2.patch.txt, > HIVE-9138.3.patch.txt, HIVE-9138.4.patch.txt, HIVE-9138.5.patch.txt > > > PTFOperator shows nothing in the explain statement, making it hard to > understand its internal workings.
[jira] [Commented] (HIVE-9495) Map Side aggregation affecting map performance
[ https://issues.apache.org/jira/browse/HIVE-9495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319590#comment-14319590 ] Navis commented on HIVE-9495: - I think I've broken something rebasing on trunk. > Map Side aggregation affecting map performance > -- > > Key: HIVE-9495 > URL: https://issues.apache.org/jira/browse/HIVE-9495 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.14.0 > Environment: RHEL 6.4 > Hortonworks Hadoop 2.2 >Reporter: Anand Sridharan > Attachments: HIVE-9495.1.patch.txt, profiler_screenshot.PNG > > > When trying to run a simple aggregation query with hive.map.aggr=true, map > tasks take a lot of time in Hive 0.14 as against with hive.map.aggr=false. > e.g. > Consider the query: > {code} > INSERT OVERWRITE TABLE lineitem_tgt_agg > select alias.a0 as a0, > alias.a2 as a1, > alias.a1 as a2, > alias.a3 as a3, > alias.a4 as a4 > from ( > select alias.a0 as a0, > SUM(alias.a1) as a1, > SUM(alias.a2) as a2, > SUM(alias.a3) as a3, > SUM(alias.a4) as a4 > from ( > select lineitem_sf500.l_orderkey as a0, >CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * (1 - > lineitem_sf500.l_discount) * (1 + lineitem_sf500.l_tax) as double) as a1, >lineitem_sf500.l_quantity as a2, >CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * > lineitem_sf500.l_discount as double) as a3, >CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * > lineitem_sf500.l_tax as double) as a4 > from lineitem_sf500 > ) alias > group by alias.a0 > ) alias; > {code} > The above query was run with ~376GB of data / ~3billion records in the source. > It takes ~10 minutes with hive.map.aggr=false. > With map side aggregation set to true, the map tasks don't complete even > after an hour. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9495) Map Side aggregation affecting map performance
[ https://issues.apache.org/jira/browse/HIVE-9495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9495: Status: Open (was: Patch Available) > Map Side aggregation affecting map performance > -- > > Key: HIVE-9495 > URL: https://issues.apache.org/jira/browse/HIVE-9495 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.14.0 > Environment: RHEL 6.4 > Hortonworks Hadoop 2.2 >Reporter: Anand Sridharan > Attachments: HIVE-9495.1.patch.txt, profiler_screenshot.PNG > > > When trying to run a simple aggregation query with hive.map.aggr=true, map > tasks take a lot of time in Hive 0.14 as against with hive.map.aggr=false. > e.g. > Consider the query: > {code} > INSERT OVERWRITE TABLE lineitem_tgt_agg > select alias.a0 as a0, > alias.a2 as a1, > alias.a1 as a2, > alias.a3 as a3, > alias.a4 as a4 > from ( > select alias.a0 as a0, > SUM(alias.a1) as a1, > SUM(alias.a2) as a2, > SUM(alias.a3) as a3, > SUM(alias.a4) as a4 > from ( > select lineitem_sf500.l_orderkey as a0, >CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * (1 - > lineitem_sf500.l_discount) * (1 + lineitem_sf500.l_tax) as double) as a1, >lineitem_sf500.l_quantity as a2, >CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * > lineitem_sf500.l_discount as double) as a3, >CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * > lineitem_sf500.l_tax as double) as a4 > from lineitem_sf500 > ) alias > group by alias.a0 > ) alias; > {code} > The above query was run with ~376GB of data / ~3billion records in the source. > It takes ~10 minutes with hive.map.aggr=false. > With map side aggregation set to true, the map tasks don't complete even > after an hour. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-9597) substition variables stopping when a undefined variable occur
[ https://issues.apache.org/jira/browse/HIVE-9597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis resolved HIVE-9597. - Resolution: Duplicate Fix Version/s: 0.14.0 > substition variables stopping when a undefined variable occur > - > > Key: HIVE-9597 > URL: https://issues.apache.org/jira/browse/HIVE-9597 > Project: Hive > Issue Type: Bug > Components: Parser >Affects Versions: 0.13.0 > Environment: hortonworks 2.1 >Reporter: ErwanMAS >Priority: Critical > Fix For: 0.14.0 > > > {noformat} > set hivevar:A_VALUE_1=A ; > set hivevar:A_VALUE_3=C ; > explain select "${A_VALUE_1}","${A_VALUE_2}","${A_VALUE_3}" from foobar ; > set hivevar:A_VALUE_2=B ; > explain select "${A_VALUE_1}","${A_VALUE_2}","${A_VALUE_3}" from foobar ; > {noformat} > In the first query , the variable A_VALUE_3 is not subsituted , because the > A_VALUE_2 is not defined ! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9597) substition variables stopping when a undefined variable occur
[ https://issues.apache.org/jira/browse/HIVE-9597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319577#comment-14319577 ] Navis commented on HIVE-9597: - This seems to have been fixed in HIVE-6037 (hive-0.14.0). > substition variables stopping when a undefined variable occur > - > > Key: HIVE-9597 > URL: https://issues.apache.org/jira/browse/HIVE-9597 > Project: Hive > Issue Type: Bug > Components: Parser >Affects Versions: 0.13.0 > Environment: hortonworks 2.1 >Reporter: ErwanMAS >Priority: Critical > Fix For: 0.14.0 > > > {noformat} > set hivevar:A_VALUE_1=A ; > set hivevar:A_VALUE_3=C ; > explain select "${A_VALUE_1}","${A_VALUE_2}","${A_VALUE_3}" from foobar ; > set hivevar:A_VALUE_2=B ; > explain select "${A_VALUE_1}","${A_VALUE_2}","${A_VALUE_3}" from foobar ; > {noformat} > In the first query , the variable A_VALUE_3 is not subsituted , because the > A_VALUE_2 is not defined !
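A small Java sketch of the behavior the reporter expects: substitution continues past an undefined variable instead of stopping at it. VarSubstDemo is hypothetical and is not Hive's variable substitution code; leaving an undefined ${...} reference untouched is an assumption made for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not Hive's variable substitution implementation.
final class VarSubstDemo {
    private static final Pattern VAR = Pattern.compile("\\$\\{([A-Za-z0-9_]+)\\}");

    /** Replaces every defined ${NAME}; undefined names are left as-is and scanning continues. */
    static String substitute(String text, Map<String, String> vars) {
        Matcher m = VAR.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = vars.get(m.group(1));
            String replacement = (value != null) ? value : m.group(0);
            // quoteReplacement keeps '$' and '\' in the replacement literal.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> vars = new HashMap<>();
        vars.put("A_VALUE_1", "A");
        vars.put("A_VALUE_3", "C");
        String query = "select \"${A_VALUE_1}\",\"${A_VALUE_2}\",\"${A_VALUE_3}\" from foobar";
        System.out.println(substitute(query, vars));
    }
}
```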
[jira] [Updated] (HIVE-9680) GlobalLimitOptimizer is not checking filters correctly
[ https://issues.apache.org/jira/browse/HIVE-9680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9680: Attachment: HIVE-9680.1.patch.txt > GlobalLimitOptimizer is not checking filters correctly > --- > > Key: HIVE-9680 > URL: https://issues.apache.org/jira/browse/HIVE-9680 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-9680.1.patch.txt > > > Some predicates may not be included in opToPartPruner
[jira] [Updated] (HIVE-9680) GlobalLimitOptimizer is not checking filters correctly
[ https://issues.apache.org/jira/browse/HIVE-9680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9680: Status: Patch Available (was: Open) > GlobalLimitOptimizer is not checking filters correctly > --- > > Key: HIVE-9680 > URL: https://issues.apache.org/jira/browse/HIVE-9680 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-9680.1.patch.txt > > > Some predicates may not be included in opToPartPruner
[jira] [Created] (HIVE-9680) GlobalLimitOptimizer is not checking filters correctly
Navis created HIVE-9680: Summary: GlobalLimitOptimizer is not checking filters correctly Key: HIVE-9680 URL: https://issues.apache.org/jira/browse/HIVE-9680 Project: Hive Issue Type: Bug Components: Query Planning Reporter: Navis Assignee: Navis Priority: Trivial Some predicates may not be included in opToPartPruner
[jira] [Updated] (HIVE-9495) Map Side aggregation affecting map performance
[ https://issues.apache.org/jira/browse/HIVE-9495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9495: Attachment: HIVE-9495.1.patch.txt Replaced the get/put calls with a single putIfAbsent call, but I couldn't find any noticeable performance improvement. > Map Side aggregation affecting map performance > -- > > Key: HIVE-9495 > URL: https://issues.apache.org/jira/browse/HIVE-9495 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.14.0 > Environment: RHEL 6.4 > Hortonworks Hadoop 2.2 >Reporter: Anand Sridharan > Attachments: HIVE-9495.1.patch.txt, profiler_screenshot.PNG > > > When trying to run a simple aggregation query with hive.map.aggr=true, map > tasks take a lot of time in Hive 0.14 as against with hive.map.aggr=false. > e.g. > Consider the query: > {code} > INSERT OVERWRITE TABLE lineitem_tgt_agg > select alias.a0 as a0, > alias.a2 as a1, > alias.a1 as a2, > alias.a3 as a3, > alias.a4 as a4 > from ( > select alias.a0 as a0, > SUM(alias.a1) as a1, > SUM(alias.a2) as a2, > SUM(alias.a3) as a3, > SUM(alias.a4) as a4 > from ( > select lineitem_sf500.l_orderkey as a0, >CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * (1 - > lineitem_sf500.l_discount) * (1 + lineitem_sf500.l_tax) as double) as a1, >lineitem_sf500.l_quantity as a2, >CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * > lineitem_sf500.l_discount as double) as a3, >CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * > lineitem_sf500.l_tax as double) as a4 > from lineitem_sf500 > ) alias > group by alias.a0 > ) alias; > {code} > The above query was run with ~376GB of data / ~3billion records in the source. > It takes ~10 minutes with hive.map.aggr=false. > With map side aggregation set to true, the map tasks don't complete even > after an hour.
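For readers unfamiliar with the idiom mentioned in the comment, here is a minimal Java sketch of replacing a get-then-put pair with a single putIfAbsent call. PutIfAbsentDemo and the use of ConcurrentHashMap are assumptions for illustration; the sketch does not reproduce the code touched by HIVE-9495.1.patch.txt.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical example of the idiom; not the code changed by the patch.
final class PutIfAbsentDemo {
    /** Two lookups on a miss: one for get, one for put. */
    static String canonicalTwoCalls(ConcurrentMap<String, String> cache, String s) {
        String existing = cache.get(s);
        if (existing == null) {
            cache.put(s, s);
            existing = s;
        }
        return existing;
    }

    /** One call: putIfAbsent returns the previous value, or null if s was inserted. */
    static String canonicalOneCall(ConcurrentMap<String, String> cache, String s) {
        String previous = cache.putIfAbsent(s, s);
        return (previous != null) ? previous : s;
    }

    public static void main(String[] args) {
        ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();
        String first = new String("key");
        String second = new String("key");
        System.out.println(canonicalOneCall(cache, first) == first);  // inserted
        System.out.println(canonicalOneCall(cache, second) == first); // existing instance reused
    }
}
```

On a concurrent map the single call is also atomic, which the get/put pair is not.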
[jira] [Updated] (HIVE-9495) Map Side aggregation affecting map performance
[ https://issues.apache.org/jira/browse/HIVE-9495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9495: Status: Patch Available (was: Open) > Map Side aggregation affecting map performance > -- > > Key: HIVE-9495 > URL: https://issues.apache.org/jira/browse/HIVE-9495 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.14.0 > Environment: RHEL 6.4 > Hortonworks Hadoop 2.2 >Reporter: Anand Sridharan > Attachments: HIVE-9495.1.patch.txt, profiler_screenshot.PNG > > > When trying to run a simple aggregation query with hive.map.aggr=true, map > tasks take a lot of time in Hive 0.14 as against with hive.map.aggr=false. > e.g. > Consider the query: > {code} > INSERT OVERWRITE TABLE lineitem_tgt_agg > select alias.a0 as a0, > alias.a2 as a1, > alias.a1 as a2, > alias.a3 as a3, > alias.a4 as a4 > from ( > select alias.a0 as a0, > SUM(alias.a1) as a1, > SUM(alias.a2) as a2, > SUM(alias.a3) as a3, > SUM(alias.a4) as a4 > from ( > select lineitem_sf500.l_orderkey as a0, >CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * (1 - > lineitem_sf500.l_discount) * (1 + lineitem_sf500.l_tax) as double) as a1, >lineitem_sf500.l_quantity as a2, >CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * > lineitem_sf500.l_discount as double) as a3, >CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * > lineitem_sf500.l_tax as double) as a4 > from lineitem_sf500 > ) alias > group by alias.a0 > ) alias; > {code} > The above query was run with ~376GB of data / ~3billion records in the source. > It takes ~10 minutes with hive.map.aggr=false. > With map side aggregation set to true, the map tasks don't complete even > after an hour. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-9598) java.lang.IllegalMonitorStateException/java.util.concurrent.locks.ReentrantLock$Sync.tryRelease if ResultSet.closed called after Statement.close called
[ https://issues.apache.org/jira/browse/HIVE-9598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis resolved HIVE-9598. - Resolution: Duplicate Fix Version/s: 0.14.0 > java.lang.IllegalMonitorStateException/java.util.concurrent.locks.ReentrantLock$Sync.tryRelease > if ResultSet.closed called after Statement.close called > --- > > Key: HIVE-9598 > URL: https://issues.apache.org/jira/browse/HIVE-9598 > Project: Hive > Issue Type: Bug > Components: JDBC >Affects Versions: 0.13.0 >Reporter: N Campbell > Fix For: 0.14.0 > > > http://docs.oracle.com/javase/7/docs/api/java/sql/ResultSet.html#close() > http://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#close() > Statement stmt; > try { > stmt = dbConnection.createStatement(); > stmt.executeQuery("select* from t"); > ResultSet rs = stmt.getResultSet(); > stmt.close(); > if (rs != null) { > System.out.println("IS NOT NULL"); > // Hive does not implement isClosed() > //if (!rs.isClosed()) { > //System.out.println("IS NOT CLOSED"); > //} > rs.close(); > } > } catch (SQLException e) { > // TODO Auto-generated catch block > e.printStackTrace(); > } > Exception in thread "main" java.lang.IllegalMonitorStateException > at > java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:166) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1271) > at > java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:471) > at > org.apache.hive.jdbc.HiveStatement.closeClientOperation(HiveStatement.java:175) > at > org.apache.hive.jdbc.HiveQueryResultSet.close(HiveQueryResultSet.java:293) > /D:/JDBC/Hortonworks_Hive13/commons-configuration-1.6.jar > /D:/JDBC/Hortonworks_Hive13/commons-logging-1.1.3.jar > /D:/JDBC/Hortonworks_Hive13/hadoop-common-2.4.0.2.1.1.0-385.jar > /D:/JDBC/Hortonworks_Hive13/hive-exec-0.13.0.2.1.1.0-385.jar > /D:/JDBC/Hortonworks_Hive13/hive-jdbc-0.13.0.2.1.1.0-385.jar > 
/D:/JDBC/Hortonworks_Hive13/hive-service-0.13.0.2.1.1.0-385.jar > /D:/JDBC/Hortonworks_Hive13/httpclient-4.2.5.jar > /D:/JDBC/Hortonworks_Hive13/httpcore-4.2.5.jar > /D:/JDBC/Hortonworks_Hive13/libfb303-0.9.0.jar > /D:/JDBC/Hortonworks_Hive13/libthrift-0.9.0.jar > /D:/JDBC/Hortonworks_Hive13/log4j-1.2.16.jar > /D:/JDBC/Hortonworks_Hive13/slf4j-api-1.7.5.jar > /D:/JDBC/Hortonworks_Hive13/slf4j-log4j12-1.7.5.jar -- This message was sent by Atlassian JIRA (v6.3.4#6332)
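The exception in the trace above is standard JDK behavior: ReentrantLock.unlock() throws IllegalMonitorStateException when the calling thread does not hold the lock. The Java sketch below reproduces that and shows one defensive guard; UnlockGuardDemo is hypothetical and is not the change made to HiveStatement.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo of the JDK behavior; not Hive's HiveStatement code.
final class UnlockGuardDemo {
    /** Unlocking a lock the current thread does not hold throws IllegalMonitorStateException. */
    static boolean unguardedUnlockThrows() {
        ReentrantLock lock = new ReentrantLock();
        try {
            lock.unlock();
            return false;
        } catch (IllegalMonitorStateException expected) {
            return true;
        }
    }

    /** Releases the lock only if this thread owns it; returns whether it was released. */
    static boolean guardedUnlock(ReentrantLock lock) {
        if (lock.isHeldByCurrentThread()) {
            lock.unlock();
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(unguardedUnlockThrows());

        ReentrantLock lock = new ReentrantLock();
        System.out.println(guardedUnlock(lock)); // not held, nothing to release
        lock.lock();
        System.out.println(guardedUnlock(lock)); // held, released
    }
}
```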
[jira] [Commented] (HIVE-9632) inconsistent results between year(), month(), day(), and the actual values in formulas
[ https://issues.apache.org/jira/browse/HIVE-9632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14317746#comment-14317746 ] Navis commented on HIVE-9632: - Looks like HIVE-9278. Could you check this in hive-1.0? > inconsistent results between year(), month(), day(), and the actual values in > formulas > -- > > Key: HIVE-9632 > URL: https://issues.apache.org/jira/browse/HIVE-9632 > Project: Hive > Issue Type: Bug > Components: CLI >Affects Versions: 0.14.0 > Environment: CentOS 6.5, HDP 2.2 >Reporter: Robert Miller > > In wanting to create a date dimension value which would match our existing > database environment, I figured I would be able to do as I have done in the > past and use the following formula: > (year(date)*1)+(month(date)*100)+day(date) > Given the date of 2015-01-09, the above formula should result in a value of > 20150109. Instead, the resulting value is 20353515. > SELECT > > adjusted_activity_date_utc, > > year(adjusted_activity_date_utc), > > month(adjusted_activity_date_utc), > > day(adjusted_activity_date_utc), > > > (year(adjusted_activity_date_utc)*1)+(month(adjusted_activity_date_utc)*100)+day(adjusted_activity_date_utc), > > (year(adjusted_activity_date_utc)*1), > > (month(adjusted_activity_date_utc)*100), > > day(adjusted_activity_date_utc) > > from event_histories limit 5; > OK > adjusted_activity_date_utc_c1 _c2 _c3 _c4 _c5 _c6 > _c7 > 2015-01-0920151 9 203535152015100 > 9 > 2015-01-0920151 9 203535152015100 > 9 > 2015-01-0920151 9 203535152015100 > 9 > 2015-01-0920151 9 203535152015100 > 9 > 2015-01-0920151 9 203535152015100 > 9 > Oddly enough, this works as expected when a specific date value is used for > the column. > I have tried this with partition and non-partition columns and found the > result to be the same. 
> SELECT > > adjusted_activity_date_utc, > > year(adjusted_activity_date_utc), > > month(adjusted_activity_date_utc), > > day(adjusted_activity_date_utc), > > > (year(adjusted_activity_date_utc)*1)+(month(adjusted_activity_date_utc)*100)+day(adjusted_activity_date_utc), > > (year(adjusted_activity_date_utc)*1), > > (month(adjusted_activity_date_utc)*100), > > day(adjusted_activity_date_utc) > > from event_histories > > where adjusted_activity_date_utc = '2015-01-09' > > limit 5; > OK > adjusted_activity_date_utc_c1 _c2 _c3 _c4 _c5 _c6 > _c7 > 2015-01-0920151 9 201501092015100 > 9 > 2015-01-0920151 9 201501092015100 > 9 > 2015-01-0920151 9 201501092015100 > 9 > 2015-01-0920151 9 201501092015100 > 9 > 2015-01-0920151 9 201501092015100 > 9 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
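For reference, the arithmetic the reporter expects can be checked outside Hive. The sketch below is not Hive code. It assumes the intended year multiplier is 10000 (the archived formula shows "*1", which cannot produce 20150109 and may be extraction damage), and the helper name dateKey is hypothetical.

```java
import java.time.LocalDate;

class DateKeySketch {
    // Hypothetical helper: packs a date into a yyyyMMdd integer.
    // Assumption: the year multiplier is 10000, inferred from the
    // expected result 20150109 quoted in the report.
    static int dateKey(LocalDate d) {
        return d.getYear() * 10000 + d.getMonthValue() * 100 + d.getDayOfMonth();
    }

    public static void main(String[] args) {
        // 2015 * 10000 + 1 * 100 + 9
        System.out.println(dateKey(LocalDate.of(2015, 1, 9))); // prints 20150109
    }
}
```

If the unfiltered query returns anything else for the same row, the difference comes from how the expression is evaluated, not from the formula itself.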
[jira] [Updated] (HIVE-9138) Add some explain to PTF operator
[ https://issues.apache.org/jira/browse/HIVE-9138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9138: Attachment: HIVE-9138.4.patch.txt Missed one file > Add some explain to PTF operator > > > Key: HIVE-9138 > URL: https://issues.apache.org/jira/browse/HIVE-9138 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-9138.1.patch.txt, HIVE-9138.2.patch.txt, > HIVE-9138.3.patch.txt, HIVE-9138.4.patch.txt > > > PTFOperator does not explain anything in explain statement, making it hard to > understand the internal works. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9138) Add some explain to PTF operator
[ https://issues.apache.org/jira/browse/HIVE-9138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9138: Attachment: (was: HIVE-9138.4.patch.txt) > Add some explain to PTF operator > > > Key: HIVE-9138 > URL: https://issues.apache.org/jira/browse/HIVE-9138 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-9138.1.patch.txt, HIVE-9138.2.patch.txt, > HIVE-9138.3.patch.txt > > > PTFOperator does not explain anything in explain statement, making it hard to > understand the internal works. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9138) Add some explain to PTF operator
[ https://issues.apache.org/jira/browse/HIVE-9138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14317645#comment-14317645 ] Navis commented on HIVE-9138: - Wish HIVE-6470 applied to trunk some day. I hate bad indentation. > Add some explain to PTF operator > > > Key: HIVE-9138 > URL: https://issues.apache.org/jira/browse/HIVE-9138 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-9138.1.patch.txt, HIVE-9138.2.patch.txt, > HIVE-9138.3.patch.txt, HIVE-9138.4.patch.txt > > > PTFOperator does not explain anything in explain statement, making it hard to > understand the internal works. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9138) Add some explain to PTF operator
[ https://issues.apache.org/jira/browse/HIVE-9138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9138: Attachment: HIVE-9138.4.patch.txt Addressed comments > Add some explain to PTF operator > > > Key: HIVE-9138 > URL: https://issues.apache.org/jira/browse/HIVE-9138 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-9138.1.patch.txt, HIVE-9138.2.patch.txt, > HIVE-9138.3.patch.txt, HIVE-9138.4.patch.txt > > > PTFOperator does not explain anything in explain statement, making it hard to > understand the internal works. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9618) Deduplicate RS keys for ptf/windowing
[ https://issues.apache.org/jira/browse/HIVE-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9618: Attachment: HIVE-9618.3.patch.txt > Deduplicate RS keys for ptf/windowing > - > > Key: HIVE-9618 > URL: https://issues.apache.org/jira/browse/HIVE-9618 > Project: Hive > Issue Type: Improvement > Components: PTF-Windowing >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-9618.1.patch.txt, HIVE-9618.2.patch.txt, > HIVE-9618.3.patch.txt > > > Currently, partition spec containing same column for partition-by and > order-by makes duplicated key column for RS. For example, > {noformat} > explain > select p_mfgr, p_name, p_size, > rank() over (partition by p_mfgr order by p_name) as r, > dense_rank() over (partition by p_mfgr order by p_name) as dr, > sum(p_retailprice) over (partition by p_mfgr order by p_name rows between > unbounded preceding and current row) as s1 > from noop(on noopwithmap(on noop(on part > partition by p_mfgr > order by p_mfgr, p_name > ))) > {noformat} > "partition by p_mfgr order by p_mfgr, p_name" makes duplicated key columns > like below > {noformat} > Reduce Output Operator > key expressions: p_mfgr (type: string), p_mfgr (type: string), p_name > (type: string) > sort order: +++ > Map-reduce partition columns: p_mfgr (type: string) > value expressions: p_size (type: int), p_retailprice (type: double) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
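The deduplication described above amounts to keeping the first occurrence of each key expression while preserving order. What follows is a minimal sketch of that idea only, not the code in the attached patches. The class and method names are hypothetical, and key expressions are modelled as plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class RsKeyDedupSketch {
    // Partition-by columns come first, then order-by columns.
    // LinkedHashSet drops repeats while keeping first-seen order.
    static List<String> dedupKeys(List<String> partitionBy, List<String> orderBy) {
        LinkedHashSet<String> keys = new LinkedHashSet<>(partitionBy);
        keys.addAll(orderBy);
        return new ArrayList<>(keys);
    }

    public static void main(String[] args) {
        // "partition by p_mfgr order by p_mfgr, p_name"
        List<String> keys = dedupKeys(Arrays.asList("p_mfgr"),
                                      Arrays.asList("p_mfgr", "p_name"));
        System.out.println(keys); // prints [p_mfgr, p_name]
    }
}
```

A real implementation would also have to carry the sort order of each key along with it; this sketch ignores that.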
[jira] [Updated] (HIVE-9618) Deduplicate RS keys for ptf/windowing
[ https://issues.apache.org/jira/browse/HIVE-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9618: Status: Patch Available (was: Open) Rebased to trunk > Deduplicate RS keys for ptf/windowing > - > > Key: HIVE-9618 > URL: https://issues.apache.org/jira/browse/HIVE-9618 > Project: Hive > Issue Type: Improvement > Components: PTF-Windowing >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-9618.1.patch.txt, HIVE-9618.2.patch.txt, > HIVE-9618.3.patch.txt > > > Currently, partition spec containing same column for partition-by and > order-by makes duplicated key column for RS. For example, > {noformat} > explain > select p_mfgr, p_name, p_size, > rank() over (partition by p_mfgr order by p_name) as r, > dense_rank() over (partition by p_mfgr order by p_name) as dr, > sum(p_retailprice) over (partition by p_mfgr order by p_name rows between > unbounded preceding and current row) as s1 > from noop(on noopwithmap(on noop(on part > partition by p_mfgr > order by p_mfgr, p_name > ))) > {noformat} > "partition by p_mfgr order by p_mfgr, p_name" makes duplicated key columns > like below > {noformat} > Reduce Output Operator > key expressions: p_mfgr (type: string), p_mfgr (type: string), p_name > (type: string) > sort order: +++ > Map-reduce partition columns: p_mfgr (type: string) > value expressions: p_size (type: int), p_retailprice (type: double) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9138) Add some explain to PTF operator
[ https://issues.apache.org/jira/browse/HIVE-9138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14317444#comment-14317444 ] Navis commented on HIVE-9138: - Explainable was introduced to avoid implementing Serializable just for the explain result. I can remove it, but then PTFInputDef, etc. would have to be Serializable. The changes in ColumnPruner are basically for setting the output shape of the partition function for explain. Those are transient fields used only when the PTF is first built, so it seemed safe to change them. > Add some explain to PTF operator > > > Key: HIVE-9138 > URL: https://issues.apache.org/jira/browse/HIVE-9138 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-9138.1.patch.txt, HIVE-9138.2.patch.txt, > HIVE-9138.3.patch.txt > > > PTFOperator does not explain anything in explain statement, making it hard to > understand the internal works. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-2573) Create per-session function registry
[ https://issues.apache.org/jira/browse/HIVE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-2573: Attachment: HIVE-2573.15.patch.txt Addressed comments (except one); could not reproduce the failures in TestMacroSemanticAnalyzer. > Create per-session function registry > - > > Key: HIVE-2573 > URL: https://issues.apache.org/jira/browse/HIVE-2573 > Project: Hive > Issue Type: Improvement > Components: Server Infrastructure >Reporter: Navis >Assignee: Navis >Priority: Minor > Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2573.D3231.1.patch, > HIVE-2573.1.patch.txt, HIVE-2573.10.patch.txt, HIVE-2573.11.patch.txt, > HIVE-2573.12.patch.txt, HIVE-2573.13.patch.txt, HIVE-2573.14.patch.txt, > HIVE-2573.15.patch.txt, HIVE-2573.2.patch.txt, HIVE-2573.3.patch.txt, > HIVE-2573.4.patch.txt, HIVE-2573.5.patch, HIVE-2573.6.patch, > HIVE-2573.7.patch, HIVE-2573.8.patch.txt, HIVE-2573.9.patch.txt > > > Currently the function registry is shared resource and could be overrided by > other users when using HiveServer. If per-session function registry is > provided, this situation could be prevented. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
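The idea in the description can be pictured as a small per-session map layered over a shared one, so that a function registered by one user cannot replace what another user sees. This is a sketch under assumptions, not the design in the attached patches: the class name, the method names, and the use of strings for function implementations are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class SessionRegistrySketch {
    // Shared, system-wide functions visible to every session.
    private static final Map<String, String> SYSTEM = new ConcurrentHashMap<>();
    // Functions registered by this session only.
    private final Map<String, String> session = new HashMap<>();

    static void registerSystem(String name, String impl) {
        SYSTEM.put(name.toLowerCase(), impl);
    }

    void register(String name, String impl) {
        session.put(name.toLowerCase(), impl);
    }

    // Session entries shadow system entries; other sessions are unaffected.
    String lookup(String name) {
        String key = name.toLowerCase();
        String impl = session.get(key);
        return impl != null ? impl : SYSTEM.get(key);
    }

    public static void main(String[] args) {
        registerSystem("upper", "builtin.Upper");
        SessionRegistrySketch a = new SessionRegistrySketch();
        SessionRegistrySketch b = new SessionRegistrySketch();
        a.register("upper", "userA.Upper");
        System.out.println(a.lookup("upper")); // prints userA.Upper
        System.out.println(b.lookup("upper")); // prints builtin.Upper
    }
}
```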
[jira] [Commented] (HIVE-9507) Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls
[ https://issues.apache.org/jira/browse/HIVE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313743#comment-14313743 ] Navis commented on HIVE-9507: - Yes, just fixed NPE. > Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls > > > Key: HIVE-9507 > URL: https://issues.apache.org/jira/browse/HIVE-9507 > Project: Hive > Issue Type: Bug > Components: Query Processor, UDF >Affects Versions: 0.14.0 > Environment: hdp 2.2 > Windows server 2012 R2 64-bit >Reporter: Moustafa Aboul Atta >Assignee: Navis >Priority: Minor > Fix For: 1.2.0 > > Attachments: HIVE-9507.1.patch.txt, HIVE-9507.2.patch.txt, > HIVE-9507.3.patch.txt, parial_log.log > > > I have tweets stored with avro on hdfs with the default twitter status > (tweet) schema. > There's an object called "entities" that contains arrays of structs. > When I run > > {{SELECT mytable.*}} > {{FROM tweets}} > {{LATERAL VIEW INLINE(entities.media) mytable}} > I get the exception attached as partial_log.log, however, if I add > {{WHERE entities.media IS NOT NULL}} > it runs perfectly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9507) Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls
[ https://issues.apache.org/jira/browse/HIVE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9507: Resolution: Fixed Fix Version/s: 1.2.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Ashutosh. > Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls > > > Key: HIVE-9507 > URL: https://issues.apache.org/jira/browse/HIVE-9507 > Project: Hive > Issue Type: Bug > Components: Query Processor, UDF >Affects Versions: 0.14.0 > Environment: hdp 2.2 > Windows server 2012 R2 64-bit >Reporter: Moustafa Aboul Atta >Assignee: Navis >Priority: Minor > Fix For: 1.2.0 > > Attachments: HIVE-9507.1.patch.txt, HIVE-9507.2.patch.txt, > HIVE-9507.3.patch.txt, parial_log.log > > > I have tweets stored with avro on hdfs with the default twitter status > (tweet) schema. > There's an object called "entities" that contains arrays of structs. > When I run > > {{SELECT mytable.*}} > {{FROM tweets}} > {{LATERAL VIEW INLINE(entities.media) mytable}} > I get the exception attached as partial_log.log, however, if I add > {{WHERE entities.media IS NOT NULL}} > it runs perfectly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
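The behaviour asked for here can be illustrated without Hive. The sketch below models inline() over an array of structs and treats a null array as producing no rows, which matches the result of the reporter's "IS NOT NULL" workaround. It is an assumption about the intended semantics, not the committed patch; the class name and the use of Object[] for structs are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class NullTolerantInlineSketch {
    // Models inline(array<struct>): each struct (here an Object[]) becomes one row.
    // A null array yields no rows instead of throwing a NullPointerException.
    static List<Object[]> inline(List<Object[]> structs) {
        if (structs == null) {
            return Collections.emptyList();
        }
        return new ArrayList<>(structs);
    }

    public static void main(String[] args) {
        List<Object[]> media = Arrays.asList(
                new Object[] {1L, "photo"},
                new Object[] {2L, "video"});
        System.out.println(inline(media).size()); // prints 2
        System.out.println(inline(null).size());  // prints 0
    }
}
```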
[jira] [Updated] (HIVE-9618) Deduplicate RS keys for ptf/windowing
[ https://issues.apache.org/jira/browse/HIVE-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9618: Attachment: HIVE-9618.2.patch.txt Addressed comment & updated gold file > Deduplicate RS keys for ptf/windowing > - > > Key: HIVE-9618 > URL: https://issues.apache.org/jira/browse/HIVE-9618 > Project: Hive > Issue Type: Improvement > Components: PTF-Windowing >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-9618.1.patch.txt, HIVE-9618.2.patch.txt > > > Currently, partition spec containing same column for partition-by and > order-by makes duplicated key column for RS. For example, > {noformat} > explain > select p_mfgr, p_name, p_size, > rank() over (partition by p_mfgr order by p_name) as r, > dense_rank() over (partition by p_mfgr order by p_name) as dr, > sum(p_retailprice) over (partition by p_mfgr order by p_name rows between > unbounded preceding and current row) as s1 > from noop(on noopwithmap(on noop(on part > partition by p_mfgr > order by p_mfgr, p_name > ))) > {noformat} > "partition by p_mfgr order by p_mfgr, p_name" makes duplicated key columns > like below > {noformat} > Reduce Output Operator > key expressions: p_mfgr (type: string), p_mfgr (type: string), p_name > (type: string) > sort order: +++ > Map-reduce partition columns: p_mfgr (type: string) > value expressions: p_size (type: int), p_retailprice (type: double) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9486) Use session classloader instead of application loader
[ https://issues.apache.org/jira/browse/HIVE-9486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313515#comment-14313515 ] Navis commented on HIVE-9486: - [~szehon] I considered using 'Utilities.getSessionSpecifiedClassLoader', but it seemed better to have one in the common module using JavaUtils.getClassLoader(), which is safe to call without hive-exec or other modules. We modify SessionState.HiveConf.ClassLoader and the thread context loader together (at least in Hive), so the result would be the same. Better idea? > Use session classloader instead of application loader > - > > Key: HIVE-9486 > URL: https://issues.apache.org/jira/browse/HIVE-9486 > Project: Hive > Issue Type: Bug > Components: Query Processor >Reporter: Navis >Assignee: Navis >Priority: Minor > Attachments: HIVE-9486.1.patch.txt, HIVE-9486.2.patch.txt > > > From http://www.mail-archive.com/dev@hive.apache.org/msg107615.html > Looks reasonable -- This message was sent by Atlassian JIRA (v6.3.4#6332)
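The general pattern under discussion is to resolve classes through the thread context class loader, which a session can replace, and to fall back to the loader of the calling class only when none is set. The sketch below shows that pattern with standard JDK calls only. It is not the body of JavaUtils.getClassLoader(), which does not appear in this thread, and the class name is hypothetical.

```java
class SessionClassLoaderSketch {
    // Prefer the thread context class loader (a session may have replaced it);
    // fall back to the loader that loaded this class.
    static ClassLoader getClassLoader() {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        return loader != null ? loader : SessionClassLoaderSketch.class.getClassLoader();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Resolve a class by name through the chosen loader.
        Class<?> c = Class.forName("java.util.ArrayList", true, getClassLoader());
        System.out.println(c.getName()); // prints java.util.ArrayList
    }
}
```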
[jira] [Updated] (HIVE-9507) Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls
[ https://issues.apache.org/jira/browse/HIVE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9507: Attachment: HIVE-9507.3.patch.txt > Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls > > > Key: HIVE-9507 > URL: https://issues.apache.org/jira/browse/HIVE-9507 > Project: Hive > Issue Type: Bug > Components: Query Processor, UDF >Affects Versions: 0.14.0 > Environment: hdp 2.2 > Windows server 2012 R2 64-bit >Reporter: Moustafa Aboul Atta >Assignee: Navis >Priority: Minor > Attachments: HIVE-9507.1.patch.txt, HIVE-9507.2.patch.txt, > HIVE-9507.3.patch.txt, parial_log.log > > > I have tweets stored with avro on hdfs with the default twitter status > (tweet) schema. > There's an object called "entities" that contains arrays of structs. > When I run > > {{SELECT mytable.*}} > {{FROM tweets}} > {{LATERAL VIEW INLINE(entities.media) mytable}} > I get the exception attached as partial_log.log, however, if I add > {{WHERE entities.media IS NOT NULL}} > it runs perfectly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9618) Deduplicate RS keys for ptf/windowing
[ https://issues.apache.org/jira/browse/HIVE-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9618: Attachment: HIVE-9618.1.patch.txt > Deduplicate RS keys for ptf/windowing > - > > Key: HIVE-9618 > URL: https://issues.apache.org/jira/browse/HIVE-9618 > Project: Hive > Issue Type: Improvement > Components: PTF-Windowing >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-9618.1.patch.txt > > > Currently, partition spec containing same column for partition-by and > order-by makes duplicated key column for RS. For example, > {noformat} > explain > select p_mfgr, p_name, p_size, > rank() over (partition by p_mfgr order by p_name) as r, > dense_rank() over (partition by p_mfgr order by p_name) as dr, > sum(p_retailprice) over (partition by p_mfgr order by p_name rows between > unbounded preceding and current row) as s1 > from noop(on noopwithmap(on noop(on part > partition by p_mfgr > order by p_mfgr, p_name > ))) > {noformat} > "partition by p_mfgr order by p_mfgr, p_name" makes duplicated key columns > like below > {noformat} > Reduce Output Operator > key expressions: p_mfgr (type: string), p_mfgr (type: string), p_name > (type: string) > sort order: +++ > Map-reduce partition columns: p_mfgr (type: string) > value expressions: p_size (type: int), p_retailprice (type: double) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9618) Deduplicate RS keys for ptf/windowing
[ https://issues.apache.org/jira/browse/HIVE-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9618: Status: Patch Available (was: Open) > Deduplicate RS keys for ptf/windowing > - > > Key: HIVE-9618 > URL: https://issues.apache.org/jira/browse/HIVE-9618 > Project: Hive > Issue Type: Improvement > Components: PTF-Windowing >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-9618.1.patch.txt > > > Currently, partition spec containing same column for partition-by and > order-by makes duplicated key column for RS. For example, > {noformat} > explain > select p_mfgr, p_name, p_size, > rank() over (partition by p_mfgr order by p_name) as r, > dense_rank() over (partition by p_mfgr order by p_name) as dr, > sum(p_retailprice) over (partition by p_mfgr order by p_name rows between > unbounded preceding and current row) as s1 > from noop(on noopwithmap(on noop(on part > partition by p_mfgr > order by p_mfgr, p_name > ))) > {noformat} > "partition by p_mfgr order by p_mfgr, p_name" makes duplicated key columns > like below > {noformat} > Reduce Output Operator > key expressions: p_mfgr (type: string), p_mfgr (type: string), p_name > (type: string) > sort order: +++ > Map-reduce partition columns: p_mfgr (type: string) > value expressions: p_size (type: int), p_retailprice (type: double) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9618) Deduplicate RS keys for ptf/windowing
Navis created HIVE-9618: --- Summary: Deduplicate RS keys for ptf/windowing Key: HIVE-9618 URL: https://issues.apache.org/jira/browse/HIVE-9618 Project: Hive Issue Type: Improvement Components: PTF-Windowing Reporter: Navis Assignee: Navis Priority: Trivial Currently, partition spec containing same column for partition-by and order-by makes duplicated key column for RS. For example, {noformat} explain select p_mfgr, p_name, p_size, rank() over (partition by p_mfgr order by p_name) as r, dense_rank() over (partition by p_mfgr order by p_name) as dr, sum(p_retailprice) over (partition by p_mfgr order by p_name rows between unbounded preceding and current row) as s1 from noop(on noopwithmap(on noop(on part partition by p_mfgr order by p_mfgr, p_name ))) {noformat} "partition by p_mfgr order by p_mfgr, p_name" makes duplicated key columns like below {noformat} Reduce Output Operator key expressions: p_mfgr (type: string), p_mfgr (type: string), p_name (type: string) sort order: +++ Map-reduce partition columns: p_mfgr (type: string) value expressions: p_size (type: int), p_retailprice (type: double) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9228) Problem with subquery using windowing functions
[ https://issues.apache.org/jira/browse/HIVE-9228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9228: Attachment: HIVE-9228.3.patch.txt Updated gold file > Problem with subquery using windowing functions > --- > > Key: HIVE-9228 > URL: https://issues.apache.org/jira/browse/HIVE-9228 > Project: Hive > Issue Type: Bug > Components: PTF-Windowing >Affects Versions: 0.13.1 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-9228.1.patch.txt, HIVE-9228.2.patch.txt, > HIVE-9228.3.patch.txt, create_table_tab1.sql, tab1.csv > > Original Estimate: 96h > Remaining Estimate: 96h > > The following query with window functions failed. The internal query works > fine. > select col1, col2, col3 from (select col1,col2, col3, count(case when col4=1 > then 1 end ) over (partition by col1, col2) as col5, row_number() over > (partition by col1, col2 order by col4) as col6 from tab1) t; > HIVE generates an execution plan with 2 jobs. > 1. The first job is to basically calculate window function for col5. > 2. The second job is to calculate window function for col6 and output. > The plan says the first job outputs the columns (col1, col2, col3, col4) to a > tmp file since only these columns are used in later stage. While, the PTF > operator for the first job outputs (_wcol0, col1, col2, col3, col4) with > _wcol0 as the result of the window function even it's not used. > In the second job, the map operator still reads the 4 columns (col1, col2, > col3, col4) from the temp file using the plan. That causes the exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9228) Problem with subquery using windowing functions
[ https://issues.apache.org/jira/browse/HIVE-9228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311850#comment-14311850 ] Navis commented on HIVE-9228: - [~aihuaxu] Sorry for breaking in on this issue. I've been working on the code around CP for other issues and did not want others to spend time understanding the complicated PTF operation. I think the fix is almost done. Sorry again. > Problem with subquery using windowing functions > --- > > Key: HIVE-9228 > URL: https://issues.apache.org/jira/browse/HIVE-9228 > Project: Hive > Issue Type: Bug > Components: PTF-Windowing >Affects Versions: 0.13.1 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-9228.1.patch.txt, HIVE-9228.2.patch.txt, > create_table_tab1.sql, tab1.csv > > Original Estimate: 96h > Remaining Estimate: 96h > > The following query with window functions failed. The internal query works > fine. > select col1, col2, col3 from (select col1,col2, col3, count(case when col4=1 > then 1 end ) over (partition by col1, col2) as col5, row_number() over > (partition by col1, col2 order by col4) as col6 from tab1) t; > HIVE generates an execution plan with 2 jobs. > 1. The first job is to basically calculate window function for col5. > 2. The second job is to calculate window function for col6 and output. > The plan says the first job outputs the columns (col1, col2, col3, col4) to a > tmp file since only these columns are used in later stage. While, the PTF > operator for the first job outputs (_wcol0, col1, col2, col3, col4) with > _wcol0 as the result of the window function even it's not used. > In the second job, the map operator still reads the 4 columns (col1, col2, > col3, col4) from the temp file using the plan. That causes the exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9615) Provide limit context for storage handlers
[ https://issues.apache.org/jira/browse/HIVE-9615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9615: Description: Propagate limit context generated from GlobalLimitOptimizer to storage handlers. (was: Propagate limit context generated from GlobalLimitOptimizer to strorage handlers.) > Provide limit context for storage handlers > -- > > Key: HIVE-9615 > URL: https://issues.apache.org/jira/browse/HIVE-9615 > Project: Hive > Issue Type: Improvement > Components: StorageHandler >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-9615.1.patch.txt > > > Propagate limit context generated from GlobalLimitOptimizer to storage > handlers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9615) Provide limit context for storage handlers
[ https://issues.apache.org/jira/browse/HIVE-9615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9615: Attachment: HIVE-9615.1.patch.txt Old patch found from git stash > Provide limit context for storage handlers > -- > > Key: HIVE-9615 > URL: https://issues.apache.org/jira/browse/HIVE-9615 > Project: Hive > Issue Type: Improvement > Components: StorageHandler >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-9615.1.patch.txt > > > Propagate limit context generated from GlobalLimitOptimizer to strorage > handlers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9615) Provide limit context for storage handlers
[ https://issues.apache.org/jira/browse/HIVE-9615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9615: Status: Patch Available (was: Open) > Provide limit context for storage handlers > -- > > Key: HIVE-9615 > URL: https://issues.apache.org/jira/browse/HIVE-9615 > Project: Hive > Issue Type: Improvement > Components: StorageHandler >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-9615.1.patch.txt > > > Propagate limit context generated from GlobalLimitOptimizer to strorage > handlers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9615) Provide limit context for storage handlers
Navis created HIVE-9615: --- Summary: Provide limit context for storage handlers Key: HIVE-9615 URL: https://issues.apache.org/jira/browse/HIVE-9615 Project: Hive Issue Type: Improvement Components: StorageHandler Reporter: Navis Assignee: Navis Priority: Trivial Propagate limit context generated from GlobalLimitOptimizer to strorage handlers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-3050) JDBC should provide metadata for columns whether a column is a partition column or not
[ https://issues.apache.org/jira/browse/HIVE-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-3050: Attachment: HIVE-3050.1.patch.txt > JDBC should provide metadata for columns whether a column is a partition > column or not > -- > > Key: HIVE-3050 > URL: https://issues.apache.org/jira/browse/HIVE-3050 > Project: Hive > Issue Type: Improvement > Components: JDBC >Affects Versions: 0.10.0 >Reporter: Navis >Assignee: Navis >Priority: Minor > Attachments: HIVE-3050.1.patch.txt > > > Trivial request from UI developers. > {code} > DatabaseMetaData databaseMetaData = connection.getMetaData(); > ResultSet rs = databaseMetaData.getColumns(null, null, "tableName", null); > > boolean partitionKey = rs.getBoolean("IS_PARTITION_COLUMN"); > {code} > It's not JDBC standard column but seemed to be useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
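The snippet in the description reads from the ResultSet without advancing the cursor. A fuller sketch of how a UI might consume the proposed column is given below. It assumes the driver exposes the non-standard IS_PARTITION_COLUMN column requested in this issue; COLUMN_NAME is standard JDBC. The class and method names are hypothetical.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class PartitionColumnsSketch {
    // Collects the names of columns flagged as partition columns.
    // `columns` is expected to come from DatabaseMetaData.getColumns(...).
    // Assumption: IS_PARTITION_COLUMN is present, as proposed in this issue.
    static List<String> partitionColumns(ResultSet columns) throws SQLException {
        List<String> names = new ArrayList<>();
        while (columns.next()) { // the cursor starts before the first row
            if (columns.getBoolean("IS_PARTITION_COLUMN")) {
                names.add(columns.getString("COLUMN_NAME"));
            }
        }
        return names;
    }
}
```

A caller would pass in the result of connection.getMetaData().getColumns(null, null, "tableName", null) and close the ResultSet afterwards, for example with try-with-resources.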
[jira] [Updated] (HIVE-9499) hive.limit.query.max.table.partition makes queries fail on non-partitioned tables
[ https://issues.apache.org/jira/browse/HIVE-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9499: Attachment: HIVE-9499.3.patch.txt Rebased to trunk > hive.limit.query.max.table.partition makes queries fail on non-partitioned > tables > - > > Key: HIVE-9499 > URL: https://issues.apache.org/jira/browse/HIVE-9499 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0 >Reporter: Alexander Kasper >Assignee: Navis > Attachments: HIVE-9499.1.patch.txt, HIVE-9499.2.patch.txt, > HIVE-9499.3.patch.txt > > > If you use hive.limit.query.max.table.partition to limit the amount of > partitions that can be queried it makes queries on non-partitioned tables > fail. > Example: > {noformat} > CREATE TABLE tmp(test INT); > SELECT COUNT(*) FROM TMP; -- works fine > SET hive.limit.query.max.table.partition=20; > SELECT COUNT(*) FROM TMP; -- generates NPE (FAILED: NullPointerException null) > SET hive.limit.query.max.table.partition=-1; > SELECT COUNT(*) FROM TMP; -- works fine again > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
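The thread does not show where the NullPointerException is raised, so the following is only an illustration of the kind of null-safe check the report calls for. It assumes a non-partitioned table is represented by a missing (null) partition count and that a negative limit means "disabled", as the -1 setting in the example suggests. All names are hypothetical.

```java
class PartitionLimitSketch {
    // Returns true when the query may proceed.
    // Assumption: partitionCount is null for a non-partitioned table.
    static boolean withinLimit(Integer partitionCount, int maxPartitions) {
        if (maxPartitions < 0 || partitionCount == null) {
            return true; // limit disabled, or there are no partitions to count
        }
        return partitionCount <= maxPartitions;
    }

    public static void main(String[] args) {
        System.out.println(withinLimit(null, 20)); // prints true
        System.out.println(withinLimit(25, 20));   // prints false
        System.out.println(withinLimit(25, -1));   // prints true
    }
}
```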
[jira] [Updated] (HIVE-9507) Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls
[ https://issues.apache.org/jira/browse/HIVE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9507: Attachment: HIVE-9507.2.patch.txt Reattaching for test > Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls > > > Key: HIVE-9507 > URL: https://issues.apache.org/jira/browse/HIVE-9507 > Project: Hive > Issue Type: Bug > Components: Query Processor, UDF >Affects Versions: 0.14.0 > Environment: hdp 2.2 > Windows server 2012 R2 64-bit >Reporter: Moustafa Aboul Atta >Assignee: Navis >Priority: Minor > Attachments: HIVE-9507.1.patch.txt, HIVE-9507.2.patch.txt, > parial_log.log > > > I have tweets stored with avro on hdfs with the default twitter status > (tweet) schema. > There's an object called "entities" that contains arrays of structs. > When I run > > {{SELECT mytable.*}} > {{FROM tweets}} > {{LATERAL VIEW INLINE(entities.media) mytable}} > I get the exception attached as partial_log.log, however, if I add > {{WHERE entities.media IS NOT NULL}} > it runs perfectly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9513) NULL POINTER EXCEPTION
[ https://issues.apache.org/jira/browse/HIVE-9513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9513: Attachment: HIVE-9513.2.patch.txt > NULL POINTER EXCEPTION > -- > > Key: HIVE-9513 > URL: https://issues.apache.org/jira/browse/HIVE-9513 > Project: Hive > Issue Type: Bug > Components: CLI >Affects Versions: 0.13.1 >Reporter: ErwanMAS >Assignee: Navis > Attachments: HIVE-9513.1.patch.txt, HIVE-9513.2.patch.txt > > > NPE during parsing of: > {noformat} > select * from ( > select * from ( select 1 as id , "foo" as str_1 from staging.dual ) f > union all > select * from ( select 2 as id , "bar" as str_2 from staging.dual ) g > ) e ; > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-2573) Create per-session function registry
[ https://issues.apache.org/jira/browse/HIVE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-2573: Attachment: (was: HIVE-2573.14.patch.txt) > Create per-session function registry > - > > Key: HIVE-2573 > URL: https://issues.apache.org/jira/browse/HIVE-2573 > Project: Hive > Issue Type: Improvement > Components: Server Infrastructure >Reporter: Navis >Assignee: Navis >Priority: Minor > Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2573.D3231.1.patch, > HIVE-2573.1.patch.txt, HIVE-2573.10.patch.txt, HIVE-2573.11.patch.txt, > HIVE-2573.12.patch.txt, HIVE-2573.13.patch.txt, HIVE-2573.14.patch.txt, > HIVE-2573.2.patch.txt, HIVE-2573.3.patch.txt, HIVE-2573.4.patch.txt, > HIVE-2573.5.patch, HIVE-2573.6.patch, HIVE-2573.7.patch, > HIVE-2573.8.patch.txt, HIVE-2573.9.patch.txt > > > Currently the function registry is shared resource and could be overrided by > other users when using HiveServer. If per-session function registry is > provided, this situation could be prevented. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-2573) Create per-session function registry
[ https://issues.apache.org/jira/browse/HIVE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-2573: Attachment: HIVE-2573.14.patch.txt > Create per-session function registry > - > > Key: HIVE-2573 > URL: https://issues.apache.org/jira/browse/HIVE-2573 > Project: Hive > Issue Type: Improvement > Components: Server Infrastructure >Reporter: Navis >Assignee: Navis >Priority: Minor > Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2573.D3231.1.patch, > HIVE-2573.1.patch.txt, HIVE-2573.10.patch.txt, HIVE-2573.11.patch.txt, > HIVE-2573.12.patch.txt, HIVE-2573.13.patch.txt, HIVE-2573.14.patch.txt, > HIVE-2573.2.patch.txt, HIVE-2573.3.patch.txt, HIVE-2573.4.patch.txt, > HIVE-2573.5.patch, HIVE-2573.6.patch, HIVE-2573.7.patch, > HIVE-2573.8.patch.txt, HIVE-2573.9.patch.txt > > > Currently the function registry is shared resource and could be overrided by > other users when using HiveServer. If per-session function registry is > provided, this situation could be prevented. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-2573) Create per-session function registry
[ https://issues.apache.org/jira/browse/HIVE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-2573:
    Attachment: HIVE-2573.14.patch.txt

Forgot about this for a long time. Rebased to trunk.

> Create per-session function registry
> ------------------------------------
>
> Key: HIVE-2573
> URL: https://issues.apache.org/jira/browse/HIVE-2573
> Project: Hive
> Issue Type: Improvement
> Components: Server Infrastructure
> Reporter: Navis
> Assignee: Navis
> Priority: Minor
> Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2573.D3231.1.patch,
> HIVE-2573.1.patch.txt, HIVE-2573.10.patch.txt, HIVE-2573.11.patch.txt,
> HIVE-2573.12.patch.txt, HIVE-2573.13.patch.txt, HIVE-2573.14.patch.txt,
> HIVE-2573.2.patch.txt, HIVE-2573.3.patch.txt, HIVE-2573.4.patch.txt,
> HIVE-2573.5.patch, HIVE-2573.6.patch, HIVE-2573.7.patch,
> HIVE-2573.8.patch.txt, HIVE-2573.9.patch.txt
>
> Currently the function registry is a shared resource and could be overridden by
> other users when using HiveServer. If a per-session function registry is
> provided, this situation could be prevented.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
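The idea in the HIVE-2573 description can be pictured with a small sketch: each session gets its own registry that is consulted first and falls back to a shared one, so one user's registration cannot replace what another user sees. The class and method names below (`SessionRegistrySketch`, `Registry`, `register`, `lookup`) are hypothetical and are not taken from the attached patches or from Hive's `FunctionRegistry`.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class SessionRegistrySketch {

    /** Hypothetical registry: looks in its own entries first, then in an optional parent. */
    static final class Registry {
        private final Map<String, String> functions = new ConcurrentHashMap<>();
        private final Registry parent; // null for the shared, server-wide registry

        Registry(Registry parent) {
            this.parent = parent;
        }

        /** Registers a function name (case-insensitive) in this registry only. */
        void register(String name, String implClass) {
            functions.put(name.toLowerCase(Locale.ROOT), implClass);
        }

        /** Returns the implementation visible from this registry, or null if unknown. */
        String lookup(String name) {
            String impl = functions.get(name.toLowerCase(Locale.ROOT));
            if (impl == null && parent != null) {
                return parent.lookup(name);
            }
            return impl;
        }
    }

    public static void main(String[] args) {
        Registry shared = new Registry(null);
        shared.register("upper", "builtin.Upper");

        Registry sessionA = new Registry(shared);
        Registry sessionB = new Registry(shared);
        sessionA.register("upper", "userA.MyUpper"); // visible to session A only

        System.out.println(sessionA.lookup("upper"));
        System.out.println(sessionB.lookup("upper"));
    }
}
```

In this sketch a registration made in session A shadows the shared entry for A alone; session B and the shared registry are unaffected, which is the situation the issue asks for.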
[jira] [Commented] (HIVE-6099) Multi insert does not work properly with distinct count
[ https://issues.apache.org/jira/browse/HIVE-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308401#comment-14308401 ] Navis commented on HIVE-6099: - +1 > Multi insert does not work properly with distinct count > --- > > Key: HIVE-6099 > URL: https://issues.apache.org/jira/browse/HIVE-6099 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0, 0.14.0, 1.0.0 >Reporter: Pavan Gadam Manohar >Assignee: Ashutosh Chauhan > Labels: count, distinct, insert, multi-insert > Attachments: HIVE-6099.1.patch, HIVE-6099.2.patch, HIVE-6099.3.patch, > HIVE-6099.4.patch, HIVE-6099.patch, explain_hive_0.10.0.txt, > with_disabled.txt, with_enabled.txt > > > Need 2 rows to reproduce this Bug. Here are the steps. > Step 1) Create a table Table_A > CREATE EXTERNAL TABLE Table_A > ( > user string > , type int > ) > PARTITIONED BY (dt string) > ROW FORMAT DELIMITED > FIELDS TERMINATED BY '|' > STORED AS RCFILE > LOCATION '/hive//Table_A'; > Step 2) Scenario: Lets us say consider user tommy belong to both usertypes > 111 and 123. Insert 2 records into the table created above. > select * from Table_A; > hive> select * from table_a; > OK > tommy 123 2013-12-02 > tommy 111 2013-12-02 > Step 3) Create 2 destination tables to simulate multi-insert. 
> CREATE EXTERNAL TABLE dest_Table_A > ( > p_date string > , Distinct_Users int > , Type111Users int > , Type123Users int > ) > PARTITIONED BY (dt string) > ROW FORMAT DELIMITED > FIELDS TERMINATED BY '|' > STORED AS RCFILE > LOCATION '/hive//dest_Table_A'; > > CREATE EXTERNAL TABLE dest_Table_B > ( > p_date string > , Distinct_Users int > , Type111Users int > , Type123Users int > ) > PARTITIONED BY (dt string) > ROW FORMAT DELIMITED > FIELDS TERMINATED BY '|' > STORED AS RCFILE > LOCATION '/hive//dest_Table_B'; > Step 4) Multi insert statement > from Table_A a > INSERT OVERWRITE TABLE dest_Table_A PARTITION(dt='2013-12-02') > select a.dt > ,count(distinct a.user) as AllDist > ,count(distinct case when a.type = 111 then a.user else null end) as > Type111User > ,count(distinct case when a.type != 111 then a.user else null end) as > Type123User > group by a.dt > > INSERT OVERWRITE TABLE dest_Table_B PARTITION(dt='2013-12-02') > select a.dt > ,count(distinct a.user) as AllDist > ,count(distinct case when a.type = 111 then a.user else null end) as > Type111User > ,count(distinct case when a.type != 111 then a.user else null end) as > Type123User > group by a.dt > ; > > Step 5) Verify results. > hive> select * from dest_table_a; > OK > 2013-12-02 2 1 1 2013-12-02 > Time taken: 0.116 seconds > hive> select * from dest_table_b; > OK > 2013-12-02 2 1 1 2013-12-02 > Time taken: 0.13 seconds > Conclusion: Hive gives a count of 2 for distinct users although there is > only one distinct user. After trying many datasets observed that Hive is > doing Type111Users + Typoe123Users = DistinctUsers which is wrong. > hive> select count(distinct a.user) from table_a a; > Gives: > Total MapReduce CPU Time Spent: 4 seconds 350 msec > OK > 1 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
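The arithmetic behind the report's conclusion can be checked outside Hive. The sketch below rebuilds the two sample rows in memory and computes the three distinct counts from the multi-insert statement; the class `DistinctCountCheck` and its helpers are hypothetical and only illustrate the expected numbers, not Hive's execution.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class DistinctCountCheck {

    /** One row of Table_A from the report (the dt partition column is omitted). */
    static final class Row {
        final String user;
        final int type;

        Row(String user, int type) {
            this.user = user;
            this.type = type;
        }
    }

    /** The two rows inserted in Step 2 of the report. */
    static List<Row> sampleRows() {
        return Arrays.asList(new Row("tommy", 123), new Row("tommy", 111));
    }

    /** Equivalent of count(distinct case when <filter> then user else null end). */
    static long distinctUsers(List<Row> rows, Predicate<Row> filter) {
        return rows.stream().filter(filter).map(r -> r.user).distinct().count();
    }

    public static void main(String[] args) {
        List<Row> rows = sampleRows();
        long all = distinctUsers(rows, r -> true);
        long type111 = distinctUsers(rows, r -> r.type == 111);
        long other = distinctUsers(rows, r -> r.type != 111);
        System.out.println("all=" + all + " type111=" + type111 + " other=" + other);
        System.out.println("type111 + other = " + (type111 + other));
    }
}
```

With these rows the overall distinct count is 1 while the two per-type counts are 1 each, so their sum is 2. That sum matches the value Hive wrote to the destination tables in Step 5, which is consistent with the reporter's observation.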
[jira] [Commented] (HIVE-9545) Build FAILURE with IBM JVM
[ https://issues.apache.org/jira/browse/HIVE-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14306641#comment-14306641 ] Navis commented on HIVE-9545: - [~ashutoshc] Could you review this? Simple changes of method invocation to reflection. > Build FAILURE with IBM JVM > --- > > Key: HIVE-9545 > URL: https://issues.apache.org/jira/browse/HIVE-9545 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0 > Environment: mvn -version > Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; > 2014-08-11T22:58:10+02:00) > Maven home: /opt/apache-maven-3.2.3 > Java version: 1.7.0, vendor: IBM Corporation > Java home: /usr/lib/jvm/ibm-java-x86_64-71/jre > Default locale: en_US, platform encoding: ISO-8859-1 > OS name: "linux", version: "3.10.0-123.4.4.el7.x86_64", arch: "amd64", > family: "unix" >Reporter: pascal oliva >Assignee: Navis > Attachments: HIVE-9545.1.patch.txt > > > NO PRECOMMIT TESTS > With the use of IBM JVM environment : > [root@dorado-vm2 hive]# java -version > java version "1.7.0" > Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2)) > IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References > 20141017_217728 (JIT enabled, AOT enabled). > The build failed on > [INFO] Hive Query Language FAILURE [ 50.053 > s] > [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) > on project hive-exec: Compilation failure: Compilation failure: > [ERROR] > /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26] > package com.sun.management does not exist. > HOWTO : > #git clone -b branch-0.14 https://github.com/apache/hive.git > #cd hive > #mvn install -DskipTests -Phadoop-2 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
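As a rough illustration of what "method invocation to reflection" can look like, the sketch below resolves a vendor-specific class by name at run time instead of importing it, so the source compiles on JVMs that do not ship the `com.sun.management` package. It is not the attached patch. The thread only names the package, so the specific type `HotSpotDiagnosticMXBean`, its `dumpHeap` method, and the class name `OptionalVendorApi` are assumptions made for the example.

```java
import java.lang.management.ManagementFactory;
import java.lang.reflect.Method;

final class OptionalVendorApi {
    // Assumption: the vendor-specific type involved; the thread only names the package.
    static final String BEAN_CLASS = "com.sun.management.HotSpotDiagnosticMXBean";
    static final String BEAN_NAME = "com.sun.management:type=HotSpotDiagnostic";

    /**
     * Looks up a public method by class name without a compile-time reference.
     * Returns null when the class or the method is not available on this JVM.
     */
    static Method findMethod(String className, String methodName, Class<?>... paramTypes) {
        try {
            return Class.forName(className).getMethod(methodName, paramTypes);
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            return null;
        }
    }

    /** Writes a heap dump if the bean class exists; returns false on JVMs without it. */
    static boolean dumpHeap(String fileName, boolean liveOnly) throws Exception {
        Method dump = findMethod(BEAN_CLASS, "dumpHeap", String.class, boolean.class);
        if (dump == null) {
            return false;
        }
        Object bean = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(), BEAN_NAME, dump.getDeclaringClass());
        dump.invoke(bean, fileName, liveOnly);
        return true;
    }

    public static void main(String[] args) {
        boolean available =
                findMethod(BEAN_CLASS, "dumpHeap", String.class, boolean.class) != null;
        System.out.println("vendor heap-dump API available: " + available);
    }
}
```

The trade-off is that a missing class becomes a run-time condition the caller has to handle rather than a compile error, which is why `findMethod` returns null instead of throwing.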
[jira] [Commented] (HIVE-6099) Multi insert does not work properly with distinct count
[ https://issues.apache.org/jira/browse/HIVE-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14304657#comment-14304657 ]

Navis commented on HIVE-6099:
-----------------------------

It was introduced to generate distinct keys for this optimization and does not seem to be used by other code. The optimization seems to work with a single common distinct column, but I think its overhead outweighs the benefit (and it is hard to read). But let's see the test results.

> Multi insert does not work properly with distinct count
> -------------------------------------------------------
>
> Key: HIVE-6099
> URL: https://issues.apache.org/jira/browse/HIVE-6099
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0, 0.14.0, 1.0.0
> Reporter: Pavan Gadam Manohar
> Assignee: Ashutosh Chauhan
> Labels: count, distinct, insert, multi-insert
> Attachments: HIVE-6099.1.patch, HIVE-6099.2.patch, HIVE-6099.patch,
> explain_hive_0.10.0.txt, with_disabled.txt, with_enabled.txt
>
> Need 2 rows to reproduce this Bug. Here are the steps.
> Step 1) Create a table Table_A
> CREATE EXTERNAL TABLE Table_A
> (
> user string
> , type int
> )
> PARTITIONED BY (dt string)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '|'
> STORED AS RCFILE
> LOCATION '/hive//Table_A';
> Step 2) Scenario: Let us say user tommy belongs to both user types
> 111 and 123. Insert 2 records into the table created above.
> select * from Table_A;
> hive> select * from table_a;
> OK
> tommy 123 2013-12-02
> tommy 111 2013-12-02
> Step 3) Create 2 destination tables to simulate multi-insert.
> CREATE EXTERNAL TABLE dest_Table_A > ( > p_date string > , Distinct_Users int > , Type111Users int > , Type123Users int > ) > PARTITIONED BY (dt string) > ROW FORMAT DELIMITED > FIELDS TERMINATED BY '|' > STORED AS RCFILE > LOCATION '/hive//dest_Table_A'; > > CREATE EXTERNAL TABLE dest_Table_B > ( > p_date string > , Distinct_Users int > , Type111Users int > , Type123Users int > ) > PARTITIONED BY (dt string) > ROW FORMAT DELIMITED > FIELDS TERMINATED BY '|' > STORED AS RCFILE > LOCATION '/hive//dest_Table_B'; > Step 4) Multi insert statement > from Table_A a > INSERT OVERWRITE TABLE dest_Table_A PARTITION(dt='2013-12-02') > select a.dt > ,count(distinct a.user) as AllDist > ,count(distinct case when a.type = 111 then a.user else null end) as > Type111User > ,count(distinct case when a.type != 111 then a.user else null end) as > Type123User > group by a.dt > > INSERT OVERWRITE TABLE dest_Table_B PARTITION(dt='2013-12-02') > select a.dt > ,count(distinct a.user) as AllDist > ,count(distinct case when a.type = 111 then a.user else null end) as > Type111User > ,count(distinct case when a.type != 111 then a.user else null end) as > Type123User > group by a.dt > ; > > Step 5) Verify results. > hive> select * from dest_table_a; > OK > 2013-12-02 2 1 1 2013-12-02 > Time taken: 0.116 seconds > hive> select * from dest_table_b; > OK > 2013-12-02 2 1 1 2013-12-02 > Time taken: 0.13 seconds > Conclusion: Hive gives a count of 2 for distinct users although there is > only one distinct user. After trying many datasets observed that Hive is > doing Type111Users + Typoe123Users = DistinctUsers which is wrong. > hive> select count(distinct a.user) from table_a a; > Gives: > Total MapReduce CPU Time Spent: 4 seconds 350 msec > OK > 1 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-6099) Multi insert does not work properly with distinct count
[ https://issues.apache.org/jira/browse/HIVE-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14304503#comment-14304503 ]

Navis commented on HIVE-6099:
-----------------------------

[~ashutoshc] Good! I've left some comments on Review Board. I think we are purging the most complicated parts of GroupByOperator.

> Multi insert does not work properly with distinct count
> -------------------------------------------------------
>
> Key: HIVE-6099
> URL: https://issues.apache.org/jira/browse/HIVE-6099
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0, 0.14.0, 1.0.0
> Reporter: Pavan Gadam Manohar
> Assignee: Ashutosh Chauhan
> Labels: count, distinct, insert, multi-insert
> Attachments: HIVE-6099.1.patch, HIVE-6099.patch,
> explain_hive_0.10.0.txt, with_disabled.txt, with_enabled.txt
>
> Need 2 rows to reproduce this Bug. Here are the steps.
> Step 1) Create a table Table_A
> CREATE EXTERNAL TABLE Table_A
> (
> user string
> , type int
> )
> PARTITIONED BY (dt string)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '|'
> STORED AS RCFILE
> LOCATION '/hive//Table_A';
> Step 2) Scenario: Let us say user tommy belongs to both user types
> 111 and 123. Insert 2 records into the table created above.
> select * from Table_A;
> hive> select * from table_a;
> OK
> tommy 123 2013-12-02
> tommy 111 2013-12-02
> Step 3) Create 2 destination tables to simulate multi-insert.
> CREATE EXTERNAL TABLE dest_Table_A > ( > p_date string > , Distinct_Users int > , Type111Users int > , Type123Users int > ) > PARTITIONED BY (dt string) > ROW FORMAT DELIMITED > FIELDS TERMINATED BY '|' > STORED AS RCFILE > LOCATION '/hive//dest_Table_A'; > > CREATE EXTERNAL TABLE dest_Table_B > ( > p_date string > , Distinct_Users int > , Type111Users int > , Type123Users int > ) > PARTITIONED BY (dt string) > ROW FORMAT DELIMITED > FIELDS TERMINATED BY '|' > STORED AS RCFILE > LOCATION '/hive//dest_Table_B'; > Step 4) Multi insert statement > from Table_A a > INSERT OVERWRITE TABLE dest_Table_A PARTITION(dt='2013-12-02') > select a.dt > ,count(distinct a.user) as AllDist > ,count(distinct case when a.type = 111 then a.user else null end) as > Type111User > ,count(distinct case when a.type != 111 then a.user else null end) as > Type123User > group by a.dt > > INSERT OVERWRITE TABLE dest_Table_B PARTITION(dt='2013-12-02') > select a.dt > ,count(distinct a.user) as AllDist > ,count(distinct case when a.type = 111 then a.user else null end) as > Type111User > ,count(distinct case when a.type != 111 then a.user else null end) as > Type123User > group by a.dt > ; > > Step 5) Verify results. > hive> select * from dest_table_a; > OK > 2013-12-02 2 1 1 2013-12-02 > Time taken: 0.116 seconds > hive> select * from dest_table_b; > OK > 2013-12-02 2 1 1 2013-12-02 > Time taken: 0.13 seconds > Conclusion: Hive gives a count of 2 for distinct users although there is > only one distinct user. After trying many datasets observed that Hive is > doing Type111Users + Typoe123Users = DistinctUsers which is wrong. > hive> select count(distinct a.user) from table_a a; > Gives: > Total MapReduce CPU Time Spent: 4 seconds 350 msec > OK > 1 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9397) SELECT max(bar) FROM foo is broken after ANALYZE ... FOR COLUMNS
[ https://issues.apache.org/jira/browse/HIVE-9397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-9397:
    Attachment: HIVE-9397.3.patch.txt

Updated the results and fixed one more issue (distinct_stats was falling back to the normal plan because of an exception while making the struct OI).

> SELECT max(bar) FROM foo is broken after ANALYZE ... FOR COLUMNS
>
>
> Key: HIVE-9397
> URL: https://issues.apache.org/jira/browse/HIVE-9397
> Project: Hive
> Issue Type: Bug
> Components: Beeline
> Affects Versions: 0.14.0, 0.15.0
> Reporter: Damien Carol
> Assignee: Navis
> Attachments: HIVE-9397.1.patch.txt, HIVE-9397.2.patch.txt,
> HIVE-9397.3.patch.txt
>
> These queries produce an error:
> {code:sql}
> DROP TABLE IF EXISTS foo;
> CREATE TABLE foo (id int) STORED AS ORC;
> INSERT INTO TABLE foo VALUES (1);
> INSERT INTO TABLE foo VALUES (2);
> INSERT INTO TABLE foo VALUES (3);
> INSERT INTO TABLE foo VALUES (4);
> INSERT INTO TABLE foo VALUES (5);
> SELECT max(id) FROM foo;
> ANALYZE TABLE foo COMPUTE STATISTICS FOR COLUMNS id;
> SELECT max(id) FROM foo;
> {code}
> The last query throws {{org.apache.hive.service.cli.HiveSQLException}}
> {noformat}
> 0: jdbc:hive2://nc-h04:1/casino> SELECT max(id) FROM foo;
> +-+--+
> | _c0 |
> +-+--+
> org.apache.hive.service.cli.HiveSQLException: java.lang.ClassCastException
> 0: jdbc:hive2://nc-h04:1/casino>
> {noformat}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9566) HiveServer2 fails to start with NullPointerException
[ https://issues.apache.org/jira/browse/HIVE-9566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14304392#comment-14304392 ] Navis commented on HIVE-9566: - +1 > HiveServer2 fails to start with NullPointerException > > > Key: HIVE-9566 > URL: https://issues.apache.org/jira/browse/HIVE-9566 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 0.13.0, 0.14.0, 0.13.1 >Reporter: Na Yang >Assignee: Na Yang > Attachments: HIVE-9566.patch > > > hiveserver2 uses embedded metastore with default hive-site.xml configuration. > I use "hive --stop --service hiveserver2" command to stop the running > hiveserver2 process and then use "hive --start --service hiveserver2" command > to start the hiveserver2 service. I see the following exception in the > hive.log file > {noformat} > java.lang.NullPointerException > at > org.apache.hive.service.server.HiveServer2.stop(HiveServer2.java:104) > at > org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:138) > at > org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:171) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.main(RunJar.java:212) > {noformat} > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9553) Fix log-line in Partition Pruner
[ https://issues.apache.org/jira/browse/HIVE-9553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-9553:
    Resolution: Fixed
    Fix Version/s: 1.2.0
    Status: Resolved (was: Patch Available)

Committed to trunk. Thanks, Mithun Radhakrishnan.

> Fix log-line in Partition Pruner
>
>
> Key: HIVE-9553
> URL: https://issues.apache.org/jira/browse/HIVE-9553
> Project: Hive
> Issue Type: Bug
> Components: Logging
> Affects Versions: 0.14.0
> Reporter: Mithun Radhakrishnan
> Assignee: Mithun Radhakrishnan
> Priority: Trivial
> Fix For: 1.2.0
>
> Attachments: HIVE-9553.1.patch
>
>
> Minor issue in logging the prune-expression in the PartitionPruner:
> {code:title=PartitionPruner.java|titleBGColor=#F7D6C1|bgColor=#CE}
> LOG.trace("prune Expression = " + prunerExpr == null ? "" : prunerExpr);
> {code}
> Given the operator precedence order, this should read:
> {code:title=PartitionPruner.java|titleBGColor=#F7D6C1|bgColor=#CE}
> LOG.trace("prune Expression = " + (prunerExpr == null ? "" : prunerExpr));
> {code}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
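The precedence point in HIVE-9553 can be reproduced in isolation. In Java, `+` binds tighter than `==`, which binds tighter than `?:`, so the unparenthesized argument compares the concatenated string with null. The sketch below evaluates both forms of the argument; the class `PruneLogPrecedence` and its two methods are hypothetical stand-ins for what gets passed to `LOG.trace`, not code from the patch.

```java
final class PruneLogPrecedence {

    // Mirrors the original argument. It parses as
    //   (("prune Expression = " + prunerExpr) == null) ? "" : prunerExpr
    // The concatenation is never null, so the bare expression is always chosen
    // and the "prune Expression = " prefix never reaches the log.
    static Object unparenthesized(Object prunerExpr) {
        return "prune Expression = " + prunerExpr == null ? "" : prunerExpr;
    }

    // Mirrors the corrected argument: the null check applies to prunerExpr only.
    static Object parenthesized(Object prunerExpr) {
        return "prune Expression = " + (prunerExpr == null ? "" : prunerExpr);
    }

    public static void main(String[] args) {
        String expr = "(ds = '2015-02-03')"; // made-up prune expression for the demo
        System.out.println(unparenthesized(expr));
        System.out.println(parenthesized(expr));
        System.out.println(parenthesized(null));
    }
}
```

With the original form, a null expression is passed through as null rather than being replaced by an empty string, so the null guard has no effect at all.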
[jira] [Updated] (HIVE-9545) Build FAILURE with IBM JVM
[ https://issues.apache.org/jira/browse/HIVE-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9545: Attachment: HIVE-9545.1.patch.txt > Build FAILURE with IBM JVM > --- > > Key: HIVE-9545 > URL: https://issues.apache.org/jira/browse/HIVE-9545 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0 > Environment: mvn -version > Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; > 2014-08-11T22:58:10+02:00) > Maven home: /opt/apache-maven-3.2.3 > Java version: 1.7.0, vendor: IBM Corporation > Java home: /usr/lib/jvm/ibm-java-x86_64-71/jre > Default locale: en_US, platform encoding: ISO-8859-1 > OS name: "linux", version: "3.10.0-123.4.4.el7.x86_64", arch: "amd64", > family: "unix" >Reporter: pascal oliva >Assignee: Navis > Attachments: HIVE-9545.1.patch.txt > > > With the use of IBM JVM environment : > [root@dorado-vm2 hive]# java -version > java version "1.7.0" > Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2)) > IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References > 20141017_217728 (JIT enabled, AOT enabled). > The build failed on > [INFO] Hive Query Language FAILURE [ 50.053 > s] > [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) > on project hive-exec: Compilation failure: Compilation failure: > [ERROR] > /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26] > package com.sun.management does not exist. > HOWTO : > #git clone -b branch-0.14 https://github.com/apache/hive.git > #cd hive > #mvn install -DskipTests -Phadoop-2 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9545) Build FAILURE with IBM JVM
[ https://issues.apache.org/jira/browse/HIVE-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9545: Description: With the use of IBM JVM environment : [root@dorado-vm2 hive]# java -version java version "1.7.0" Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2)) IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 20141017_217728 (JIT enabled, AOT enabled). The build failed on [INFO] Hive Query Language FAILURE [ 50.053 s] [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project hive-exec: Compilation failure: Compilation failure: [ERROR] /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26] package com.sun.management does not exist. HOWTO : #git clone -b branch-0.14 https://github.com/apache/hive.git #cd hive #mvn install -DskipTests -Phadoop-2 was: With the use of IBM JVM environment : [root@dorado-vm2 hive]# java -version java version "1.7.0" Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2)) IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 20141017_217728 (JIT enabled, AOT enabled). The build failed on [INFO] Hive Query Language FAILURE [ 50.053 s] [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project hive-exec: Compilation failure: Compilation failure: [ERROR] /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26] package com.sun.management does not exist. 
HOWTO : #git clone -b branch-0.14 https://github.com/apache/hive.git #cd hive #mvn install -DskipTests -Phadoop-2 > Build FAILURE with IBM JVM > --- > > Key: HIVE-9545 > URL: https://issues.apache.org/jira/browse/HIVE-9545 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0 > Environment: mvn -version > Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; > 2014-08-11T22:58:10+02:00) > Maven home: /opt/apache-maven-3.2.3 > Java version: 1.7.0, vendor: IBM Corporation > Java home: /usr/lib/jvm/ibm-java-x86_64-71/jre > Default locale: en_US, platform encoding: ISO-8859-1 > OS name: "linux", version: "3.10.0-123.4.4.el7.x86_64", arch: "amd64", > family: "unix" >Reporter: pascal oliva >Assignee: Navis > Attachments: HIVE-9545.1.patch.txt > > > With the use of IBM JVM environment : > [root@dorado-vm2 hive]# java -version > java version "1.7.0" > Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2)) > IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References > 20141017_217728 (JIT enabled, AOT enabled). > The build failed on > [INFO] Hive Query Language FAILURE [ 50.053 > s] > [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) > on project hive-exec: Compilation failure: Compilation failure: > [ERROR] > /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26] > package com.sun.management does not exist. > HOWTO : > #git clone -b branch-0.14 https://github.com/apache/hive.git > #cd hive > #mvn install -DskipTests -Phadoop-2 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9545) Build FAILURE with IBM JVM
[ https://issues.apache.org/jira/browse/HIVE-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9545: Description: NO PRECOMMIT TESTS With the use of IBM JVM environment : [root@dorado-vm2 hive]# java -version java version "1.7.0" Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2)) IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 20141017_217728 (JIT enabled, AOT enabled). The build failed on [INFO] Hive Query Language FAILURE [ 50.053 s] [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project hive-exec: Compilation failure: Compilation failure: [ERROR] /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26] package com.sun.management does not exist. HOWTO : #git clone -b branch-0.14 https://github.com/apache/hive.git #cd hive #mvn install -DskipTests -Phadoop-2 was: With the use of IBM JVM environment : [root@dorado-vm2 hive]# java -version java version "1.7.0" Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2)) IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 20141017_217728 (JIT enabled, AOT enabled). The build failed on [INFO] Hive Query Language FAILURE [ 50.053 s] [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project hive-exec: Compilation failure: Compilation failure: [ERROR] /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26] package com.sun.management does not exist. 
HOWTO : #git clone -b branch-0.14 https://github.com/apache/hive.git #cd hive #mvn install -DskipTests -Phadoop-2 > Build FAILURE with IBM JVM > --- > > Key: HIVE-9545 > URL: https://issues.apache.org/jira/browse/HIVE-9545 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0 > Environment: mvn -version > Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; > 2014-08-11T22:58:10+02:00) > Maven home: /opt/apache-maven-3.2.3 > Java version: 1.7.0, vendor: IBM Corporation > Java home: /usr/lib/jvm/ibm-java-x86_64-71/jre > Default locale: en_US, platform encoding: ISO-8859-1 > OS name: "linux", version: "3.10.0-123.4.4.el7.x86_64", arch: "amd64", > family: "unix" >Reporter: pascal oliva >Assignee: Navis > Attachments: HIVE-9545.1.patch.txt > > > NO PRECOMMIT TESTS > With the use of IBM JVM environment : > [root@dorado-vm2 hive]# java -version > java version "1.7.0" > Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2)) > IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References > 20141017_217728 (JIT enabled, AOT enabled). > The build failed on > [INFO] Hive Query Language FAILURE [ 50.053 > s] > [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) > on project hive-exec: Compilation failure: Compilation failure: > [ERROR] > /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26] > package com.sun.management does not exist. > HOWTO : > #git clone -b branch-0.14 https://github.com/apache/hive.git > #cd hive > #mvn install -DskipTests -Phadoop-2 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9545) Build FAILURE with IBM JVM
[ https://issues.apache.org/jira/browse/HIVE-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9545: Attachment: (was: HIVE-9495.1.patch.txt) > Build FAILURE with IBM JVM > --- > > Key: HIVE-9545 > URL: https://issues.apache.org/jira/browse/HIVE-9545 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0 > Environment: mvn -version > Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; > 2014-08-11T22:58:10+02:00) > Maven home: /opt/apache-maven-3.2.3 > Java version: 1.7.0, vendor: IBM Corporation > Java home: /usr/lib/jvm/ibm-java-x86_64-71/jre > Default locale: en_US, platform encoding: ISO-8859-1 > OS name: "linux", version: "3.10.0-123.4.4.el7.x86_64", arch: "amd64", > family: "unix" >Reporter: pascal oliva >Assignee: Navis > Attachments: HIVE-9545.1.patch.txt > > > With the use of IBM JVM environment : > [root@dorado-vm2 hive]# java -version > java version "1.7.0" > Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2)) > IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References > 20141017_217728 (JIT enabled, AOT enabled). > The build failed on > [INFO] Hive Query Language FAILURE [ 50.053 > s] > [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) > on project hive-exec: Compilation failure: Compilation failure: > [ERROR] > /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26] > package com.sun.management does not exist. > HOWTO : > #git clone -b branch-0.14 https://github.com/apache/hive.git > #cd hive > #mvn install -DskipTests -Phadoop-2 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9545) Build FAILURE with IBM JVM
[ https://issues.apache.org/jira/browse/HIVE-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9545: Assignee: Navis Status: Patch Available (was: Open) > Build FAILURE with IBM JVM > --- > > Key: HIVE-9545 > URL: https://issues.apache.org/jira/browse/HIVE-9545 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0 > Environment: mvn -version > Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; > 2014-08-11T22:58:10+02:00) > Maven home: /opt/apache-maven-3.2.3 > Java version: 1.7.0, vendor: IBM Corporation > Java home: /usr/lib/jvm/ibm-java-x86_64-71/jre > Default locale: en_US, platform encoding: ISO-8859-1 > OS name: "linux", version: "3.10.0-123.4.4.el7.x86_64", arch: "amd64", > family: "unix" >Reporter: pascal oliva >Assignee: Navis > Attachments: HIVE-9495.1.patch.txt > > > With the use of IBM JVM environment : > [root@dorado-vm2 hive]# java -version > java version "1.7.0" > Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2)) > IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References > 20141017_217728 (JIT enabled, AOT enabled). > The build failed on > [INFO] Hive Query Language FAILURE [ 50.053 > s] > [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) > on project hive-exec: Compilation failure: Compilation failure: > [ERROR] > /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26] > package com.sun.management does not exist. > HOWTO : > #git clone -b branch-0.14 https://github.com/apache/hive.git > #cd hive > #mvn install -DskipTests -Phadoop-2 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9545) Build FAILURE with IBM JVM
[ https://issues.apache.org/jira/browse/HIVE-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9545: Attachment: HIVE-9495.1.patch.txt > Build FAILURE with IBM JVM > --- > > Key: HIVE-9545 > URL: https://issues.apache.org/jira/browse/HIVE-9545 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0 > Environment: mvn -version > Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; > 2014-08-11T22:58:10+02:00) > Maven home: /opt/apache-maven-3.2.3 > Java version: 1.7.0, vendor: IBM Corporation > Java home: /usr/lib/jvm/ibm-java-x86_64-71/jre > Default locale: en_US, platform encoding: ISO-8859-1 > OS name: "linux", version: "3.10.0-123.4.4.el7.x86_64", arch: "amd64", > family: "unix" >Reporter: pascal oliva > Attachments: HIVE-9495.1.patch.txt > > > With the use of IBM JVM environment : > [root@dorado-vm2 hive]# java -version > java version "1.7.0" > Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2)) > IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References > 20141017_217728 (JIT enabled, AOT enabled). > The build failed on > [INFO] Hive Query Language FAILURE [ 50.053 > s] > [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) > on project hive-exec: Compilation failure: Compilation failure: > [ERROR] > /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26] > package com.sun.management does not exist. > HOWTO : > #git clone -b branch-0.14 https://github.com/apache/hive.git > #cd hive > #mvn install -DskipTests -Phadoop-2 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9397) SELECT max(bar) FROM foo is broken after ANALYZE ... FOR COLUMNS
[ https://issues.apache.org/jira/browse/HIVE-9397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302848#comment-14302848 ]

Navis commented on HIVE-9397:
-----------------------------

OIs are now acquired directly from the row schema of the final GBY operator. I've also fixed the double-to-float type casting, so that the stats-optimized and non-optimized plans return identical results. It would be possible to extend StatsOptimizer to accept queries like "select min(x)+max(x) from tbl", but that seems better done in a follow-up issue.

> SELECT max(bar) FROM foo is broken after ANALYZE ... FOR COLUMNS
>
>
> Key: HIVE-9397
> URL: https://issues.apache.org/jira/browse/HIVE-9397
> Project: Hive
> Issue Type: Bug
> Components: Beeline
> Affects Versions: 0.14.0, 0.15.0
> Reporter: Damien Carol
> Assignee: Navis
> Attachments: HIVE-9397.1.patch.txt, HIVE-9397.2.patch.txt
>
>
> These queries produce an error:
> {code:sql}
> DROP TABLE IF EXISTS foo;
> CREATE TABLE foo (id int) STORED AS ORC;
> INSERT INTO TABLE foo VALUES (1);
> INSERT INTO TABLE foo VALUES (2);
> INSERT INTO TABLE foo VALUES (3);
> INSERT INTO TABLE foo VALUES (4);
> INSERT INTO TABLE foo VALUES (5);
> SELECT max(id) FROM foo;
> ANALYZE TABLE foo COMPUTE STATISTICS FOR COLUMNS id;
> SELECT max(id) FROM foo;
> {code}
> The last query throws {{org.apache.hive.service.cli.HiveSQLException}}
> {noformat}
> 0: jdbc:hive2://nc-h04:1/casino> SELECT max(id) FROM foo;
> +-+--+
> | _c0 |
> +-+--+
> org.apache.hive.service.cli.HiveSQLException: java.lang.ClassCastException
> 0: jdbc:hive2://nc-h04:1/casino>
> {noformat}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9397) SELECT max(bar) FROM foo is broken after ANALYZE ... FOR COLUMNS
[ https://issues.apache.org/jira/browse/HIVE-9397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9397: Attachment: HIVE-9397.2.patch.txt Addressed comments & fixed double sub-type > SELECT max(bar) FROM foo is broken after ANALYZE ... FOR COLUMNS > > > Key: HIVE-9397 > URL: https://issues.apache.org/jira/browse/HIVE-9397 > Project: Hive > Issue Type: Bug > Components: Beeline >Affects Versions: 0.14.0, 0.15.0 >Reporter: Damien Carol >Assignee: Navis > Attachments: HIVE-9397.1.patch.txt, HIVE-9397.2.patch.txt > > > These queries produce an error : > {code:sql} > DROP TABLE IF EXISTS foo; > CREATE TABLE foo (id int) STORED AS ORC; > INSERT INTO TABLE foo VALUES (1); > INSERT INTO TABLE foo VALUES (2); > INSERT INTO TABLE foo VALUES (3); > INSERT INTO TABLE foo VALUES (4); > INSERT INTO TABLE foo VALUES (5); > SELECT max(id) FROM foo; > ANALYZE TABLE foo COMPUTE STATISTICS FOR COLUMNS id; > SELECT max(id) FROM foo; > {code} > The last query throws {{org.apache.hive.service.cli.HiveSQLException}} > {noformat} > 0: jdbc:hive2://nc-h04:1/casino> SELECT max(id) FROM foo; > +-+--+ > | _c0 | > +-+--+ > org.apache.hive.service.cli.HiveSQLException: java.lang.ClassCastException > 0: jdbc:hive2://nc-h04:1/casino> > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-9528) SemanticException: Ambiguous column reference
[ https://issues.apache.org/jira/browse/HIVE-9528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis resolved HIVE-9528. - Resolution: Not a Problem > SemanticException: Ambiguous column reference > - > > Key: HIVE-9528 > URL: https://issues.apache.org/jira/browse/HIVE-9528 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0 >Reporter: Yongzhi Chen >Assignee: Navis > > When running the following query: > {code} > SELECT if( COUNT(*) = 0, 'true', 'false' ) as RESULT FROM ( select * from > sim a join sim2 b on a.simstr=b.simstr) app > Error: Error while compiling statement: FAILED: SemanticException [Error > 10007]: Ambiguous column reference simstr in app (state=42000,code=10007) > {code} > This query works fine in hive 0.10 > In the apache trunk, following workaround will work: > {code} > SELECT if(COUNT(*) = 0, 'true', 'false') as RESULT FROM (select a.* from sim > a join sim2 b on a.simstr=b.simstr) app; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9528) SemanticException: Ambiguous column reference
[ https://issues.apache.org/jira/browse/HIVE-9528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302768#comment-14302768 ] Navis commented on HIVE-9528: - No, it's HIVE-7733. I've almost forgotten the context, but it was probably about enforcing unique column names in the final stage of a subquery, which was checked when generating the select operator before it. > SemanticException: Ambiguous column reference > - > > Key: HIVE-9528 > URL: https://issues.apache.org/jira/browse/HIVE-9528 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0 >Reporter: Yongzhi Chen >Assignee: Navis > > When running the following query: > {code} > SELECT if( COUNT(*) = 0, 'true', 'false' ) as RESULT FROM ( select * from > sim a join sim2 b on a.simstr=b.simstr) app > Error: Error while compiling statement: FAILED: SemanticException [Error > 10007]: Ambiguous column reference simstr in app (state=42000,code=10007) > {code} > This query works fine in hive 0.10 > In the apache trunk, following workaround will work: > {code} > SELECT if(COUNT(*) = 0, 'true', 'false') as RESULT FROM (select a.* from sim > a join sim2 b on a.simstr=b.simstr) app; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9553) Fix log-line in Partition Pruner
[ https://issues.apache.org/jira/browse/HIVE-9553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302756#comment-14302756 ] Navis commented on HIVE-9553: - +1 > Fix log-line in Partition Pruner > > > Key: HIVE-9553 > URL: https://issues.apache.org/jira/browse/HIVE-9553 > Project: Hive > Issue Type: Bug > Components: Logging >Affects Versions: 0.14.0 >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan >Priority: Trivial > Attachments: HIVE-9553.1.patch > > > Minor issue in logging the prune-expression in the PartitionPruner: > {code:title=PartitionPruner.java|titleBGColor=#F7D6C1|bgColor=#CE} > LOG.trace("prune Expression = " + prunerExpr == null ? "" : prunerExpr); > {code} > Given the operator precedence order, this should read: > {code:title=PartitionPruner.java|titleBGColor=#F7D6C1|bgColor=#CE} > LOG.trace("prune Expression = " + (prunerExpr == null ? "" : prunerExpr)); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
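The precedence issue reported in HIVE-9553 can be reproduced outside Hive. The sketch below is only an illustration: the class name `PruneLogPrecedence` and its two helper methods are hypothetical, and a plain `Object` stands in for the pruner expression. In Java, `+` binds tighter than `==`, which binds tighter than `?:`, so the unparenthesized form compares the concatenated string with `null` and the prefix never reaches the log.

```java
// Hypothetical stand-alone illustration; not the actual PartitionPruner code.
class PruneLogPrecedence {

    // Same shape as the original log argument. It parses as
    // (("prune Expression = " + prunerExpr) == null) ? "" : prunerExpr
    // and the concatenation is never null, so prunerExpr is always returned.
    static Object unparenthesized(Object prunerExpr) {
        return "prune Expression = " + prunerExpr == null ? "" : prunerExpr;
    }

    // Same shape as the proposed fix: the null check applies to prunerExpr only.
    static Object parenthesized(Object prunerExpr) {
        return "prune Expression = " + (prunerExpr == null ? "" : prunerExpr);
    }

    public static void main(String[] args) {
        Object expr = "(ds = '2015-01-01')"; // placeholder for a real expression
        System.out.println(unparenthesized(expr)); // prefix is lost
        System.out.println(parenthesized(expr));   // prefix is kept
        System.out.println(unparenthesized(null)); // null slips through
        System.out.println(parenthesized(null));   // prefix followed by nothing
    }
}
```

With a non-null argument the first form returns the bare expression, and with a null argument it returns null, so the intended empty-string fallback is never used.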
[jira] [Commented] (HIVE-6099) Multi insert does not work properly with distinct count
[ https://issues.apache.org/jira/browse/HIVE-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302613#comment-14302613 ] Navis commented on HIVE-6099: - [~ashutoshc] Could we remove this optimization? I'm sure it has been invalid from the start. > Multi insert does not work properly with distinct count > --- > > Key: HIVE-6099 > URL: https://issues.apache.org/jira/browse/HIVE-6099 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0 >Reporter: Pavan Gadam Manohar >Assignee: Navis > Labels: count, distinct, insert, multi-insert > Attachments: explain_hive_0.10.0.txt, with_disabled.txt, > with_enabled.txt > > > Two rows are needed to reproduce this bug. Here are the steps. > Step 1) Create a table Table_A > CREATE EXTERNAL TABLE Table_A > ( > user string > , type int > ) > PARTITIONED BY (dt string) > ROW FORMAT DELIMITED > FIELDS TERMINATED BY '|' > STORED AS RCFILE > LOCATION '/hive//Table_A'; > Step 2) Scenario: Let us say user tommy belongs to both user types > 111 and 123. Insert 2 records into the table created above. > select * from Table_A; > hive> select * from table_a; > OK > tommy 123 2013-12-02 > tommy 111 2013-12-02 > Step 3) Create 2 destination tables to simulate multi-insert. 
> CREATE EXTERNAL TABLE dest_Table_A > ( > p_date string > , Distinct_Users int > , Type111Users int > , Type123Users int > ) > PARTITIONED BY (dt string) > ROW FORMAT DELIMITED > FIELDS TERMINATED BY '|' > STORED AS RCFILE > LOCATION '/hive//dest_Table_A'; > > CREATE EXTERNAL TABLE dest_Table_B > ( > p_date string > , Distinct_Users int > , Type111Users int > , Type123Users int > ) > PARTITIONED BY (dt string) > ROW FORMAT DELIMITED > FIELDS TERMINATED BY '|' > STORED AS RCFILE > LOCATION '/hive//dest_Table_B'; > Step 4) Multi insert statement > from Table_A a > INSERT OVERWRITE TABLE dest_Table_A PARTITION(dt='2013-12-02') > select a.dt > ,count(distinct a.user) as AllDist > ,count(distinct case when a.type = 111 then a.user else null end) as > Type111User > ,count(distinct case when a.type != 111 then a.user else null end) as > Type123User > group by a.dt > > INSERT OVERWRITE TABLE dest_Table_B PARTITION(dt='2013-12-02') > select a.dt > ,count(distinct a.user) as AllDist > ,count(distinct case when a.type = 111 then a.user else null end) as > Type111User > ,count(distinct case when a.type != 111 then a.user else null end) as > Type123User > group by a.dt > ; > > Step 5) Verify results. > hive> select * from dest_table_a; > OK > 2013-12-02 2 1 1 2013-12-02 > Time taken: 0.116 seconds > hive> select * from dest_table_b; > OK > 2013-12-02 2 1 1 2013-12-02 > Time taken: 0.13 seconds > Conclusion: Hive gives a count of 2 for distinct users although there is > only one distinct user. After trying many datasets, I observed that Hive is > computing Type111Users + Type123Users = DistinctUsers, which is wrong. > hive> select count(distinct a.user) from table_a a; > Gives: > Total MapReduce CPU Time Spent: 4 seconds 350 msec > OK > 1 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
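As a cross-check of the expected numbers in HIVE-6099, the three aggregates can be computed by hand for the two rows in the report. This is a minimal sketch, not Hive code: the class `DistinctCountCheck`, the `Row` holder, and the method names are made up for illustration, and it assumes the usual SQL rule that COUNT(DISTINCT ...) ignores NULLs.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper that recomputes the aggregates of the multi-insert query.
class DistinctCountCheck {

    // One (user, type) row of the reporter's Table_A.
    static final class Row {
        final String user;
        final int type;

        Row(String user, int type) {
            this.user = user;
            this.type = type;
        }
    }

    // Returns {AllDist, Type111User, Type123User} as the query defines them.
    static int[] expectedCounts(List<Row> rows) {
        Set<String> all = new HashSet<>();
        Set<String> type111 = new HashSet<>();
        Set<String> notType111 = new HashSet<>();
        for (Row r : rows) {
            all.add(r.user);                 // count(distinct a.user)
            if (r.type == 111) {
                type111.add(r.user);         // case when a.type = 111
            } else {
                notType111.add(r.user);      // case when a.type != 111
            }
        }
        return new int[] {all.size(), type111.size(), notType111.size()};
    }

    // The two rows inserted in step 2 of the report.
    static List<Row> reportedRows() {
        return Arrays.asList(new Row("tommy", 123), new Row("tommy", 111));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(expectedCounts(reportedRows())));
    }
}
```

For the reported data each of the three counts is 1, so a distinct-user count of 2 matches the sum of the two per-type counts rather than the true number of distinct users.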
[jira] [Commented] (HIVE-9416) Get rid of Extract Operator
[ https://issues.apache.org/jira/browse/HIVE-9416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14300801#comment-14300801 ] Navis commented on HIVE-9416: - +1 > Get rid of Extract Operator > --- > > Key: HIVE-9416 > URL: https://issues.apache.org/jira/browse/HIVE-9416 > Project: Hive > Issue Type: Task > Components: Query Processor >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-9416.1.patch, HIVE-9416.2.patch, HIVE-9416.3.patch, > HIVE-9416.4.patch, HIVE-9416.5.patch, HIVE-9416.6.patch, HIVE-9416.7.patch, > HIVE-9416.patch > > > {{Extract Operator}} has been there for legacy reasons. But there is no > functionality it provides which cant be provided by {{Select Operator}} > Instead of having two operators, one being subset of another we should just > get rid of {{Extract}} and simplify our codebase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-9528) SemanticException: Ambiguous column reference
[ https://issues.apache.org/jira/browse/HIVE-9528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis reassigned HIVE-9528: --- Assignee: Navis > SemanticException: Ambiguous column reference > - > > Key: HIVE-9528 > URL: https://issues.apache.org/jira/browse/HIVE-9528 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0 >Reporter: Yongzhi Chen >Assignee: Navis > > When running the following query: > {code} > SELECT if( COUNT(*) = 0, 'true', 'false' ) as RESULT FROM ( select * from > sim a join sim2 b on a.simstr=b.simstr) app > Error: Error while compiling statement: FAILED: SemanticException [Error > 10007]: Ambiguous column reference simstr in app (state=42000,code=10007) > {code} > This query works fine in hive 0.10 > In the apache trunk, following workaround will work: > {code} > SELECT if(COUNT(*) = 0, 'true', 'false') as RESULT FROM (select a.* from sim > a join sim2 b on a.simstr=b.simstr) app; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9528) SemanticException: Ambiguous column reference
[ https://issues.apache.org/jira/browse/HIVE-9528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14300800#comment-14300800 ] Navis commented on HIVE-9528: - [~ychena], before HIVE-7733, column information was overwritten by the last column with the same name, which could produce invalid results. Anyway, the query you mentioned does not work in MySQL either (it works in psql, though). Can we resolve this as Not a Problem? > SemanticException: Ambiguous column reference > - > > Key: HIVE-9528 > URL: https://issues.apache.org/jira/browse/HIVE-9528 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0 >Reporter: Yongzhi Chen > > When running the following query: > {code} > SELECT if( COUNT(*) = 0, 'true', 'false' ) as RESULT FROM ( select * from > sim a join sim2 b on a.simstr=b.simstr) app > Error: Error while compiling statement: FAILED: SemanticException [Error > 10007]: Ambiguous column reference simstr in app (state=42000,code=10007) > {code} > This query works fine in hive 0.10 > In the apache trunk, following workaround will work: > {code} > SELECT if(COUNT(*) = 0, 'true', 'false') as RESULT FROM (select a.* from sim > a join sim2 b on a.simstr=b.simstr) app; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
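The difference described in this comment can be pictured with a toy resolver. This is not Hive's implementation; `ColumnNameCollision` and both methods are hypothetical, and the only column modeled is `simstr`, which the join condition shows exists in both `sim` (alias `a`) and `sim2` (alias `b`). A last-wins map silently keeps one of the two columns, while a strict resolver rejects the duplicate, which is roughly what the reported error message does.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of resolving "select *" output columns by bare name.
class ColumnNameCollision {

    // Each entry is {tableAlias, columnName}. A later column with the same
    // name overwrites the earlier one without any warning.
    static Map<String, String> resolveLastWins(List<String[]> columns) {
        Map<String, String> byName = new LinkedHashMap<>();
        for (String[] c : columns) {
            byName.put(c[1], c[0]);
        }
        return byName;
    }

    // Rejects a second column with an already registered name.
    static Map<String, String> resolveStrict(List<String[]> columns) {
        Map<String, String> byName = new LinkedHashMap<>();
        for (String[] c : columns) {
            if (byName.putIfAbsent(c[1], c[0]) != null) {
                throw new IllegalStateException("Ambiguous column reference " + c[1]);
            }
        }
        return byName;
    }

    public static void main(String[] args) {
        List<String[]> columns = Arrays.asList(
                new String[] {"a", "simstr"},
                new String[] {"b", "simstr"});
        System.out.println(resolveLastWins(columns)); // only b's column survives
        try {
            resolveStrict(columns);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The workaround quoted in the issue, `select a.*`, avoids the collision because only one `simstr` column reaches the subquery's output.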
[jira] [Updated] (HIVE-9499) hive.limit.query.max.table.partition makes queries fail on non-partitioned tables
[ https://issues.apache.org/jira/browse/HIVE-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9499: Attachment: HIVE-9499.2.patch.txt fixed trivial bug in TableScanStatsRule > hive.limit.query.max.table.partition makes queries fail on non-partitioned > tables > - > > Key: HIVE-9499 > URL: https://issues.apache.org/jira/browse/HIVE-9499 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0 >Reporter: Alexander Kasper >Assignee: Navis > Attachments: HIVE-9499.1.patch.txt, HIVE-9499.2.patch.txt > > > If you use hive.limit.query.max.table.partition to limit the amount of > partitions that can be queried it makes queries on non-partitioned tables > fail. > Example: > {noformat} > CREATE TABLE tmp(test INT); > SELECT COUNT(*) FROM TMP; -- works fine > SET hive.limit.query.max.table.partition=20; > SELECT COUNT(*) FROM TMP; -- generates NPE (FAILED: NullPointerException null) > SET hive.limit.query.max.table.partition=-1; > SELECT COUNT(*) FROM TMP; -- works fine again > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9499) hive.limit.query.max.table.partition makes queries fail on non-partitioned tables
[ https://issues.apache.org/jira/browse/HIVE-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9499: Assignee: Navis Status: Patch Available (was: Open) > hive.limit.query.max.table.partition makes queries fail on non-partitioned > tables > - > > Key: HIVE-9499 > URL: https://issues.apache.org/jira/browse/HIVE-9499 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0 >Reporter: Alexander Kasper >Assignee: Navis > Attachments: HIVE-9499.1.patch.txt > > > If you use hive.limit.query.max.table.partition to limit the amount of > partitions that can be queried it makes queries on non-partitioned tables > fail. > Example: > {noformat} > CREATE TABLE tmp(test INT); > SELECT COUNT(*) FROM TMP; -- works fine > SET hive.limit.query.max.table.partition=20; > SELECT COUNT(*) FROM TMP; -- generates NPE (FAILED: NullPointerException null) > SET hive.limit.query.max.table.partition=-1; > SELECT COUNT(*) FROM TMP; -- works fine again > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9499) hive.limit.query.max.table.partition makes queries fail on non-partitioned tables
[ https://issues.apache.org/jira/browse/HIVE-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9499: Attachment: HIVE-9499.1.patch.txt > hive.limit.query.max.table.partition makes queries fail on non-partitioned > tables > - > > Key: HIVE-9499 > URL: https://issues.apache.org/jira/browse/HIVE-9499 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0 >Reporter: Alexander Kasper > Attachments: HIVE-9499.1.patch.txt > > > If you use hive.limit.query.max.table.partition to limit the amount of > partitions that can be queried it makes queries on non-partitioned tables > fail. > Example: > {noformat} > CREATE TABLE tmp(test INT); > SELECT COUNT(*) FROM TMP; -- works fine > SET hive.limit.query.max.table.partition=20; > SELECT COUNT(*) FROM TMP; -- generates NPE (FAILED: NullPointerException null) > SET hive.limit.query.max.table.partition=-1; > SELECT COUNT(*) FROM TMP; -- works fine again > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9499) hive.limit.query.max.table.partition makes queries fail on non-partitioned tables
[ https://issues.apache.org/jira/browse/HIVE-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9499: Description: If you use hive.limit.query.max.table.partition to limit the amount of partitions that can be queried it makes queries on non-partitioned tables fail. Example: {noformat} CREATE TABLE tmp(test INT); SELECT COUNT(*) FROM TMP; -- works fine SET hive.limit.query.max.table.partition=20; SELECT COUNT(*) FROM TMP; -- generates NPE (FAILED: NullPointerException null) SET hive.limit.query.max.table.partition=-1; SELECT COUNT(*) FROM TMP; -- works fine again {noformat} was: If you use hive.limit.query.max.table.partition to limit the amount of partitions that can be queried it makes queries on non-partitioned tables fail. Example: CREATE TABLE tmp(test INT); SELECT COUNT(*) FROM TMP; -- works fine SET hive.limit.query.max.table.partition=20; SELECT COUNT(*) FROM TMP; -- generates NPE (FAILED: NullPointerException null) SET hive.limit.query.max.table.partition=-1; SELECT COUNT(*) FROM TMP; -- works fine again > hive.limit.query.max.table.partition makes queries fail on non-partitioned > tables > - > > Key: HIVE-9499 > URL: https://issues.apache.org/jira/browse/HIVE-9499 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0 >Reporter: Alexander Kasper > > If you use hive.limit.query.max.table.partition to limit the amount of > partitions that can be queried it makes queries on non-partitioned tables > fail. > Example: > {noformat} > CREATE TABLE tmp(test INT); > SELECT COUNT(*) FROM TMP; -- works fine > SET hive.limit.query.max.table.partition=20; > SELECT COUNT(*) FROM TMP; -- generates NPE (FAILED: NullPointerException null) > SET hive.limit.query.max.table.partition=-1; > SELECT COUNT(*) FROM TMP; -- works fine again > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9513) NULL POINTER EXCEPTION
[ https://issues.apache.org/jira/browse/HIVE-9513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9513: Attachment: HIVE-9513.1.patch.txt > NULL POINTER EXCEPTION > -- > > Key: HIVE-9513 > URL: https://issues.apache.org/jira/browse/HIVE-9513 > Project: Hive > Issue Type: Bug > Components: CLI >Affects Versions: 0.13.1 >Reporter: ErwanMAS > Attachments: HIVE-9513.1.patch.txt > > > NPE during parsing of: > {noformat} > select * from ( > select * from ( select 1 as id , "foo" as str_1 from staging.dual ) f > union all > select * from ( select 2 as id , "bar" as str_2 from staging.dual ) g > ) e ; > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9513) NULL POINTER EXCEPTION
[ https://issues.apache.org/jira/browse/HIVE-9513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9513: Assignee: Navis Status: Patch Available (was: Open) > NULL POINTER EXCEPTION > -- > > Key: HIVE-9513 > URL: https://issues.apache.org/jira/browse/HIVE-9513 > Project: Hive > Issue Type: Bug > Components: CLI >Affects Versions: 0.13.1 >Reporter: ErwanMAS >Assignee: Navis > Attachments: HIVE-9513.1.patch.txt > > > NPE during parsing of: > {noformat} > select * from ( > select * from ( select 1 as id , "foo" as str_1 from staging.dual ) f > union all > select * from ( select 2 as id , "bar" as str_2 from staging.dual ) g > ) e ; > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9507) Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls
[ https://issues.apache.org/jira/browse/HIVE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9507: Assignee: Navis Status: Patch Available (was: Open) > Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls > > > Key: HIVE-9507 > URL: https://issues.apache.org/jira/browse/HIVE-9507 > Project: Hive > Issue Type: Bug > Components: Query Processor, UDF >Affects Versions: 0.14.0 > Environment: hdp 2.2 > Windows server 2012 R2 64-bit >Reporter: Moustafa Aboul Atta >Assignee: Navis > Attachments: HIVE-9507.1.patch.txt > > > I have tweets stored with avro on hdfs with the default twitter status > (tweet) schema. > There's an object called "entities" that contains arrays of structs. > When I run > > {{SELECT mytable.*}} > {{FROM tweets}} > {{LATERAL VIEW INLINE(entities.media) mytable}} > I get the exception found hereunder, however if I add > {{WHERE entities.media IS NOT NULL}} > it runs perfectly. > Here's the partial log: > 2015-01-29 10:15:00,879 INFO SessionState (SessionState.java:printInfo(824)) > - Status: Running (Executing on YARN cluster with App id > application_1422267635031_0618) > 2015-01-29 10:15:00,879 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: -/- > 2015-01-29 10:15:02,526 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0/13 > 2015-01-29 10:15:05,551 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0/13 > 2015-01-29 10:15:08,722 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0/13 > 2015-01-29 10:15:12,095 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0/13 > 2015-01-29 10:15:12,354 INFO log.PerfLogger > (PerfLogger.java:PerfLogBegin(108)) - from=org.apache.hadoop.hive.ql.exec.tez.TezJobMonitor> > 2015-01-29 10:15:12,354 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+5)/13 > 2015-01-29 10:15:12,557 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6)/13 > 2015-01-29 10:15:15,691 
INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6)/13 > 2015-01-29 10:15:18,892 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-1)/13 > 2015-01-29 10:15:19,094 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-3)/13 > 2015-01-29 10:15:19,304 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-5)/13 > 2015-01-29 10:15:19,507 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-6)/13 > 2015-01-29 10:15:22,641 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-6)/13 > 2015-01-29 10:15:24,704 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-8)/13 > 2015-01-29 10:15:27,735 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-8)/13 > 2015-01-29 10:15:30,957 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-8)/13 > 2015-01-29 10:15:34,095 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-8)/13 > 2015-01-29 10:15:35,138 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-9)/13 > 2015-01-29 10:15:36,503 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-10)/13 > 2015-01-29 10:15:36,710 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-11)/13 > 2015-01-29 10:15:37,971 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-12)/13 > 2015-01-29 10:15:39,800 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-13)/13 > 2015-01-29 10:15:41,175 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-14)/13 > 2015-01-29 10:15:44,414 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-14)/13 > 2015-01-29 10:15:45,447 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-15)/13 > 2015-01-29 10:15:47,413 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-16)/13 > 2015-01-29 10:15:47,618 INFO SessionState (SessionState.java:printInfo(824)) > - 
Map 1: 0(+6,-17)/13 > 2015-01-29 10:15:49,568 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-18)/13 > 2015-01-29 10:15:51,099 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+0,-19)/13 > 2015-01-29 10:15:51,331 ERROR SessionState > (SessionState.java:printError(833)) - Status: Failed > 2015-01-29 10:15:51,417 ERROR SessionState > (SessionState.java:printError(833)) - Vertex failed, vertexName=Map 1, > vertexId=vertex_1422267635031_0618_1_00, diagnostics=[Task failed, > taskId=task_1422267635031_0618_1_00_00, diagnostics=[TaskAttempt 0 > failed, info=[
[jira] [Updated] (HIVE-9507) Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls
[ https://issues.apache.org/jira/browse/HIVE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9507: Attachment: HIVE-9507.1.patch.txt > Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls > > > Key: HIVE-9507 > URL: https://issues.apache.org/jira/browse/HIVE-9507 > Project: Hive > Issue Type: Bug > Components: Query Processor, UDF >Affects Versions: 0.14.0 > Environment: hdp 2.2 > Windows server 2012 R2 64-bit >Reporter: Moustafa Aboul Atta > Attachments: HIVE-9507.1.patch.txt > > > I have tweets stored with avro on hdfs with the default twitter status > (tweet) schema. > There's an object called "entities" that contains arrays of structs. > When I run > > {{SELECT mytable.*}} > {{FROM tweets}} > {{LATERAL VIEW INLINE(entities.media) mytable}} > I get the exception found hereunder, however if I add > {{WHERE entities.media IS NOT NULL}} > it runs perfectly. > Here's the partial log: > 2015-01-29 10:15:00,879 INFO SessionState (SessionState.java:printInfo(824)) > - Status: Running (Executing on YARN cluster with App id > application_1422267635031_0618) > 2015-01-29 10:15:00,879 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: -/- > 2015-01-29 10:15:02,526 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0/13 > 2015-01-29 10:15:05,551 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0/13 > 2015-01-29 10:15:08,722 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0/13 > 2015-01-29 10:15:12,095 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0/13 > 2015-01-29 10:15:12,354 INFO log.PerfLogger > (PerfLogger.java:PerfLogBegin(108)) - from=org.apache.hadoop.hive.ql.exec.tez.TezJobMonitor> > 2015-01-29 10:15:12,354 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+5)/13 > 2015-01-29 10:15:12,557 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6)/13 > 2015-01-29 10:15:15,691 INFO SessionState 
(SessionState.java:printInfo(824)) > - Map 1: 0(+6)/13 > 2015-01-29 10:15:18,892 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-1)/13 > 2015-01-29 10:15:19,094 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-3)/13 > 2015-01-29 10:15:19,304 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-5)/13 > 2015-01-29 10:15:19,507 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-6)/13 > 2015-01-29 10:15:22,641 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-6)/13 > 2015-01-29 10:15:24,704 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-8)/13 > 2015-01-29 10:15:27,735 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-8)/13 > 2015-01-29 10:15:30,957 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-8)/13 > 2015-01-29 10:15:34,095 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-8)/13 > 2015-01-29 10:15:35,138 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-9)/13 > 2015-01-29 10:15:36,503 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-10)/13 > 2015-01-29 10:15:36,710 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-11)/13 > 2015-01-29 10:15:37,971 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-12)/13 > 2015-01-29 10:15:39,800 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-13)/13 > 2015-01-29 10:15:41,175 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-14)/13 > 2015-01-29 10:15:44,414 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-14)/13 > 2015-01-29 10:15:45,447 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-15)/13 > 2015-01-29 10:15:47,413 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-16)/13 > 2015-01-29 10:15:47,618 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 
0(+6,-17)/13 > 2015-01-29 10:15:49,568 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+6,-18)/13 > 2015-01-29 10:15:51,099 INFO SessionState (SessionState.java:printInfo(824)) > - Map 1: 0(+0,-19)/13 > 2015-01-29 10:15:51,331 ERROR SessionState > (SessionState.java:printError(833)) - Status: Failed > 2015-01-29 10:15:51,417 ERROR SessionState > (SessionState.java:printError(833)) - Vertex failed, vertexName=Map 1, > vertexId=vertex_1422267635031_0618_1_00, diagnostics=[Task failed, > taskId=task_1422267635031_0618_1_00_00, diagnostics=[TaskAttempt 0 > failed, info=[Error: Failure while running task:java.lang.RuntimeExc
[jira] [Updated] (HIVE-9486) Use session classloader instead of application loader
[ https://issues.apache.org/jira/browse/HIVE-9486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9486: Attachment: HIVE-9486.2.patch.txt > Use session classloader instead of application loader > - > > Key: HIVE-9486 > URL: https://issues.apache.org/jira/browse/HIVE-9486 > Project: Hive > Issue Type: Bug > Components: Query Processor >Reporter: Navis >Assignee: Navis >Priority: Minor > Attachments: HIVE-9486.1.patch.txt, HIVE-9486.2.patch.txt > > > From http://www.mail-archive.com/dev@hive.apache.org/msg107615.html > Looks reasonable -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9228) Problem with subquery using windowing functions
[ https://issues.apache.org/jira/browse/HIVE-9228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294595#comment-14294595 ] Navis commented on HIVE-9228: - Yes, when a PTF column is not selected, we should prune the function itself in the PTF operator. But I considered it a marginal case to not select a column that was calculated at heavy cost. And the select operator would be removed by IdentityProjectRemover if it's not needed. By the way, could you review HIVE-9138 first? It's hard to debug anything on PTF without any explain result. > Problem with subquery using windowing functions > --- > > Key: HIVE-9228 > URL: https://issues.apache.org/jira/browse/HIVE-9228 > Project: Hive > Issue Type: Bug > Components: PTF-Windowing >Affects Versions: 0.13.1 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-9228.1.patch.txt, HIVE-9228.2.patch.txt, > create_table_tab1.sql, tab1.csv > > Original Estimate: 96h > Remaining Estimate: 96h > > The following query with window functions failed. The internal query works > fine. > select col1, col2, col3 from (select col1,col2, col3, count(case when col4=1 > then 1 end ) over (partition by col1, col2) as col5, row_number() over > (partition by col1, col2 order by col4) as col6 from tab1) t; > HIVE generates an execution plan with 2 jobs. > 1. The first job basically calculates the window function for col5. > 2. The second job calculates the window function for col6 and outputs. > The plan says the first job outputs the columns (col1, col2, col3, col4) to a > tmp file since only these columns are used in the later stage. However, the PTF > operator for the first job outputs (_wcol0, col1, col2, col3, col4) with > _wcol0 as the result of the window function even though it's not used. > In the second job, the map operator still reads the 4 columns (col1, col2, > col3, col4) from the temp file using the plan. That causes the exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9278) Cached expression feature broken in one case
[ https://issues.apache.org/jira/browse/HIVE-9278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9278: Fix Version/s: 0.14.1 > Cached expression feature broken in one case > > > Key: HIVE-9278 > URL: https://issues.apache.org/jira/browse/HIVE-9278 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.14.0 >Reporter: Matt McCline >Assignee: Navis >Priority: Blocker > Fix For: 0.15.0, 0.14.1, 1.0.0 > > Attachments: HIVE-9278.1.patch.txt > > > Different query result depending on whether hive.cache.expr.evaluation is > true or false. When true, no query results are produced (this is wrong). > The q file: > {noformat} > set hive.cache.expr.evaluation=true; > CREATE TABLE cache_expr_repro (date_str STRING); > LOAD DATA LOCAL INPATH '../../data/files/cache_expr_repro.txt' INTO TABLE > cache_expr_repro; > SELECT MONTH(date_str) AS `mon`, CAST((MONTH(date_str) - 1) / 3 + 1 AS int) > AS `quarter`, YEAR(date_str) AS `year` FROM cache_expr_repro WHERE > ((CAST((MONTH(date_str) - 1) / 3 + 1 AS int) = 1) AND (YEAR(date_str) = > 2015)) GROUP BY MONTH(date_str), CAST((MONTH(date_str) - 1) / 3 + 1 AS int), > YEAR(date_str) ; > {noformat} > cache_expr_repro.txt > {noformat} > 2015-01-01 00:00:00 > 2015-02-01 00:00:00 > 2015-01-01 00:00:00 > 2015-02-01 00:00:00 > 2015-01-01 00:00:00 > 2015-01-01 00:00:00 > 2015-02-01 00:00:00 > 2015-02-01 00:00:00 > 2015-01-01 00:00:00 > 2015-01-01 00:00:00 > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
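To see what the HIVE-9278 query should return, the quarter expression can be evaluated by hand. The sketch below assumes that Hive's `/` operator yields a double for integer operands and that CAST(... AS int) truncates; the class `QuarterExpr` and its methods are hypothetical helpers, and the date strings are parsed by fixed position rather than with Hive's MONTH and YEAR functions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical re-evaluation of the query's expressions outside Hive.
class QuarterExpr {

    // CAST((MONTH(date_str) - 1) / 3 + 1 AS int), assuming double division.
    static int quarter(int month) {
        return (int) ((month - 1) / 3.0 + 1);
    }

    // Applies the WHERE clause (quarter = 1 AND year = 2015) and collects the
    // distinct GROUP BY keys as "mon,quarter,year" strings.
    static Set<String> expectedGroups(List<String> dateStrs) {
        Set<String> groups = new TreeSet<>();
        for (String s : dateStrs) {
            int year = Integer.parseInt(s.substring(0, 4));
            int month = Integer.parseInt(s.substring(5, 7));
            int q = quarter(month);
            if (q == 1 && year == 2015) {
                groups.add(month + "," + q + "," + year);
            }
        }
        return groups;
    }

    // Contents of cache_expr_repro.txt as listed in the issue.
    static List<String> reproFile() {
        return Arrays.asList(
                "2015-01-01 00:00:00", "2015-02-01 00:00:00",
                "2015-01-01 00:00:00", "2015-02-01 00:00:00",
                "2015-01-01 00:00:00", "2015-01-01 00:00:00",
                "2015-02-01 00:00:00", "2015-02-01 00:00:00",
                "2015-01-01 00:00:00", "2015-01-01 00:00:00");
    }

    public static void main(String[] args) {
        System.out.println(expectedGroups(reproFile()));
    }
}
```

Under these assumptions every row passes the filter and two groups remain (January and February of 2015, both in quarter 1), which is consistent with the report that an empty result is wrong.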
[jira] [Commented] (HIVE-9459) Concat plus date functions appear to be broken in 0.14
[ https://issues.apache.org/jira/browse/HIVE-9459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294584#comment-14294584 ]

Navis commented on HIVE-9459:
-

[~jdere] Yes, looks like the same issue.

> Concat plus date functions appear to be broken in 0.14
> --
>
> Key: HIVE-9459
> URL: https://issues.apache.org/jira/browse/HIVE-9459
> Project: Hive
> Issue Type: Bug
> Reporter: Nathan Lande
>
> In the below example I create year_month and month_year vars. These each should be mm and mm integer strings but it appears as if hive is calling the first function twice such that it is returning and .
>
> hive> select
>     > month(a.joined) month,
>     > year(a.joined) year,
>     > concat(cast(year(a.joined) as string),cast(month(a.joined) as string)) year_month,
>     > concat(cast(month(a.joined) as string),cast(year(a.joined) as string)) month_year
>     > from a limit 20;
> OK
> month   year    year_month      month_year
> 7       2014    20142014        77
> 7       2014    20142014        77
> 7       2014    20142014        77
> 7       2014    20142014        77
> 7       2014    20142014        77
> 7       2014    20142014        77
> 7       2014    20142014        77
> 7       2014    20142014        77
> 7       2014    20142014        77
> 7       2014    20142014        77
> 7       2014    20142014        77
> 7       2014    20142014        77
> 7       2014    20142014        77
> 7       2014    20142014        77
> 7       2014    20142014        77
> 7       2014    20142014        77
> 7       2014    20142014        77
> 7       2014    20142014        77
> 7       2014    20142014        77
> 7       2014    20142014        77
> Time taken: 0.109 seconds, Fetched: 20 row(s)
>
> Other users appear to experience similar issues in this Stack Overflow question: http://stackoverflow.com/questions/27740866/convert-date-to-decimal-format-in-hive .
>
> I tested this in 0.13 and 0.14 and it does not appear to be an issue in 0.13.
> I looked around and could not find a similar issue so hopefully this is not a duplicate.
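Editorial note on HIVE-9459: for a row with month 7 and year 2014, plain string concatenation of the two casts should give 20147 for year_month and 72014 for month_year, whereas the reported output appears to repeat the first argument in each column. The Python sketch below only illustrates the intended semantics of the two concat expressions; it is not Hive code, and the function name concat_year_month is hypothetical.

```python
def concat_year_month(year, month):
    """Mirror the two concat(cast(... as string), cast(... as string)) calls."""
    year_month = str(year) + str(month)   # concat(cast(year), cast(month))
    month_year = str(month) + str(year)   # concat(cast(month), cast(year))
    return year_month, month_year


if __name__ == "__main__":
    ym, my = concat_year_month(2014, 7)
    print(ym, my)
```

Neither expected string matches the values shown in the reported result, which is consistent with the reporter's reading that the first function is evaluated twice.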
[jira] [Updated] (HIVE-9486) Use session classloader instead of application loader
[ https://issues.apache.org/jira/browse/HIVE-9486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-9486:

Status: Patch Available (was: Open)

> Use session classloader instead of application loader
> -
>
> Key: HIVE-9486
> URL: https://issues.apache.org/jira/browse/HIVE-9486
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Reporter: Navis
> Assignee: Navis
> Priority: Minor
> Attachments: HIVE-9486.1.patch.txt
>
> From http://www.mail-archive.com/dev@hive.apache.org/msg107615.html
> Looks reasonable
[jira] [Updated] (HIVE-9486) Use session classloader instead of application loader
[ https://issues.apache.org/jira/browse/HIVE-9486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-9486:

Attachment: HIVE-9486.1.patch.txt

> Use session classloader instead of application loader
> -
>
> Key: HIVE-9486
> URL: https://issues.apache.org/jira/browse/HIVE-9486
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Reporter: Navis
> Assignee: Navis
> Priority: Minor
> Attachments: HIVE-9486.1.patch.txt
>
> From http://www.mail-archive.com/dev@hive.apache.org/msg107615.html
> Looks reasonable