[jira] [Updated] (HIVE-9495) Map Side aggregation affecting map performance
[ https://issues.apache.org/jira/browse/HIVE-9495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9495: Attachment: HIVE-9495.2.patch.txt Best effort to estimate ndv for hash aggregation Map Side aggregation affecting map performance -- Key: HIVE-9495 URL: https://issues.apache.org/jira/browse/HIVE-9495 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.14.0 Environment: RHEL 6.4 Hortonworks Hadoop 2.2 Reporter: Anand Sridharan Attachments: HIVE-9495.1.patch.txt, HIVE-9495.2.patch.txt, profiler_screenshot.PNG When trying to run a simple aggregation query with hive.map.aggr=true, map tasks take a lot of time in Hive 0.14 as against with hive.map.aggr=false. e.g. Consider the query: {code} INSERT OVERWRITE TABLE lineitem_tgt_agg select alias.a0 as a0, alias.a2 as a1, alias.a1 as a2, alias.a3 as a3, alias.a4 as a4 from ( select alias.a0 as a0, SUM(alias.a1) as a1, SUM(alias.a2) as a2, SUM(alias.a3) as a3, SUM(alias.a4) as a4 from ( select lineitem_sf500.l_orderkey as a0, CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * (1 - lineitem_sf500.l_discount) * (1 + lineitem_sf500.l_tax) as double) as a1, lineitem_sf500.l_quantity as a2, CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * lineitem_sf500.l_discount as double) as a3, CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * lineitem_sf500.l_tax as double) as a4 from lineitem_sf500 ) alias group by alias.a0 ) alias; {code} The above query was run with ~376GB of data / ~3billion records in the source. It takes ~10 minutes with hive.map.aggr=false. With map side aggregation set to true, the map tasks don't complete even after an hour. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
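For readers unfamiliar with why map-side hash aggregation can hurt when the group-by key has many distinct values, here is a minimal sketch. It is not Hive's GroupByOperator and not the attached patch. The class name, the check interval, and the reduction threshold are assumptions for illustration, modeled loosely on the idea behind hive.groupby.mapaggr.checkinterval and hive.map.aggr.hash.min.reduction.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified sketch of map-side hash aggregation (SUM per key).
// Hypothetical class; not Hive's implementation.
final class MapSideSum {
    private final Map<Long, Double> partial = new HashMap<>();
    private final long checkInterval;   // rows to observe before deciding
    private final double minReduction;  // keep hashing only if keys/rows <= this
    private long rows = 0;
    private boolean hashing = true;
    private long emitted = 0;           // rows handed on to the shuffle

    MapSideSum(long checkInterval, double minReduction) {
        this.checkInterval = checkInterval;
        this.minReduction = minReduction;
    }

    void process(long key, double value) {
        rows++;
        if (!hashing) {
            emitted++;                   // pass-through: one output row per input row
            return;
        }
        partial.merge(key, value, Double::sum);
        if (rows == checkInterval && partial.size() > minReduction * rows) {
            // Too many distinct keys: flush the partial sums and stop hashing.
            emitted += partial.size();
            partial.clear();
            hashing = false;
        }
    }

    // Flushes whatever is still buffered and returns the total rows emitted.
    long close() {
        emitted += partial.size();
        partial.clear();
        return emitted;
    }

    // Feeds `rows` rows with `distinctKeys` different keys and reports the shuffle size.
    static long shuffled(int rows, int distinctKeys) {
        MapSideSum agg = new MapSideSum(100, 0.5);
        for (int i = 0; i < rows; i++) {
            agg.process(i % distinctKeys, 1.0);
        }
        return agg.close();
    }

    public static void main(String[] args) {
        System.out.println("few keys:  " + shuffled(1000, 10));
        System.out.println("many keys: " + shuffled(1000, 1000));
    }
}
```

With few distinct keys the hash table collapses the input. With nearly one key per row it only adds hashing cost, which is why an estimate of the number of distinct values matters before choosing this path.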
[jira] [Commented] (HIVE-10816) NPE in ExecDriver::handleSampling when submitted via child JVM
[ https://issues.apache.org/jira/browse/HIVE-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14580013#comment-14580013 ] Navis commented on HIVE-10816: -- [~lirui] I don't know why I've not been notified but here is my late +1 NPE in ExecDriver::handleSampling when submitted via child JVM -- Key: HIVE-10816 URL: https://issues.apache.org/jira/browse/HIVE-10816 Project: Hive Issue Type: Bug Reporter: Rui Li Assignee: Rui Li Fix For: 1.3.0, 2.0.0 Attachments: HIVE-10816.1.patch, HIVE-10816.1.patch When {{hive.exec.submitviachild = true}}, parallel order by fails with NPE and falls back to single-reducer mode. Stack trace: {noformat} 2015-05-25 08:41:04,446 ERROR [main]: mr.ExecDriver (ExecDriver.java:execute(386)) - Sampling error java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.handleSampling(ExecDriver.java:513) at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:379) at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:750) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10890) Provide implementable engine selector
[ https://issues.apache.org/jira/browse/HIVE-10890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14580028#comment-14580028 ] Navis commented on HIVE-10890: -- Right, that should also be checked. The included implementation was only meant to show the intention. I'll think of a way to verify that the engine is configured properly. Anyway, I don't know why I'm not getting notifications from the Hive community these days. Provide implementable engine selector - Key: HIVE-10890 URL: https://issues.apache.org/jira/browse/HIVE-10890 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Trivial Hive now supports three kinds of engines. It would be good to have an automatic engine selector, so that the execution engine does not have to be set explicitly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-11002) Memory leakage on unsafe aggregation path with empty input
[ https://issues.apache.org/jira/browse/HIVE-11002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis resolved HIVE-11002. -- Resolution: Invalid Sorry, this was meant to be filed in Spark. Memory leakage on unsafe aggregation path with empty input -- Key: HIVE-11002 URL: https://issues.apache.org/jira/browse/HIVE-11002 Project: Hive Issue Type: Bug Components: SQL Reporter: Navis Assignee: Navis Priority: Minor Currently, the unsafe-based hash is released on the 'next' call, but if the input is empty, 'next' is never called. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11518) Provide interface to adjust required resource for tez tasks
[ https://issues.apache.org/jira/browse/HIVE-11518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14706695#comment-14706695 ]

Navis commented on HIVE-11518:

[~hagleitn] Any interest in this? I was able to assign 4G to the 20+ join map tasks by assigning just 1G to the other simple tasks.

Provide interface to adjust required resource for tez tasks

Key: HIVE-11518
URL: https://issues.apache.org/jira/browse/HIVE-11518
Project: Hive
Issue Type: Improvement
Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
Attachments: HIVE-11518.1.patch.txt

Resource requirements vary from task to task, but currently they are fixed to one value (via hive.tez.container.size). It would be good to tailor resource requirements to the expected work. The suggested interface is quite simple.
{code}
public interface ResourceCalculator {
  Resource adjust(Resource resource, MapWork mapWork);
  Resource adjust(Resource resource, ReduceWork reduceWork);
}
{code}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
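To make the suggested ResourceCalculator interface concrete, here is a minimal sketch of one possible implementation. It is not taken from the attached patch. Resource, MapWork and ReduceWork below are hypothetical stand-ins so the example compiles on its own (the proposal would use the YARN and Hive classes of those names), and the joinCount field and the memory sizes are assumptions for illustration.

```java
// Hypothetical stand-ins so the sketch compiles on its own. The proposed
// interface would use YARN's Resource and Hive's MapWork/ReduceWork instead.
final class Resource {
    final int memoryMb;
    Resource(int memoryMb) { this.memoryMb = memoryMb; }
}

final class MapWork {
    final int joinCount; // hypothetical field, for illustration only
    MapWork(int joinCount) { this.joinCount = joinCount; }
}

final class ReduceWork { }

// Interface as quoted in the issue description.
interface ResourceCalculator {
    Resource adjust(Resource resource, MapWork mapWork);
    Resource adjust(Resource resource, ReduceWork reduceWork);
}

// Example policy: more memory for map work that contains joins,
// less for everything else. The sizes are arbitrary.
final class JoinAwareCalculator implements ResourceCalculator {
    @Override
    public Resource adjust(Resource resource, MapWork mapWork) {
        return new Resource(mapWork.joinCount > 0 ? 4096 : 1024);
    }

    @Override
    public Resource adjust(Resource resource, ReduceWork reduceWork) {
        return resource; // leave reducers at the configured default
    }
}

final class ResourceCalculatorDemo {
    public static void main(String[] args) {
        ResourceCalculator calc = new JoinAwareCalculator();
        Resource configured = new Resource(2048);
        System.out.println(calc.adjust(configured, new MapWork(3)).memoryMb);
        System.out.println(calc.adjust(configured, new MapWork(0)).memoryMb);
        System.out.println(calc.adjust(configured, new ReduceWork()).memoryMb);
    }
}
```

Passing the configured Resource in and returning a possibly different one lets an implementation fall back to the global default simply by returning its argument.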
[jira] [Commented] (HIVE-11515) Still some possible race condition in DynamicPartitionPruner
[ https://issues.apache.org/jira/browse/HIVE-11515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14706694#comment-14706694 ] Navis commented on HIVE-11515: -- [~sseth] Sorry for the delay. I saw this occasionally a month ago in a PoC scenario and made this patch in a hurry. After applying it, the problem went away and I simply forgot about it (there were so many issues). So I cannot remember what the exact problem was, but I believe it was a query hang. Sorry for my vague description. Still some possible race condition in DynamicPartitionPruner Key: HIVE-11515 URL: https://issues.apache.org/jira/browse/HIVE-11515 Project: Hive Issue Type: Bug Components: Query Processor, Tez Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-11515.1.patch.txt Even after HIVE-9976, I could sometimes see a race condition in DPP. It is hard to reproduce, but it seemed related to the fact that prune() is called by a thread pool. With some delay in the queue, events from fast tasks arrive before prune() is called. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
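The ordering problem described above (events arriving before prune() has started on a pool thread) can be illustrated with a small sketch. This is not the HIVE-11515 patch and not Hive's DynamicPartitionPruner. BufferingPruner and its methods are hypothetical names, and buffering events in a queue is just one assumed way to make the arrival order irrelevant.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a pruner; not Hive's DynamicPartitionPruner.
final class BufferingPruner {
    private final BlockingQueue<String> events = new LinkedBlockingQueue<>();
    private final int expectedEvents;

    BufferingPruner(int expectedEvents) {
        this.expectedEvents = expectedEvents;
    }

    // May be called from any thread, before or after prune() has started.
    void addEvent(String event) {
        events.add(event);
    }

    // Consumes buffered events until all expected ones were seen or the timeout elapses.
    List<String> prune(long timeoutMillis) throws InterruptedException {
        List<String> seen = new ArrayList<>();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (seen.size() < expectedEvents) {
            long remaining = deadline - System.nanoTime();
            if (remaining <= 0) {
                break;
            }
            String event = events.poll(remaining, TimeUnit.NANOSECONDS);
            if (event == null) {
                break;
            }
            seen.add(event);
        }
        return seen;
    }

    // Delivers two events first, then starts prune() on a pool thread.
    static List<String> runEventsBeforePrune() {
        BufferingPruner pruner = new BufferingPruner(2);
        pruner.addEvent("task-1");
        pruner.addEvent("task-2");
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<List<String>> result = pool.submit(() -> pruner.prune(1000));
            return result.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(runEventsBeforePrune());
    }
}
```

Because the events are queued rather than handed to state that prune() sets up, a delayed start of the pool thread does not lose them.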
[jira] [Commented] (HIVE-10890) Provide implementable engine selector
[ https://issues.apache.org/jira/browse/HIVE-10890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14708627#comment-14708627 ] Navis commented on HIVE-10890: -- [~nemon] It's named a selector, but you can also implement a more sophisticated strategy in it. Provide implementable engine selector - Key: HIVE-10890 URL: https://issues.apache.org/jira/browse/HIVE-10890 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Trivial Hive now supports three kinds of engines. It would be good to have an automatic engine selector, so that the execution engine does not have to be set explicitly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10890) Provide implementable engine selector
[ https://issues.apache.org/jira/browse/HIVE-10890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-10890: - Attachment: HIVE-10890.1.patch.txt Provide implementable engine selector - Key: HIVE-10890 URL: https://issues.apache.org/jira/browse/HIVE-10890 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-10890.1.patch.txt Hive now supports three kinds of engines. It would be good to have an automatic engine selector, so that the execution engine does not have to be set explicitly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8319) Add configuration for custom services in hiveserver2
[ https://issues.apache.org/jira/browse/HIVE-8319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-8319:
Attachment: HIVE-8319.4.patch.txt

Add configuration for custom services in hiveserver2

Key: HIVE-8319
URL: https://issues.apache.org/jira/browse/HIVE-8319
Project: Hive
Issue Type: Improvement
Components: HiveServer2
Reporter: Navis
Assignee: Navis
Priority: Minor
Attachments: HIVE-8319.1.patch.txt, HIVE-8319.2.patch.txt, HIVE-8319.3.patch.txt, HIVE-8319.4.patch.txt

NO PRECOMMIT TESTS

Register services to hiveserver2, for example,
{noformat}
<property>
  <name>hive.server2.service.classes</name>
  <value>com.nexr.hive.service.HiveStatus,com.nexr.hive.service.AzkabanService</value>
</property>
<property>
  <name>azkaban.ssl.port</name>
  <name>...</name>
</property>
{noformat}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8319) Add configuration for custom services in hiveserver2
[ https://issues.apache.org/jira/browse/HIVE-8319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14708669#comment-14708669 ]

Navis commented on HIVE-8319:

[~thejas] Are you still interested in getting this into Hive?

Add configuration for custom services in hiveserver2

Key: HIVE-8319
URL: https://issues.apache.org/jira/browse/HIVE-8319
Project: Hive
Issue Type: Improvement
Components: HiveServer2
Reporter: Navis
Assignee: Navis
Priority: Minor
Attachments: HIVE-8319.1.patch.txt, HIVE-8319.2.patch.txt, HIVE-8319.3.patch.txt, HIVE-8319.4.patch.txt

NO PRECOMMIT TESTS

Register services to hiveserver2, for example,
{noformat}
<property>
  <name>hive.server2.service.classes</name>
  <value>com.nexr.hive.service.HiveStatus,com.nexr.hive.service.AzkabanService</value>
</property>
<property>
  <name>azkaban.ssl.port</name>
  <name>...</name>
</property>
{noformat}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11518) Provide interface to adjust required resource for tez tasks
[ https://issues.apache.org/jira/browse/HIVE-11518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-11518: - Attachment: HIVE-11518.1.patch.txt Provide interface to adjust required resource for tez tasks --- Key: HIVE-11518 URL: https://issues.apache.org/jira/browse/HIVE-11518 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-11518.1.patch.txt Resource requirements for each tasks are varied but currently it's fixed to one value(via hive.tez.container.size). It would be good to customize resource requirements appropriate to expected work. Suggested interface is quite simple. {code} public interface ResourceCalculator { Resource adjust(Resource resource, MapWork mapWork); Resource adjust(Resource resource, ReduceWork reduceWork); } {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11515) Still some possible race condition in DynamicPartitionPruner
[ https://issues.apache.org/jira/browse/HIVE-11515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-11515: - Attachment: HIVE-11515.1.patch.txt Still some possible race condition in DynamicPartitionPruner Key: HIVE-11515 URL: https://issues.apache.org/jira/browse/HIVE-11515 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-11515.1.patch.txt Even after HIVE-9976, I could see race condition in DPP sometimes. Hard to reproduce but it seemed related to the fact that prune() is called by thread-pool. With some delay in queue, events from fast tasks are arrived before prune() is called. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11515) Still some possible race condition in DynamicPartitionPruner
[ https://issues.apache.org/jira/browse/HIVE-11515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-11515: - Description: Even after HIVE-9976, I could see race condition in DPP sometimes. Hard to reproduce but it seemed related to the fact that prune() is called by thread-pool. With some delay in queue, events from fast tasks are arrived before prune() is called. (was: Even after HIVE-9976, I could see race condition in DPP sometimes. Hard to reproduce but it seemed related to the fact that init() is called by thread-pool. With some delay in queue, events from fast tasks are arrived before init() is called.) Still some possible race condition in DynamicPartitionPruner Key: HIVE-11515 URL: https://issues.apache.org/jira/browse/HIVE-11515 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-11515.1.patch.txt Even after HIVE-9976, I could see race condition in DPP sometimes. Hard to reproduce but it seemed related to the fact that prune() is called by thread-pool. With some delay in queue, events from fast tasks are arrived before prune() is called. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11506) Casting varchar/char type to string cannot be vectorized
[ https://issues.apache.org/jira/browse/HIVE-11506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-11506: - Attachment: HIVE-11506.2.patch.txt Updated golden files Casting varchar/char type to string cannot be vectorized Key: HIVE-11506 URL: https://issues.apache.org/jira/browse/HIVE-11506 Project: Hive Issue Type: Improvement Components: Vectorization Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-11506.1.patch.txt, HIVE-11506.2.patch.txt It's not defined in vectorization context. {code} explain select cast(cast(cstring1 as varchar(10)) as string) x from alltypesorc order by x; {code} Mapper is not vectorized by exception, {noformat} 015-08-10 17:02:08,003 INFO [main]: physical.Vectorizer (Vectorizer.java:validateExprNodeDesc(1299)) - Failed to vectorize org.apache.hadoop.hive.ql.metadata.HiveException: Unhandled cast input type: varchar(10) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getCastToString(VectorizationContext.java:1543) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUDFBridgeVectorExpression(VectorizationContext.java:1379) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1177) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:440) at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateExprNodeDesc(Vectorizer.java:1293) at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateExprNodeDesc(Vectorizer.java:1284) at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateSelectOperator(Vectorizer.java:1116) at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateMapWorkOperator(Vectorizer.java:906) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-10931) Wrong columns selected on multiple joins
[ https://issues.apache.org/jira/browse/HIVE-10931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis resolved HIVE-10931.
Resolution: Cannot Reproduce

Feel free to reopen this if it happens again.

Wrong columns selected on multiple joins

Key: HIVE-10931
URL: https://issues.apache.org/jira/browse/HIVE-10931
Project: Hive
Issue Type: Bug
Affects Versions: 1.1.0
Environment: Cloudera cdh5.4.2
Reporter: Furcy Pin
Fix For: 1.2.1

The following set of queries:
{code:sql}
DROP TABLE IF EXISTS test1 ;
DROP TABLE IF EXISTS test2 ;
DROP TABLE IF EXISTS test3 ;
CREATE TABLE test1 (col1 INT, col2 STRING, col3 STRING, coL4 STRING, coL5 STRING, col6 STRING) ;
INSERT INTO TABLE test1 VALUES (1,NULL,NULL,NULL,NULL,'A') ;
CREATE TABLE test2 (col1 INT, col2 STRING, col3 STRING, coL4 STRING, coL5 STRING, col6 STRING) ;
INSERT INTO TABLE test2 VALUES (1,NULL,NULL,NULL,NULL,'X') ;
CREATE TABLE test3 (coL1 STRING) ;
INSERT INTO TABLE test3 VALUES ('A') ;
SELECT T2.val
FROM test1 T1
LEFT JOIN (SELECT col1, col2, col3, col4, col5, COALESCE(col6,'') as val FROM test2) T2 ON T2.col1 = T1.col1
LEFT JOIN test3 T3 ON T3.col1 = T1.col6 ;
{code}
will return this:
{noformat}
+---------+--+
| t2.val  |
+---------+--+
| A       |
+---------+--+
{noformat}
Obviously, this result is wrong, as table `test2` contains an X and no A. This is the most minimal example we found of this issue. In particular, selecting fewer than 6 columns works, for instance:
{code:sql}
SELECT T2.val
FROM test1 T1
LEFT JOIN (SELECT col1, col2, col3, col4, COALESCE(col6,'') as val FROM test2) T2 ON T2.col1 = T1.col1
LEFT JOIN test3 T3 ON T3.col1 = T1.col6 ;
{code}
(same query as before, but `col5` was removed from the select) will return:
{noformat}
+---------+--+
| t2.val  |
+---------+--+
| X       |
+---------+--+
{noformat}
Removing the `COALESCE` also removes the bug...

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11176) aused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct cannot be cast to [Ljava.lang.Object;
[ https://issues.apache.org/jira/browse/HIVE-11176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-11176:
Attachment: HIVE-11176.1.patch.txt

Trivial fix

aused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct cannot be cast to [Ljava.lang.Object;

Key: HIVE-11176
URL: https://issues.apache.org/jira/browse/HIVE-11176
Project: Hive
Issue Type: Bug
Components: Hive, Tez
Affects Versions: 1.0.0, 1.2.0
Environment: Hive 1.2 and Tez 0.7
Reporter: Soundararajan Velu
Priority: Critical
Attachments: HIVE-11176.1.patch.txt

Unreachable code: hive/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/StandardStructObjectInspector.java

  // With Data
  @Override
  @SuppressWarnings("unchecked")
  public Object getStructFieldData(Object data, StructField fieldRef) {
    if (data == null) {
      return null;
    }
    // We support both List<Object> and Object[]
    // so we have to do differently.
    boolean isArray = ! (data instanceof List);
    if (!isArray && !(data instanceof List)) {
      return data;
    }

* The if condition above translates to if (!true && true), so the code section cannot be reached. This causes a lot of class cast exceptions while using Tez or ORC file formats or a custom JSON SerDe. Strangely, this happens only while using Tez.

Changed the code to

    boolean isArray = data.getClass().isArray();
    if (!isArray && !(data instanceof List)) {
      return data;
    }

Even then, lazy structs get passed as fields, causing downstream cast exceptions like "lazystruct cannot be cast to Text" etc. So I changed the method to something like this:

  // With Data
  @Override
  @SuppressWarnings("unchecked")
  public Object getStructFieldData(Object data, StructField fieldRef) {
    if (data == null) {
      return null;
    }
    if (data instanceof LazyBinaryStruct) {
      data = ((LazyBinaryStruct) data).getFieldsAsList();
    }
    // We support both List<Object> and Object[]
    // so we have to do differently.
    boolean isArray = data.getClass().isArray();
    if (!isArray && !(data instanceof List)) {
      return data;
    }

This is causing ArrayIndexOutOfBounds exceptions and other typecast exceptions in object inspectors. Please help.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
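The reporter's point about the unreachable branch can be checked in isolation. The sketch below only reproduces the two guard expressions from the report with plain JDK types. StructDataCheck and its method names are hypothetical, and it does not touch Hive's StandardStructObjectInspector or LazyBinaryStruct.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; compares the two guards quoted in the report.
final class StructDataCheck {
    // Guard as quoted: isArray is derived from the List test, so the
    // condition is (isList && !isList) and can never be true.
    static boolean originalGuard(Object data) {
        boolean isArray = !(data instanceof List);
        return !isArray && !(data instanceof List);
    }

    // Guard proposed in the report: ask the runtime class whether it is an array.
    static boolean proposedGuard(Object data) {
        boolean isArray = data.getClass().isArray();
        return !isArray && !(data instanceof List);
    }

    public static void main(String[] args) {
        Object[] samples = { new Object[] {"a"}, Arrays.asList("a"), "plain" };
        for (Object sample : samples) {
            System.out.println(sample.getClass().getSimpleName()
                + " original=" + originalGuard(sample)
                + " proposed=" + proposedGuard(sample));
        }
    }
}
```

Only the proposed guard returns true for a value that is neither an array nor a List, which is the case the early return was evidently meant to handle.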
[jira] [Updated] (HIVE-11506) Casting varchar/char type to string cannot be vectorized
[ https://issues.apache.org/jira/browse/HIVE-11506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-11506: - Attachment: HIVE-11506.1.patch.txt Casting varchar/char type to string cannot be vectorized Key: HIVE-11506 URL: https://issues.apache.org/jira/browse/HIVE-11506 Project: Hive Issue Type: Improvement Components: Vectorization Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-11506.1.patch.txt It's not defined in vectorization context. {code} explain select cast(cast(cstring1 as varchar(10)) as string) x from alltypesorc order by x; {code} Mapper is not vectorized by exception, {noformat} 015-08-10 17:02:08,003 INFO [main]: physical.Vectorizer (Vectorizer.java:validateExprNodeDesc(1299)) - Failed to vectorize org.apache.hadoop.hive.ql.metadata.HiveException: Unhandled cast input type: varchar(10) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getCastToString(VectorizationContext.java:1543) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUDFBridgeVectorExpression(VectorizationContext.java:1379) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1177) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:440) at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateExprNodeDesc(Vectorizer.java:1293) at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateExprNodeDesc(Vectorizer.java:1284) at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateSelectOperator(Vectorizer.java:1116) at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateMapWorkOperator(Vectorizer.java:906) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11506) Casting varchar/char type to string cannot be vectorized
[ https://issues.apache.org/jira/browse/HIVE-11506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-11506: - Description: It's not defined in vectorization context. {code} explain select cast(cast(cstring1 as varchar(10)) as string) x from alltypesorc order by x; {code} Mapper is not vectorized by exception, {noformat} 015-08-10 17:02:08,003 INFO [main]: physical.Vectorizer (Vectorizer.java:validateExprNodeDesc(1299)) - Failed to vectorize org.apache.hadoop.hive.ql.metadata.HiveException: Unhandled cast input type: varchar(10) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getCastToString(VectorizationContext.java:1543) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUDFBridgeVectorExpression(VectorizationContext.java:1379) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1177) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:440) at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateExprNodeDesc(Vectorizer.java:1293) at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateExprNodeDesc(Vectorizer.java:1284) at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateSelectOperator(Vectorizer.java:1116) at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateMapWorkOperator(Vectorizer.java:906) {noformat} was: It's not defined in vectorization context. 
{code} explain select cast(cast(cstring1 as varchar(10)) as string) x from alltypesorc order by x; {code} Mapper {noformat} 015-08-10 17:02:08,003 INFO [main]: physical.Vectorizer (Vectorizer.java:validateExprNodeDesc(1299)) - Failed to vectorize org.apache.hadoop.hive.ql.metadata.HiveException: Unhandled cast input type: varchar(10) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getCastToString(VectorizationContext.java:1543) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUDFBridgeVectorExpression(VectorizationContext.java:1379) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1177) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:440) at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateExprNodeDesc(Vectorizer.java:1293) at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateExprNodeDesc(Vectorizer.java:1284) at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateSelectOperator(Vectorizer.java:1116) at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateMapWorkOperator(Vectorizer.java:906) {noformat} Casting varchar/char type to string cannot be vectorized Key: HIVE-11506 URL: https://issues.apache.org/jira/browse/HIVE-11506 Project: Hive Issue Type: Improvement Components: Vectorization Reporter: Navis Assignee: Navis Priority: Trivial It's not defined in vectorization context. 
{code} explain select cast(cast(cstring1 as varchar(10)) as string) x from alltypesorc order by x; {code} Mapper is not vectorized by exception, {noformat} 015-08-10 17:02:08,003 INFO [main]: physical.Vectorizer (Vectorizer.java:validateExprNodeDesc(1299)) - Failed to vectorize org.apache.hadoop.hive.ql.metadata.HiveException: Unhandled cast input type: varchar(10) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getCastToString(VectorizationContext.java:1543) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUDFBridgeVectorExpression(VectorizationContext.java:1379) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1177) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:440) at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateExprNodeDesc(Vectorizer.java:1293) at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateExprNodeDesc(Vectorizer.java:1284) at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateSelectOperator(Vectorizer.java:1116) at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateMapWorkOperator(Vectorizer.java:906) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11515) Still some possible race condition in DynamicPartitionPruner
[ https://issues.apache.org/jira/browse/HIVE-11515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710436#comment-14710436 ] Navis commented on HIVE-11515: -- [~sseth] If it's already fixed, seemed not need to commit this. Thanks! Still some possible race condition in DynamicPartitionPruner Key: HIVE-11515 URL: https://issues.apache.org/jira/browse/HIVE-11515 Project: Hive Issue Type: Bug Components: Query Processor, Tez Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-11515.1.patch.txt Even after HIVE-9976, I could see race condition in DPP sometimes. Hard to reproduce but it seemed related to the fact that prune() is called by thread-pool. With some delay in queue, events from fast tasks are arrived before prune() is called. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7575) GetTables thrift call is very slow
[ https://issues.apache.org/jira/browse/HIVE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-7575:
Attachment: HIVE-7575.2.patch.txt

> GetTables thrift call is very slow
>
> Key: HIVE-7575
> URL: https://issues.apache.org/jira/browse/HIVE-7575
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Affects Versions: 0.12.0, 0.13.0
> Reporter: Ashu Pachauri
> Assignee: Navis
> Attachments: HIVE-7575.1.patch.txt, HIVE-7575.2.patch.txt
>
> The GetTables thrift call takes a long time when the number of tables is large.
> With around 5000 tables, the call takes around 80 seconds compared to a "Show
> Tables" query on the same HiveServer2 instance which takes 3-7 seconds.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7575) GetTables thrift call is very slow
[ https://issues.apache.org/jira/browse/HIVE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14971197#comment-14971197 ]

Navis commented on HIVE-7575:

getTables() is the first call from most BI tools, but it takes a very long time with 100+ databases. I think it's worth making a dedicated metastore API for this.

> GetTables thrift call is very slow
>
> Key: HIVE-7575
> URL: https://issues.apache.org/jira/browse/HIVE-7575
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Affects Versions: 0.12.0, 0.13.0
> Reporter: Ashu Pachauri
> Assignee: Navis
> Attachments: HIVE-7575.1.patch.txt
>
> The GetTables thrift call takes a long time when the number of tables is large.
> With around 5000 tables, the call takes around 80 seconds compared to a "Show
> Tables" query on the same HiveServer2 instance which takes 3-7 seconds.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7575) GetTables thrift call is very slow
[ https://issues.apache.org/jira/browse/HIVE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-7575:
Attachment: HIVE-7575.1.patch.txt

> GetTables thrift call is very slow
>
> Key: HIVE-7575
> URL: https://issues.apache.org/jira/browse/HIVE-7575
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Affects Versions: 0.12.0, 0.13.0
> Reporter: Ashu Pachauri
> Attachments: HIVE-7575.1.patch.txt
>
> The GetTables thrift call takes a long time when the number of tables is large.
> With around 5000 tables, the call takes around 80 seconds compared to a "Show
> Tables" query on the same HiveServer2 instance which takes 3-7 seconds.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7575) GetTables thrift call is very slow
[ https://issues.apache.org/jira/browse/HIVE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-7575:
Attachment: HIVE-7575.3.patch.txt

Fixed test failures

> GetTables thrift call is very slow
>
> Key: HIVE-7575
> URL: https://issues.apache.org/jira/browse/HIVE-7575
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Affects Versions: 0.12.0, 0.13.0
> Reporter: Ashu Pachauri
> Assignee: Navis
> Attachments: HIVE-7575.1.patch.txt, HIVE-7575.2.patch.txt, HIVE-7575.3.patch.txt
>
> The GetTables thrift call takes a long time when the number of tables is large.
> With around 5000 tables, the call takes around 80 seconds compared to a "Show
> Tables" query on the same HiveServer2 instance which takes 3-7 seconds.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7575) GetTables thrift call is very slow
[ https://issues.apache.org/jira/browse/HIVE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977451#comment-14977451 ] Navis commented on HIVE-7575: - [~ychena] Sorry for my unclear description. Retrieving all the Table instances from the metastore is the root cause of this, not the number of databases. In my case, I have 3000+ tables in 100+ databases. [~szehon] At first I thought "table_types" was also a pattern like the other params, but it is not. Would String[] or List be better? [~aihuaxu] There are already some test cases (TestJdbcDriver2, for example), but I think I can add some more. Thanks. > GetTables thrift call is very slow > -- > > Key: HIVE-7575 > URL: https://issues.apache.org/jira/browse/HIVE-7575 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 0.12.0, 0.13.0 >Reporter: Ashu Pachauri >Assignee: Navis > Attachments: HIVE-7575.1.patch.txt, HIVE-7575.2.patch.txt, > HIVE-7575.3.patch.txt > > > The GetTables thrift call takes a long time when the number of table is large. > With around 5000 tables, the call takes around 80 seconds compared to a "Show > Tables" query on the same HiveServer2 instance which takes 3-7 seconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7575) GetTables thrift call is very slow
[ https://issues.apache.org/jira/browse/HIVE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-7575: Attachment: HIVE-7575.4.patch.txt Addressed comments. Let's see test result. > GetTables thrift call is very slow > -- > > Key: HIVE-7575 > URL: https://issues.apache.org/jira/browse/HIVE-7575 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 0.12.0, 0.13.0 >Reporter: Ashu Pachauri >Assignee: Navis > Attachments: HIVE-7575.1.patch.txt, HIVE-7575.2.patch.txt, > HIVE-7575.3.patch.txt, HIVE-7575.4.patch.txt > > > The GetTables thrift call takes a long time when the number of table is large. > With around 5000 tables, the call takes around 80 seconds compared to a "Show > Tables" query on the same HiveServer2 instance which takes 3-7 seconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11702) GetSchemas thrift call is slow on scale of 1000+ databases
[ https://issues.apache.org/jira/browse/HIVE-11702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977861#comment-14977861 ] Navis commented on HIVE-11702: -- [~erickt] I added a short path for getSchemas(null) in the most recent patch for HIVE-7575. I have not confirmed the effect yet. > GetSchemas thrift call is slow on scale of 1000+ databases > -- > > Key: HIVE-11702 > URL: https://issues.apache.org/jira/browse/HIVE-11702 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.1.1 >Reporter: Jenny Kim > Attachments: HIVE-11702.1.patch.txt > > > Similar to https://issues.apache.org/jira/browse/HIVE-7575 GetSchemas also > starts to degrade in latency starting at the order of 1000+ databases, which > returned in about 30 seconds. > However, SHOW DATABASES on the same Hive instance returns within a few > seconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7575) GetTables thrift call is very slow
[ https://issues.apache.org/jira/browse/HIVE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-7575: Attachment: HIVE-7575.5.patch.txt Added short path for getSchemas(null). see HIVE-11702 > GetTables thrift call is very slow > -- > > Key: HIVE-7575 > URL: https://issues.apache.org/jira/browse/HIVE-7575 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 0.12.0, 0.13.0 >Reporter: Ashu Pachauri >Assignee: Navis > Attachments: HIVE-7575.1.patch.txt, HIVE-7575.2.patch.txt, > HIVE-7575.3.patch.txt, HIVE-7575.4.patch.txt, HIVE-7575.5.patch.txt > > > The GetTables thrift call takes a long time when the number of table is large. > With around 5000 tables, the call takes around 80 seconds compared to a "Show > Tables" query on the same HiveServer2 instance which takes 3-7 seconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11756) Avoid redundant key serialization in RS for distinct query
[ https://issues.apache.org/jira/browse/HIVE-11756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977866#comment-14977866 ] Navis commented on HIVE-11756: -- I cannot reproduce the index_bitmap_auto failure. The other failures seem unrelated. > Avoid redundant key serialization in RS for distinct query > -- > > Key: HIVE-11756 > URL: https://issues.apache.org/jira/browse/HIVE-11756 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-11756.1.patch.txt, HIVE-11756.2.patch.txt, > HIVE-11756.3.patch.txt, HIVE-11756.4.patch.txt > > > Currently Hive serializes the key twice to learn the length of the distribution key for > distinct queries. This patch introduces IndexedSerializer to avoid that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
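The idea behind HIVE-11756 (serialize once and remember where the distribution key ends) can be sketched roughly as follows. This is only an illustration under assumptions: the class, the `Result` type, and the zero-byte separator are hypothetical and are not taken from the IndexedSerializer patch.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class IndexedSerializeSketch {
    /** Serialized key bytes plus the length of the leading distribution-key part. */
    static final class Result {
        final byte[] bytes;
        final int distKeyLength;

        Result(byte[] bytes, int distKeyLength) {
            this.bytes = bytes;
            this.distKeyLength = distKeyLength;
        }
    }

    /**
     * Serializes all key columns in a single pass and records the buffer size
     * right after the last distribution-key column, so no second pass is needed
     * just to learn that length. The 0 byte is a made-up column separator.
     */
    static Result serialize(String[] columns, int numDistKeyColumns) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int distKeyLength = 0;
        for (int i = 0; i < columns.length; i++) {
            byte[] b = columns[i].getBytes(StandardCharsets.UTF_8);
            out.write(b, 0, b.length);
            out.write(0);
            if (i + 1 == numDistKeyColumns) {
                distKeyLength = out.size(); // offset where the distribution key ends
            }
        }
        return new Result(out.toByteArray(), distKeyLength);
    }

    public static void main(String[] args) {
        Result r = serialize(new String[] {"ab", "c", "xyz"}, 2);
        System.out.println(r.bytes.length + " bytes, distribution key = " + r.distKeyLength);
    }
}
```

The point of the design is that the offset is a by-product of the single write pass; whether the real patch stores offsets per column or only for the distribution key is not stated in this thread.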
[jira] [Updated] (HIVE-7575) GetTables thrift call is very slow
[ https://issues.apache.org/jira/browse/HIVE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-7575: Attachment: HIVE-7575.4.patch.txt > GetTables thrift call is very slow > -- > > Key: HIVE-7575 > URL: https://issues.apache.org/jira/browse/HIVE-7575 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 0.12.0, 0.13.0 >Reporter: Ashu Pachauri >Assignee: Navis > Attachments: HIVE-7575.1.patch.txt, HIVE-7575.2.patch.txt, > HIVE-7575.3.patch.txt, HIVE-7575.4.patch.txt > > > The GetTables thrift call takes a long time when the number of table is large. > With around 5000 tables, the call takes around 80 seconds compared to a "Show > Tables" query on the same HiveServer2 instance which takes 3-7 seconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7575) GetTables thrift call is very slow
[ https://issues.apache.org/jira/browse/HIVE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-7575: Attachment: (was: HIVE-7575.4.patch.txt) > GetTables thrift call is very slow > -- > > Key: HIVE-7575 > URL: https://issues.apache.org/jira/browse/HIVE-7575 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 0.12.0, 0.13.0 >Reporter: Ashu Pachauri >Assignee: Navis > Attachments: HIVE-7575.1.patch.txt, HIVE-7575.2.patch.txt, > HIVE-7575.3.patch.txt, HIVE-7575.4.patch.txt > > > The GetTables thrift call takes a long time when the number of table is large. > With around 5000 tables, the call takes around 80 seconds compared to a "Show > Tables" query on the same HiveServer2 instance which takes 3-7 seconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7575) GetTables thrift call is very slow
[ https://issues.apache.org/jira/browse/HIVE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-7575: Attachment: HIVE-7575.6.patch.txt Rebased to trunk & addressed comment > GetTables thrift call is very slow > -- > > Key: HIVE-7575 > URL: https://issues.apache.org/jira/browse/HIVE-7575 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 0.12.0, 0.13.0 >Reporter: Ashu Pachauri >Assignee: Navis > Attachments: HIVE-7575.1.patch.txt, HIVE-7575.2.patch.txt, > HIVE-7575.3.patch.txt, HIVE-7575.4.patch.txt, HIVE-7575.5.patch.txt, > HIVE-7575.6.patch.txt > > > The GetTables thrift call takes a long time when the number of table is large. > With around 5000 tables, the call takes around 80 seconds compared to a "Show > Tables" query on the same HiveServer2 instance which takes 3-7 seconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7575) GetTables thrift call is very slow
[ https://issues.apache.org/jira/browse/HIVE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979848#comment-14979848 ] Navis commented on HIVE-7575: - [~aihuaxu] I wanted the method signature to be simple, but so be it. (I used TableMetaData by mistake; I'll change it to TableMeta in the next patch.) bq. To Yongzhi's question: when we have many databases, the performance of the original getTables could be bad since we are making at least one trip for each database. Is that right? That would be one of the root causes. But judging from HIVE-11702, the call is still slow even though getSchema(null) makes just one call to the metastore. The pattern-matching query seems much more expensive than expected (even with a simple * pattern). > GetTables thrift call is very slow > -- > > Key: HIVE-7575 > URL: https://issues.apache.org/jira/browse/HIVE-7575 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 0.12.0, 0.13.0 >Reporter: Ashu Pachauri >Assignee: Navis > Attachments: HIVE-7575.1.patch.txt, HIVE-7575.2.patch.txt, > HIVE-7575.3.patch.txt, HIVE-7575.4.patch.txt, HIVE-7575.5.patch.txt, > HIVE-7575.6.patch.txt > > > The GetTables thrift call takes a long time when the number of table is large. > With around 5000 tables, the call takes around 80 seconds compared to a "Show > Tables" query on the same HiveServer2 instance which takes 3-7 seconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
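To make the "one trip for each database" point in the HIVE-7575 discussion concrete, here is a small sketch that only counts round trips. Everything in it is hypothetical: `FakeMetastore` is an in-memory stand-in, and `getAllTableNames` merely plays the role of a dedicated bulk API; neither is the Hive metastore client interface. As the comment above notes, trip count is only one of the suspected causes, so this says nothing about the cost of pattern matching.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class GetTablesSketch {
    /** Hypothetical in-memory stand-in for a metastore; counts round trips. */
    static final class FakeMetastore {
        private final Map<String, List<String>> tablesByDb = new TreeMap<>();
        int calls;

        void add(String db, String table) {
            tablesByDb.computeIfAbsent(db, k -> new ArrayList<>()).add(table);
        }

        List<String> getDatabases() {
            calls++;
            return new ArrayList<>(tablesByDb.keySet());
        }

        List<String> getTables(String db) {
            calls++;
            return new ArrayList<>(tablesByDb.get(db));
        }

        /** Hypothetical bulk call: every "db.table" name in one round trip. */
        List<String> getAllTableNames() {
            calls++;
            List<String> names = new ArrayList<>();
            tablesByDb.forEach((db, tables) -> tables.forEach(t -> names.add(db + "." + t)));
            return names;
        }
    }

    static FakeMetastore sample(int databases, int tablesPerDb) {
        FakeMetastore ms = new FakeMetastore();
        for (int d = 0; d < databases; d++) {
            for (int t = 0; t < tablesPerDb; t++) {
                ms.add("db" + d, "t" + t);
            }
        }
        return ms;
    }

    /** One call to list databases, then one call per database. */
    static int callsPerDatabase(FakeMetastore ms) {
        int before = ms.calls;
        List<String> names = new ArrayList<>();
        for (String db : ms.getDatabases()) {
            for (String t : ms.getTables(db)) {
                names.add(db + "." + t);
            }
        }
        return ms.calls - before;
    }

    /** A single call, regardless of the number of databases. */
    static int callsBulk(FakeMetastore ms) {
        int before = ms.calls;
        ms.getAllTableNames();
        return ms.calls - before;
    }

    public static void main(String[] args) {
        FakeMetastore ms = sample(100, 30);
        System.out.println("per-database: " + callsPerDatabase(ms) + " calls");
        System.out.println("bulk:         " + callsBulk(ms) + " calls");
    }
}
```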
[jira] [Updated] (HIVE-12388) GetTables cannot get external tables when TABLE type argument is given
[ https://issues.apache.org/jira/browse/HIVE-12388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-12388: - Attachment: HIVE-12388.1.patch.txt > GetTables cannot get external tables when TABLE type argument is given > -- > > Key: HIVE-12388 > URL: https://issues.apache.org/jira/browse/HIVE-12388 > Project: Hive > Issue Type: Bug > Components: JDBC >Reporter: Navis >Assignee: Navis >Priority: Critical > Attachments: HIVE-12388.1.patch.txt > > > Because of a regression introduced by HIVE-7575, external tables are not shown when the "TABLE" type > is specified as an argument. I'm working on this. Sorry. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12373) Interner should return identical map or list
[ https://issues.apache.org/jira/browse/HIVE-12373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-12373: - Attachment: HIVE-12373.1.patch.txt > Interner should return identical map or list > > > Key: HIVE-12373 > URL: https://issues.apache.org/jira/browse/HIVE-12373 > Project: Hive > Issue Type: Bug > Components: Query Processor >Reporter: Navis >Assignee: Navis >Priority: Minor > Attachments: HIVE-12373.1.patch.txt > > > Currently, HiveStringUtils.intern(map/list) returns a new instance of the map or > list. That breaks code written in the style below (this is > Spark code in HiveMetastoreCatalog): > {code} > val serdeParameters = new java.util.HashMap[String, String]() > serdeInfo.setParameters(serdeParameters) > // these properties will be gone > table.serdeProperties.foreach { case (k, v) => serdeParameters.put(k, v) } > p.storage.serdeProperties.foreach { case (k, v) => serdeParameters.put(k, v) } > {code} > Luckily for Spark, the interner was, by mistake, not applied in the released versions of Hive > (1.2.0, 1.2.1). But it will cause problems someday. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
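The failure mode described in HIVE-12373 can be reproduced without Hive or Spark. The sketch below is an assumption-laden illustration: `internCopy`, `internInPlace`, and `Holder` are hypothetical names standing in for the interner and for a setter such as `setParameters`; they are not the HiveStringUtils code or the attached patch.

```java
import java.util.HashMap;
import java.util.Map;

final class InternDemo {
    /** Hypothetical copying interner: returns a NEW map with interned strings. */
    static Map<String, String> internCopy(Map<String, String> map) {
        Map<String, String> copy = new HashMap<>();
        for (Map.Entry<String, String> e : map.entrySet()) {
            copy.put(e.getKey().intern(), e.getValue().intern());
        }
        return copy;
    }

    /** Hypothetical in-place interner: rewrites and returns the SAME map instance. */
    static Map<String, String> internInPlace(Map<String, String> map) {
        Map<String, String> snapshot = new HashMap<>(map); // iterate a copy while rewriting
        map.clear();
        for (Map.Entry<String, String> e : snapshot.entrySet()) {
            map.put(e.getKey().intern(), e.getValue().intern());
        }
        return map;
    }

    /** Stand-in for an object whose setter interns its argument. */
    static final class Holder {
        private final boolean copying;
        private Map<String, String> parameters;

        Holder(boolean copying) {
            this.copying = copying;
        }

        void setParameters(Map<String, String> p) {
            parameters = copying ? internCopy(p) : internInPlace(p);
        }

        Map<String, String> getParameters() {
            return parameters;
        }
    }

    /** Mimics the caller pattern: pass an empty map to the setter, fill it afterwards. */
    static int visibleEntries(boolean copying) {
        Holder holder = new Holder(copying);
        Map<String, String> params = new HashMap<>();
        holder.setParameters(params);
        params.put("serialization.format", "1"); // added after the setter call
        return holder.getParameters().size();
    }

    public static void main(String[] args) {
        System.out.println("copying interner:  " + visibleEntries(true));
        System.out.println("in-place interner: " + visibleEntries(false));
    }
}
```

With the copying variant the holder keeps a private copy, so entries added later through the caller's reference are lost; with the in-place variant both sides share one instance.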
[jira] [Updated] (HIVE-7575) GetTables thrift call is very slow
[ https://issues.apache.org/jira/browse/HIVE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-7575: Attachment: HIVE-7575.7.patch.txt Rebased to trunk > GetTables thrift call is very slow > -- > > Key: HIVE-7575 > URL: https://issues.apache.org/jira/browse/HIVE-7575 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 0.12.0, 0.13.0 >Reporter: Ashu Pachauri >Assignee: Navis > Attachments: HIVE-7575.1.patch.txt, HIVE-7575.2.patch.txt, > HIVE-7575.3.patch.txt, HIVE-7575.4.patch.txt, HIVE-7575.5.patch.txt, > HIVE-7575.6.patch.txt, HIVE-7575.7.patch.txt > > > The GetTables thrift call takes a long time when the number of table is large. > With around 5000 tables, the call takes around 80 seconds compared to a "Show > Tables" query on the same HiveServer2 instance which takes 3-7 seconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12182) ALTER TABLE PARTITION COLUMN does not set partition column comments
[ https://issues.apache.org/jira/browse/HIVE-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-12182: - Attachment: HIVE-12182.1.patch.txt > ALTER TABLE PARTITION COLUMN does not set partition column comments > --- > > Key: HIVE-12182 > URL: https://issues.apache.org/jira/browse/HIVE-12182 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 1.2.1 >Reporter: Lenni Kuff >Assignee: Navis > Attachments: HIVE-12182.1.patch.txt > > > ALTER TABLE PARTITION COLUMN does not set partition column comments. The > syntax is accepted, but the COMMENT for the column is ignored. > {code} > 0: jdbc:hive2://localhost:1/default> create table part_test(i int comment > 'HELLO') partitioned by (j int comment 'WORLD'); > No rows affected (0.104 seconds) > 0: jdbc:hive2://localhost:1/default> describe part_test; > +--+---+---+--+ > | col_name | data_type |comment| > +--+---+---+--+ > | i| int | HELLO | > | j| int | WORLD | > | | NULL | NULL | > | # Partition Information | NULL | NULL | > | # col_name | data_type | comment | > | | NULL | NULL | > | j| int | WORLD | > +--+---+---+--+ > 7 rows selected (0.109 seconds) > 0: jdbc:hive2://localhost:1/default> alter table part_test partition > column (j int comment 'WIDE'); > No rows affected (0.121 seconds) > 0: jdbc:hive2://localhost:1/default> describe part_test; > +--+---+---+--+ > | col_name | data_type |comment| > +--+---+---+--+ > | i| int | HELLO | > | j| int | | > | | NULL | NULL | > | # Partition Information | NULL | NULL | > | # col_name | data_type | comment | > | | NULL | NULL | > | j| int | | > +--+---+---+--+ > 7 rows selected (0.108 seconds) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-12182) ALTER TABLE PARTITION COLUMN does not set partition column comments
[ https://issues.apache.org/jira/browse/HIVE-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis reassigned HIVE-12182: Assignee: Navis (was: Naveen Gangam) > ALTER TABLE PARTITION COLUMN does not set partition column comments > --- > > Key: HIVE-12182 > URL: https://issues.apache.org/jira/browse/HIVE-12182 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 1.2.1 >Reporter: Lenni Kuff >Assignee: Navis > Attachments: HIVE-12182.1.patch.txt > > > ALTER TABLE PARTITION COLUMN does not set partition column comments. The > syntax is accepted, but the COMMENT for the column is ignored. > {code} > 0: jdbc:hive2://localhost:1/default> create table part_test(i int comment > 'HELLO') partitioned by (j int comment 'WORLD'); > No rows affected (0.104 seconds) > 0: jdbc:hive2://localhost:1/default> describe part_test; > +--+---+---+--+ > | col_name | data_type |comment| > +--+---+---+--+ > | i| int | HELLO | > | j| int | WORLD | > | | NULL | NULL | > | # Partition Information | NULL | NULL | > | # col_name | data_type | comment | > | | NULL | NULL | > | j| int | WORLD | > +--+---+---+--+ > 7 rows selected (0.109 seconds) > 0: jdbc:hive2://localhost:1/default> alter table part_test partition > column (j int comment 'WIDE'); > No rows affected (0.121 seconds) > 0: jdbc:hive2://localhost:1/default> describe part_test; > +--+---+---+--+ > | col_name | data_type |comment| > +--+---+---+--+ > | i| int | HELLO | > | j| int | | > | | NULL | NULL | > | # Partition Information | NULL | NULL | > | # col_name | data_type | comment | > | | NULL | NULL | > | j| int | | > +--+---+---+--+ > 7 rows selected (0.108 seconds) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12182) ALTER TABLE PARTITION COLUMN does not set partition column comments
[ https://issues.apache.org/jira/browse/HIVE-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-12182: - Assignee: Naveen Gangam (was: Navis) > ALTER TABLE PARTITION COLUMN does not set partition column comments > --- > > Key: HIVE-12182 > URL: https://issues.apache.org/jira/browse/HIVE-12182 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 1.2.1 >Reporter: Lenni Kuff >Assignee: Naveen Gangam > > ALTER TABLE PARTITION COLUMN does not set partition column comments. The > syntax is accepted, but the COMMENT for the column is ignored. > {code} > 0: jdbc:hive2://localhost:1/default> create table part_test(i int comment > 'HELLO') partitioned by (j int comment 'WORLD'); > No rows affected (0.104 seconds) > 0: jdbc:hive2://localhost:1/default> describe part_test; > +--+---+---+--+ > | col_name | data_type |comment| > +--+---+---+--+ > | i| int | HELLO | > | j| int | WORLD | > | | NULL | NULL | > | # Partition Information | NULL | NULL | > | # col_name | data_type | comment | > | | NULL | NULL | > | j| int | WORLD | > +--+---+---+--+ > 7 rows selected (0.109 seconds) > 0: jdbc:hive2://localhost:1/default> alter table part_test partition > column (j int comment 'WIDE'); > No rows affected (0.121 seconds) > 0: jdbc:hive2://localhost:1/default> describe part_test; > +--+---+---+--+ > | col_name | data_type |comment| > +--+---+---+--+ > | i| int | HELLO | > | j| int | | > | | NULL | NULL | > | # Partition Information | NULL | NULL | > | # col_name | data_type | comment | > | | NULL | NULL | > | j| int | | > +--+---+---+--+ > 7 rows selected (0.108 seconds) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12182) ALTER TABLE PARTITION COLUMN does not set partition column comments
[ https://issues.apache.org/jira/browse/HIVE-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-12182: - Attachment: (was: HIVE-12182.1.patch.txt) > ALTER TABLE PARTITION COLUMN does not set partition column comments > --- > > Key: HIVE-12182 > URL: https://issues.apache.org/jira/browse/HIVE-12182 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 1.2.1 >Reporter: Lenni Kuff >Assignee: Naveen Gangam > > ALTER TABLE PARTITION COLUMN does not set partition column comments. The > syntax is accepted, but the COMMENT for the column is ignored. > {code} > 0: jdbc:hive2://localhost:1/default> create table part_test(i int comment > 'HELLO') partitioned by (j int comment 'WORLD'); > No rows affected (0.104 seconds) > 0: jdbc:hive2://localhost:1/default> describe part_test; > +--+---+---+--+ > | col_name | data_type |comment| > +--+---+---+--+ > | i| int | HELLO | > | j| int | WORLD | > | | NULL | NULL | > | # Partition Information | NULL | NULL | > | # col_name | data_type | comment | > | | NULL | NULL | > | j| int | WORLD | > +--+---+---+--+ > 7 rows selected (0.109 seconds) > 0: jdbc:hive2://localhost:1/default> alter table part_test partition > column (j int comment 'WIDE'); > No rows affected (0.121 seconds) > 0: jdbc:hive2://localhost:1/default> describe part_test; > +--+---+---+--+ > | col_name | data_type |comment| > +--+---+---+--+ > | i| int | HELLO | > | j| int | | > | | NULL | NULL | > | # Partition Information | NULL | NULL | > | # col_name | data_type | comment | > | | NULL | NULL | > | j| int | | > +--+---+---+--+ > 7 rows selected (0.108 seconds) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11768) java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances
[ https://issues.apache.org/jira/browse/HIVE-11768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-11768: - Attachment: HIVE-11768.6.patch.txt > java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances > > > Key: HIVE-11768 > URL: https://issues.apache.org/jira/browse/HIVE-11768 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.2.1 >Reporter: Nemon Lou >Assignee: Navis >Priority: Minor > Fix For: 2.0.0 > > Attachments: HIVE-11768.1.patch.txt, HIVE-11768.2.patch.txt, > HIVE-11768.3.patch.txt, HIVE-11768.4.patch.txt, HIVE-11768.5.patch.txt, > HIVE-11768.6.patch.txt > > > More than 490,000 paths was added to java.io.DeleteOnExitHook on one of our > long running HiveServer2 instances,taken up more than 100MB on heap. > Most of the paths contains a suffix of ".pipeout". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
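For context on HIVE-11768: as far as I know, `java.io.File.deleteOnExit()` records each path in a JVM-wide set that is only drained at shutdown, so a long-running server that registers a path per session keeps accumulating entries. One common alternative, sketched below, is to track temporary files per session and delete them when the session closes. The class name and the `session-` prefix are hypothetical, and this is not the approach confirmed by the attached patches.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical per-session registry of temporary files. */
final class SessionTempFiles implements AutoCloseable {
    private final List<Path> files = new ArrayList<>();

    /** Creates a temp file and remembers it for cleanup at session close. */
    synchronized Path create(String suffix) throws IOException {
        Path p = Files.createTempFile("session-", suffix);
        files.add(p);
        return p;
    }

    synchronized int tracked() {
        return files.size();
    }

    /** Deletes every tracked file and forgets it, so nothing outlives the session. */
    @Override
    public synchronized void close() throws IOException {
        for (Path p : files) {
            Files.deleteIfExists(p);
        }
        files.clear();
    }

    /** Returns true if a file exists during the session and is gone afterwards. */
    static boolean demo() {
        try {
            Path p;
            try (SessionTempFiles session = new SessionTempFiles()) {
                p = session.create(".pipeout");
                if (!Files.exists(p) || session.tracked() != 1) {
                    return false;
                }
            }
            return !Files.exists(p);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("cleaned up on close: " + demo());
    }
}
```

The trade-off is that files survive a crash, since nothing is registered with the JVM shutdown hook; whether that matters for `.pipeout` files is not discussed in this thread.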
[jira] [Updated] (HIVE-11518) Provide interface to adjust required resource for tez tasks
[ https://issues.apache.org/jira/browse/HIVE-11518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-11518: - Attachment: HIVE-11518.2.patch.txt Rebased to trunk > Provide interface to adjust required resource for tez tasks > --- > > Key: HIVE-11518 > URL: https://issues.apache.org/jira/browse/HIVE-11518 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Reporter: Navis >Assignee: Navis >Priority: Minor > Attachments: HIVE-11518.1.patch.txt, HIVE-11518.2.patch.txt > > > Resource requirements vary from task to task, but currently they are fixed to > one value (via hive.tez.container.size). It would be good to tailor > resource requirements to the expected work. > The suggested interface is quite simple. > {code} > public interface ResourceCalculator { > Resource adjust(Resource resource, MapWork mapWork); > Resource adjust(Resource resource, ReduceWork reduceWork); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
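A possible implementation of the interface suggested in HIVE-11518 might look like the sketch below. To keep it self-contained, `Resource`, `MapWork`, and `ReduceWork` are tiny hypothetical stand-ins with made-up fields (`inputBytes`, `hasJoin`), not the YARN or Hive classes, and the thresholds are arbitrary; only the two `adjust` signatures come from the issue.

```java
final class ResourceCalculatorSketch {
    // Hypothetical stand-ins; the real types carry far more information.
    static final class Resource {
        final int memoryMb;
        final int vcores;

        Resource(int memoryMb, int vcores) {
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }
    }

    static final class MapWork {
        final long inputBytes;

        MapWork(long inputBytes) {
            this.inputBytes = inputBytes;
        }
    }

    static final class ReduceWork {
        final boolean hasJoin;

        ReduceWork(boolean hasJoin) {
            this.hasJoin = hasJoin;
        }
    }

    /** The interface as quoted in the issue description. */
    interface ResourceCalculator {
        Resource adjust(Resource resource, MapWork mapWork);

        Resource adjust(Resource resource, ReduceWork reduceWork);
    }

    /** Example policy: shrink small map tasks, grow reducers that perform joins. */
    static final class SizeBasedCalculator implements ResourceCalculator {
        private static final long SMALL_INPUT_BYTES = 64L * 1024 * 1024;
        private static final int MIN_MEMORY_MB = 512;

        @Override
        public Resource adjust(Resource resource, MapWork mapWork) {
            if (mapWork.inputBytes < SMALL_INPUT_BYTES) {
                return new Resource(Math.max(MIN_MEMORY_MB, resource.memoryMb / 2), resource.vcores);
            }
            return resource;
        }

        @Override
        public Resource adjust(Resource resource, ReduceWork reduceWork) {
            return reduceWork.hasJoin
                    ? new Resource(resource.memoryMb * 2, resource.vcores)
                    : resource;
        }
    }

    static int mapMemory(int baseMb, long inputBytes) {
        return new SizeBasedCalculator().adjust(new Resource(baseMb, 1), new MapWork(inputBytes)).memoryMb;
    }

    static int reduceMemory(int baseMb, boolean hasJoin) {
        return new SizeBasedCalculator().adjust(new Resource(baseMb, 1), new ReduceWork(hasJoin)).memoryMb;
    }

    public static void main(String[] args) {
        System.out.println("small map task: " + mapMemory(4096, 1024) + " MB");
        System.out.println("join reducer:   " + reduceMemory(4096, true) + " MB");
    }
}
```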
[jira] [Updated] (HIVE-11768) java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances
[ https://issues.apache.org/jira/browse/HIVE-11768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-11768: - Attachment: HIVE-11768.4.patch.txt > java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances > > > Key: HIVE-11768 > URL: https://issues.apache.org/jira/browse/HIVE-11768 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.2.1 >Reporter: Nemon Lou >Assignee: Navis >Priority: Minor > Fix For: 2.0.0 > > Attachments: HIVE-11768.1.patch.txt, HIVE-11768.2.patch.txt, > HIVE-11768.3.patch.txt, HIVE-11768.4.patch.txt > > > More than 490,000 paths was added to java.io.DeleteOnExitHook on one of our > long running HiveServer2 instances,taken up more than 100MB on heap. > Most of the paths contains a suffix of ".pipeout". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11679) SemanticAnalysis of "a=1" can result in a new Configuration() object
[ https://issues.apache.org/jira/browse/HIVE-11679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14954210#comment-14954210 ] Navis commented on HIVE-11679: -- [~ashutoshc] I created an RB entry, but the testNegativeCliDriver_compare_*_bigint failures seem related. I'll look into this. > SemanticAnalysis of "a=1" can result in a new Configuration() object > > > Key: HIVE-11679 > URL: https://issues.apache.org/jira/browse/HIVE-11679 > Project: Hive > Issue Type: Bug > Components: Query Planning >Affects Versions: 1.3.0, 2.0.0 >Reporter: Gopal V >Assignee: Navis > Attachments: HIVE-11679.1.patch.txt, HIVE-11679.2.patch.txt > > > {code} > public static ExprNodeGenericFuncDesc newInstance(GenericUDF genericUDF, > String funcText, > List children) throws UDFArgumentException { > ... > if (genericUDF instanceof GenericUDFBaseCompare && children.size() == 2) { > TypeInfo oiTypeInfo0 = children.get(0).getTypeInfo(); > TypeInfo oiTypeInfo1 = children.get(1).getTypeInfo(); > SessionState ss = SessionState.get(); > Configuration conf = (ss != null) ? ss.getConf() : new Configuration(); > {code} > This is both a SessionState.get() which is a threadlocal lookup or worse, a > new Configuration() which means XML parsing of multiple files for each > equality expression in the query. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
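The cost pattern reported in HIVE-11679 (an expensive fallback object built once per expression) can be illustrated in isolation. In the sketch below `ExpensiveConf` is a hypothetical stand-in that only counts constructions; it is not Hadoop's `Configuration`, and the lazily created shared fallback is one possible remedy, not necessarily what the attached patches do. Sharing a fallback like this assumes callers treat it as read-only.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class ConfFallbackSketch {
    /** Hypothetical stand-in for a configuration object that is costly to build. */
    static final class ExpensiveConf {
        static final AtomicInteger CONSTRUCTED = new AtomicInteger();

        ExpensiveConf() {
            CONSTRUCTED.incrementAndGet();
        }
    }

    /** Initialization-on-demand holder: the fallback is built once, on first use. */
    private static final class DefaultHolder {
        static final ExpensiveConf INSTANCE = new ExpensiveConf();
    }

    /** Per-call fallback, shaped like the quoted snippet. */
    static ExpensiveConf perCall(ExpensiveConf sessionConf) {
        return sessionConf != null ? sessionConf : new ExpensiveConf();
    }

    /** Shared fallback: reuses one lazily created instance. */
    static ExpensiveConf shared(ExpensiveConf sessionConf) {
        return sessionConf != null ? sessionConf : DefaultHolder.INSTANCE;
    }

    /** Counts how many objects are built while resolving a conf for N expressions. */
    static int constructionsFor(int expressions, boolean share) {
        int before = ExpensiveConf.CONSTRUCTED.get();
        for (int i = 0; i < expressions; i++) {
            ExpensiveConf conf = share ? shared(null) : perCall(null);
            if (conf == null) {
                throw new IllegalStateException("no configuration resolved");
            }
        }
        return ExpensiveConf.CONSTRUCTED.get() - before;
    }

    public static void main(String[] args) {
        System.out.println("per-call fallback: " + constructionsFor(100, false) + " constructions");
        System.out.println("shared fallback:   " + constructionsFor(100, true) + " constructions");
    }
}
```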
[jira] [Updated] (HIVE-10890) Provide implementable engine selector
[ https://issues.apache.org/jira/browse/HIVE-10890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-10890: - Attachment: HIVE-10890.3.patch.txt > Provide implementable engine selector > - > > Key: HIVE-10890 > URL: https://issues.apache.org/jira/browse/HIVE-10890 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-10890.1.patch.txt, HIVE-10890.2.patch.txt, > HIVE-10890.3.patch.txt > > > Hive now supports three kinds of engines. It would be good to have an > automatic engine selector so that the execution engine need not be set explicitly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11679) SemanticAnalysis of "a=1" can result in a new Configuration() object
[ https://issues.apache.org/jira/browse/HIVE-11679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-11679: - Attachment: HIVE-11679.2.patch.txt > SemanticAnalysis of "a=1" can result in a new Configuration() object > > > Key: HIVE-11679 > URL: https://issues.apache.org/jira/browse/HIVE-11679 > Project: Hive > Issue Type: Bug > Components: Query Planning >Affects Versions: 1.3.0, 2.0.0 >Reporter: Gopal V >Assignee: Navis > Attachments: HIVE-11679.1.patch.txt, HIVE-11679.2.patch.txt > > > {code} > public static ExprNodeGenericFuncDesc newInstance(GenericUDF genericUDF, > String funcText, > List children) throws UDFArgumentException { > ... > if (genericUDF instanceof GenericUDFBaseCompare && children.size() == 2) { > TypeInfo oiTypeInfo0 = children.get(0).getTypeInfo(); > TypeInfo oiTypeInfo1 = children.get(1).getTypeInfo(); > SessionState ss = SessionState.get(); > Configuration conf = (ss != null) ? ss.getConf() : new Configuration(); > {code} > This is both a SessionState.get() which is a threadlocal lookup or worse, a > new Configuration() which means XML parsing of multiple files for each > equality expression in the query. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11768) java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances
[ https://issues.apache.org/jira/browse/HIVE-11768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-11768: - Attachment: HIVE-11768.5.patch.txt Addressed comments & fixed test fail > java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances > > > Key: HIVE-11768 > URL: https://issues.apache.org/jira/browse/HIVE-11768 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.2.1 >Reporter: Nemon Lou >Assignee: Navis >Priority: Minor > Fix For: 2.0.0 > > Attachments: HIVE-11768.1.patch.txt, HIVE-11768.2.patch.txt, > HIVE-11768.3.patch.txt, HIVE-11768.4.patch.txt, HIVE-11768.5.patch.txt > > > More than 490,000 paths was added to java.io.DeleteOnExitHook on one of our > long running HiveServer2 instances,taken up more than 100MB on heap. > Most of the paths contains a suffix of ".pipeout". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11768) java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances
[ https://issues.apache.org/jira/browse/HIVE-11768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14956063#comment-14956063 ] Navis commented on HIVE-11768: -- [~thejas] I've left the "FileSystem.deleteOnExit" problem for another issue, because "FileSystem.close" will never be called. Could it be removed from the code safely? I'm not sure about that. > java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances > > > Key: HIVE-11768 > URL: https://issues.apache.org/jira/browse/HIVE-11768 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.2.1 >Reporter: Nemon Lou >Assignee: Navis >Priority: Minor > Fix For: 2.0.0 > > Attachments: HIVE-11768.1.patch.txt, HIVE-11768.2.patch.txt, > HIVE-11768.3.patch.txt, HIVE-11768.4.patch.txt, HIVE-11768.5.patch.txt > > > More than 490,000 paths was added to java.io.DeleteOnExitHook on one of our > long running HiveServer2 instances,taken up more than 100MB on heap. > Most of the paths contains a suffix of ".pipeout". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11679) SemanticAnalysis of "a=1" can result in a new Configuration() object
[ https://issues.apache.org/jira/browse/HIVE-11679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14956079#comment-14956079 ] Navis commented on HIVE-11679: -- Strange. I cannot reproduce the udaf_histogram_numeric failure. > SemanticAnalysis of "a=1" can result in a new Configuration() object > > > Key: HIVE-11679 > URL: https://issues.apache.org/jira/browse/HIVE-11679 > Project: Hive > Issue Type: Bug > Components: Query Planning >Affects Versions: 1.3.0, 2.0.0 >Reporter: Gopal V >Assignee: Navis > Attachments: HIVE-11679.1.patch.txt, HIVE-11679.2.patch.txt, > HIVE-11679.3.patch.txt > > > {code} > public static ExprNodeGenericFuncDesc newInstance(GenericUDF genericUDF, > String funcText, > List children) throws UDFArgumentException { > ... > if (genericUDF instanceof GenericUDFBaseCompare && children.size() == 2) { > TypeInfo oiTypeInfo0 = children.get(0).getTypeInfo(); > TypeInfo oiTypeInfo1 = children.get(1).getTypeInfo(); > SessionState ss = SessionState.get(); > Configuration conf = (ss != null) ? ss.getConf() : new Configuration(); > {code} > This is both a SessionState.get() which is a threadlocal lookup or worse, a > new Configuration() which means XML parsing of multiple files for each > equality expression in the query. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11679) SemanticAnalysis of "a=1" can result in a new Configuration() object
[ https://issues.apache.org/jira/browse/HIVE-11679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-11679: - Attachment: HIVE-11679.3.patch.txt Just updated negative test results. > SemanticAnalysis of "a=1" can result in a new Configuration() object > > > Key: HIVE-11679 > URL: https://issues.apache.org/jira/browse/HIVE-11679 > Project: Hive > Issue Type: Bug > Components: Query Planning >Affects Versions: 1.3.0, 2.0.0 >Reporter: Gopal V >Assignee: Navis > Attachments: HIVE-11679.1.patch.txt, HIVE-11679.2.patch.txt, > HIVE-11679.3.patch.txt > > > {code} > public static ExprNodeGenericFuncDesc newInstance(GenericUDF genericUDF, > String funcText, > List children) throws UDFArgumentException { > ... > if (genericUDF instanceof GenericUDFBaseCompare && children.size() == 2) { > TypeInfo oiTypeInfo0 = children.get(0).getTypeInfo(); > TypeInfo oiTypeInfo1 = children.get(1).getTypeInfo(); > SessionState ss = SessionState.get(); > Configuration conf = (ss != null) ? ss.getConf() : new Configuration(); > {code} > This is both a SessionState.get() which is a threadlocal lookup or worse, a > new Configuration() which means XML parsing of multiple files for each > equality expression in the query. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11662) DP cannot be applied to external table which contains part-spec like directory
[ https://issues.apache.org/jira/browse/HIVE-11662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14723042#comment-14723042 ] Navis commented on HIVE-11662: -- The failures do not seem to be related to this. I'll add some test cases. DP cannot be applied to external table which contains part-spec like directory -- Key: HIVE-11662 URL: https://issues.apache.org/jira/browse/HIVE-11662 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-11662.1.patch.txt Some users want to use part-spec-like directory names in their partitioned table locations, for example: {noformat} /something/warehouse/some_key=some_value {noformat} DP (dynamic partitioning) derives additional partitions from the full path and throws an exception such as: {noformat} Failed with exception Partition spec {some_key=some_value, part_key=part_value} contains non-partition columns FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
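The failure reported in HIVE-11662 can be illustrated with a small sketch. It assumes the partition spec is collected from every key=value directory name in a path, and it contrasts parsing the full path with parsing only the part below the table location. PartSpecSketch and its methods are hypothetical names for illustration and are not Hive's actual implementation or the attached patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that derives a partition spec from directory names.
final class PartSpecSketch {
    // Collects every "key=value" path component, in path order.
    static Map<String, String> fromPath(String path) {
        Map<String, String> spec = new LinkedHashMap<>();
        for (String dir : path.split("/")) {
            int eq = dir.indexOf('=');
            // Require a non-empty key and a non-empty value.
            if (eq > 0 && eq < dir.length() - 1) {
                spec.put(dir.substring(0, eq), dir.substring(eq + 1));
            }
        }
        return spec;
    }

    // Parses only the components below the table location, so key=value
    // names inside the table location itself are ignored.
    static Map<String, String> fromPathBelow(String tableLocation, String partitionPath) {
        if (!partitionPath.startsWith(tableLocation)) {
            throw new IllegalArgumentException("partition path is not under the table location");
        }
        return fromPath(partitionPath.substring(tableLocation.length()));
    }
}
```

For a table located at /something/warehouse/some_key=some_value with a partition directory part_key=part_value beneath it, fromPath on the full path yields both some_key and part_key, which matches the spec in the reported error, while fromPathBelow yields only part_key.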
[jira] [Updated] (HIVE-11662) DP cannot be applied to external table which contains part-spec like directory
[ https://issues.apache.org/jira/browse/HIVE-11662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-11662: - Attachment: HIVE-11662.1.patch.txt For preliminary test DP cannot be applied to external table which contains part-spec like directory -- Key: HIVE-11662 URL: https://issues.apache.org/jira/browse/HIVE-11662 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-11662.1.patch.txt Some users want to use part-spec like directory name in their partitioned table locations, something like, {noformat} /something/warehouse/some_key=some_value {noformat} DP calculates additional partitions from full path, and makes exception something like, {noformat} Failed with exception Partition spec {some_key=some_value, part_key=part_value} contains non-partition columns FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11662) DP cannot be applied to external table which contains part-spec like directory
[ https://issues.apache.org/jira/browse/HIVE-11662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-11662: - Attachment: HIVE-11662.2.patch.txt > DP cannot be applied to external table which contains part-spec like directory > -- > > Key: HIVE-11662 > URL: https://issues.apache.org/jira/browse/HIVE-11662 > Project: Hive > Issue Type: Bug > Components: Query Processor >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-11662.1.patch.txt, HIVE-11662.2.patch.txt > > > Some users want to use part-spec like directory name in their partitioned > table locations, something like, > {noformat} > /something/warehouse/some_key=some_value > {noformat} > DP calculates additional partitions from full path, and makes exception > something like, > {noformat} > Failed with exception Partition spec {some_key=some_value, > part_key=part_value} contains non-partition columns > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.MoveTask > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11706) Implement "show create database"
[ https://issues.apache.org/jira/browse/HIVE-11706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-11706: - Attachment: HIVE-11706.1.patch.txt > Implement "show create database" > > > Key: HIVE-11706 > URL: https://issues.apache.org/jira/browse/HIVE-11706 > Project: Hive > Issue Type: New Feature > Components: Metastore >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-11706.1.patch.txt > > > HIVE-967 introduced "show create table". How about "show create database"? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11657) HIVE-2573 introduces some issues during metastore init (and CLI init)
[ https://issues.apache.org/jira/browse/HIVE-11657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14725024#comment-14725024 ] Navis commented on HIVE-11657: -- [~sershe] Making a static call in Hive was one of my worst decisions on Hive. I was so tired of rebasing the patch for years, and new things like permanent functions bothered me so much, that I couldn't think of a better idea at the time. HIVE-2573 introduces some issues during metastore init (and CLI init) -- Key: HIVE-11657 URL: https://issues.apache.org/jira/browse/HIVE-11657 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Critical Attachments: HIVE-11657.patch HIVE-2573 introduced a static call to reload functions. It has a few problems: 1) When the metastore client is initialized using an externally supplied config (i.e. Hive.get(HiveConf)), the call is still made during static init using the main service config. In my case, even though I have uris in the supplied config to connect to a remote MS (which eventually happens), the static call creates an objectstore, which is undesirable. 2) It breaks compat: old metastores do not support this call, so new clients will fail, and there's no workaround like not using a new feature because the static call is always made. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11706) Implement "show create database"
[ https://issues.apache.org/jira/browse/HIVE-11706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-11706: - Attachment: HIVE-11706.2.patch.txt Fixed test fails > Implement "show create database" > > > Key: HIVE-11706 > URL: https://issues.apache.org/jira/browse/HIVE-11706 > Project: Hive > Issue Type: New Feature > Components: Metastore >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-11706.1.patch.txt, HIVE-11706.2.patch.txt > > > HIVE-967 introduced "show create table". How about "show create database"? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11662) Dynamic partitioning cannot be applied to external table which contains part-spec like directory name
[ https://issues.apache.org/jira/browse/HIVE-11662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-11662: - Summary: Dynamic partitioning cannot be applied to external table which contains part-spec like directory name (was: DP cannot be applied to external table which contains part-spec like directory) > Dynamic partitioning cannot be applied to external table which contains > part-spec like directory name > - > > Key: HIVE-11662 > URL: https://issues.apache.org/jira/browse/HIVE-11662 > Project: Hive > Issue Type: Bug > Components: Query Processor >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-11662.1.patch.txt, HIVE-11662.2.patch.txt > > > Some users want to use part-spec like directory name in their partitioned > table locations, something like, > {noformat} > /something/warehouse/some_key=some_value > {noformat} > DP calculates additional partitions from full path, and makes exception > something like, > {noformat} > Failed with exception Partition spec {some_key=some_value, > part_key=part_value} contains non-partition columns > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.MoveTask > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11662) DP cannot be applied to external table which contains part-spec like directory
[ https://issues.apache.org/jira/browse/HIVE-11662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14734152#comment-14734152 ] Navis commented on HIVE-11662: -- [~leftylev] Right. I'll rename the issue description. > DP cannot be applied to external table which contains part-spec like directory > -- > > Key: HIVE-11662 > URL: https://issues.apache.org/jira/browse/HIVE-11662 > Project: Hive > Issue Type: Bug > Components: Query Processor >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-11662.1.patch.txt, HIVE-11662.2.patch.txt > > > Some users want to use part-spec like directory name in their partitioned > table locations, something like, > {noformat} > /something/warehouse/some_key=some_value > {noformat} > DP calculates additional partitions from full path, and makes exception > something like, > {noformat} > Failed with exception Partition spec {some_key=some_value, > part_key=part_value} contains non-partition columns > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.MoveTask > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11752) Pre-materializing complex CTE queries
[ https://issues.apache.org/jira/browse/HIVE-11752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-11752: - Attachment: HIVE-11752.1.patch.txt Patch for preliminary test. Pre-materializing complex CTE queries -- Key: HIVE-11752 URL: https://issues.apache.org/jira/browse/HIVE-11752 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-11752.1.patch.txt Currently, Hive treats a CTE clause as a simple alias for the query block, which causes redundant work if the CTE is used multiple times in a query. This introduces a reference threshold for pre-materializing the CTE clause as a volatile table (which does not exist in any form in the metastore and is accessible only from the QB). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
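The reference threshold mentioned in HIVE-11752 can be sketched as a simple counting pass. This assumes that "threshold" means a CTE is materialized once it is referenced at least that many times; the issue text does not spell out the exact rule. CteMaterializationSketch is a hypothetical name and the code is unrelated to the attached patches.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical planner helper: decides which CTE aliases to pre-materialize.
final class CteMaterializationSketch {
    // references: one entry per place in the query where a CTE alias is used.
    // threshold: minimum reference count at which materialization pays off
    // (assumed semantics, see the lead-in).
    static Set<String> toMaterialize(List<String> references, int threshold) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String alias : references) {
            counts.merge(alias, 1, Integer::sum);
        }
        Set<String> selected = new LinkedHashSet<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() >= threshold) {
                selected.add(e.getKey());
            }
        }
        return selected;
    }
}
```

A CTE referenced only once stays a plain alias under this rule, since materializing it would add a write and a read without saving any recomputation.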
[jira] [Updated] (HIVE-11754) Not reachable code parts in StatsUtils
[ https://issues.apache.org/jira/browse/HIVE-11754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-11754: - Attachment: HIVE-11754.1.patch.txt > Not reachable code parts in StatsUtils > -- > > Key: HIVE-11754 > URL: https://issues.apache.org/jira/browse/HIVE-11754 > Project: Hive > Issue Type: Task >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-11754.1.patch.txt > > > No need to check "oi instanceof WritableConstantHiveCharObjectInspector" > after "oi instanceof ConstantObjectInspector". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
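The unreachable check described in HIVE-11754 follows from the order of instanceof tests: once a branch matches a supertype, a later branch for one of its subtypes can never be taken. A minimal sketch with hypothetical types (Inspector, ConstantInspector, ConstantCharInspector and PlainInspector stand in for the Hive object inspectors and are not the real classes):

```java
// Hypothetical type hierarchy mirroring the shape described in the issue.
interface Inspector {}

interface ConstantInspector extends Inspector {}

// A subtype of ConstantInspector, like the char inspector named in the issue.
final class ConstantCharInspector implements ConstantInspector {}

final class PlainInspector implements Inspector {}

final class BranchOrderSketch {
    static String classify(Inspector oi) {
        if (oi instanceof ConstantInspector) {
            return "constant";
        } else if (oi instanceof ConstantCharInspector) {
            // Never reached: every ConstantCharInspector is also a
            // ConstantInspector, so the first branch already matched.
            return "constant-char";
        }
        return "other";
    }
}
```

The compiler accepts this code without a warning, which is why such dead branches tend to survive until someone reads the hierarchy closely.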
[jira] [Updated] (HIVE-10890) Provide implementable engine selector
[ https://issues.apache.org/jira/browse/HIVE-10890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-10890: - Attachment: HIVE-10890.2.patch.txt rebased to trunk > Provide implementable engine selector > - > > Key: HIVE-10890 > URL: https://issues.apache.org/jira/browse/HIVE-10890 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-10890.1.patch.txt, HIVE-10890.2.patch.txt > > > Now hive supports three kind of engines. It would be good to have an > automatic engine selector without setting explicitly engine for execution. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11756) Avoid redundant key serialization in RS for distinct query
[ https://issues.apache.org/jira/browse/HIVE-11756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-11756: - Attachment: HIVE-11756.1.patch.txt Attaching patch for preliminary test > Avoid redundant key serialization in RS for distinct query > -- > > Key: HIVE-11756 > URL: https://issues.apache.org/jira/browse/HIVE-11756 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-11756.1.patch.txt > > > Currently hive serializes twice to know the length of distribution key for > distinct queries. This introduces IndexedSerializer to avoid this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
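The idea in HIVE-11756 of learning the distribution key length without serializing twice can be sketched as follows. This assumes the distribution key is a prefix of the full key and that recording the buffer offset after the last distribution-key field is enough. OffsetTrackingSerializer is a hypothetical name; the actual IndexedSerializer in the patch may work differently.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical single-pass serializer that remembers where the
// distribution-key prefix ends.
final class OffsetTrackingSerializer {
    // Full serialized key plus the byte length of its distribution-key prefix.
    static final class Result {
        final byte[] bytes;
        final int distKeyLength;

        Result(byte[] bytes, int distKeyLength) {
            this.bytes = bytes;
            this.distKeyLength = distKeyLength;
        }
    }

    // Serializes all fields once; the first numDistKeyFields fields are
    // assumed to form the distribution key.
    static Result serialize(String[] fields, int numDistKeyFields) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int distKeyLength = 0;
        for (int i = 0; i < fields.length; i++) {
            byte[] encoded = fields[i].getBytes(StandardCharsets.UTF_8);
            out.write(encoded, 0, encoded.length);
            out.write(0); // one-byte field terminator (illustrative format)
            if (i + 1 == numDistKeyFields) {
                // Record the offset instead of serializing the prefix again.
                distKeyLength = out.size();
            }
        }
        return new Result(out.toByteArray(), distKeyLength);
    }
}
```

The byte format here (UTF-8 text with a zero terminator) is purely illustrative; the point is only that the prefix length falls out of the single pass.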
[jira] [Updated] (HIVE-11756) Avoid redundant key serialization in RS for distinct query
[ https://issues.apache.org/jira/browse/HIVE-11756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-11756: - Attachment: HIVE-11756.2.patch.txt > Avoid redundant key serialization in RS for distinct query > -- > > Key: HIVE-11756 > URL: https://issues.apache.org/jira/browse/HIVE-11756 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-11756.1.patch.txt, HIVE-11756.2.patch.txt > > > Currently hive serializes twice to know the length of distribution key for > distinct queries. This introduces IndexedSerializer to avoid this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11754) Not reachable code parts in StatsUtils
[ https://issues.apache.org/jira/browse/HIVE-11754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-11754: - Attachment: HIVE-11754.2.patch.txt > Not reachable code parts in StatsUtils > -- > > Key: HIVE-11754 > URL: https://issues.apache.org/jira/browse/HIVE-11754 > Project: Hive > Issue Type: Task >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-11754.1.patch.txt, HIVE-11754.2.patch.txt > > > No need to check "oi instanceof WritableConstantHiveCharObjectInspector" > after "oi instanceof ConstantObjectInspector". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11768) java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances
[ https://issues.apache.org/jira/browse/HIVE-11768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-11768: - Attachment: HIVE-11768.1.patch.txt java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances -- Key: HIVE-11768 URL: https://issues.apache.org/jira/browse/HIVE-11768 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 1.2.1 Reporter: Nemon Lou Attachments: HIVE-11768.1.patch.txt More than 490,000 paths were added to java.io.DeleteOnExitHook on one of our long-running HiveServer2 instances, taking up more than 100MB of heap. Most of the paths have a ".pipeout" suffix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
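The leak in HIVE-11768 arises because File.deleteOnExit() records each path in a JVM-wide set, and the JDK offers no way to take a path out again, so a long-running server accumulates entries even after the files are gone. One possible workaround, sketched here under the assumption that the application can track its own temporary files, is a registry that forgets a file as soon as it is deleted. TempFileRegistry is a hypothetical class and not the attached patch.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical registry that replaces File.deleteOnExit() for temp files.
final class TempFileRegistry {
    // Synchronized set so sessions on different threads can register files.
    private static final Set<File> PENDING =
            Collections.synchronizedSet(new HashSet<File>());

    static {
        // Delete whatever is still registered when the JVM shuts down.
        Runtime.getRuntime().addShutdownHook(new Thread(TempFileRegistry::deleteAll));
    }

    // Creates a temp file and tracks it for cleanup.
    static File create(String prefix, String suffix) {
        try {
            File file = File.createTempFile(prefix, suffix);
            PENDING.add(file);
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Deletes the file now and drops it from the registry, so the
    // registry does not grow with the number of finished sessions.
    static boolean release(File file) {
        PENDING.remove(file);
        return file.delete();
    }

    static int pendingCount() {
        return PENDING.size();
    }

    private static void deleteAll() {
        // Iterating a synchronized set requires holding its lock.
        synchronized (PENDING) {
            for (File file : PENDING) {
                file.delete();
            }
            PENDING.clear();
        }
    }
}
```

A session would call TempFileRegistry.create when it opens and TempFileRegistry.release when it closes; only files from sessions that never closed remain for the shutdown hook.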
[jira] [Assigned] (HIVE-11768) java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances
[ https://issues.apache.org/jira/browse/HIVE-11768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis reassigned HIVE-11768: Assignee: Navis java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances -- Key: HIVE-11768 URL: https://issues.apache.org/jira/browse/HIVE-11768 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 1.2.1 Reporter: Nemon Lou Assignee: Navis Attachments: HIVE-11768.1.patch.txt More than 490,000 paths were added to java.io.DeleteOnExitHook on one of our long-running HiveServer2 instances, taking up more than 100MB of heap. Most of the paths have a ".pipeout" suffix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11756) Avoid redundant key serialization in RS for distinct query
[ https://issues.apache.org/jira/browse/HIVE-11756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-11756: - Attachment: HIVE-11756.3.patch.txt Fixed test fails > Avoid redundant key serialization in RS for distinct query > -- > > Key: HIVE-11756 > URL: https://issues.apache.org/jira/browse/HIVE-11756 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-11756.1.patch.txt, HIVE-11756.2.patch.txt, > HIVE-11756.3.patch.txt > > > Currently hive serializes twice to know the length of distribution key for > distinct queries. This introduces IndexedSerializer to avoid this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11774) Show macro definition for desc function
[ https://issues.apache.org/jira/browse/HIVE-11774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14740230#comment-14740230 ] Navis commented on HIVE-11774: -- [~damien.carol] it seemed not. {noformat} descStatement @init { pushMsg("describe statement", state); } @after { popMsg(state); } : (KW_DESCRIBE|KW_DESC) ( (KW_DATABASE|KW_SCHEMA) => (KW_DATABASE|KW_SCHEMA) KW_EXTENDED? (dbName=identifier) -> ^(TOK_DESCDATABASE $dbName KW_EXTENDED?) | (KW_FUNCTION) => KW_FUNCTION KW_EXTENDED? (name=descFuncNames) -> ^(TOK_DESCFUNCTION $name KW_EXTENDED?) | (KW_FORMATTED|KW_EXTENDED|KW_PRETTY) => ((descOptions=KW_FORMATTED|descOptions=KW_EXTENDED|descOptions=KW_PRETTY) parttype=partTypeExpr) -> ^(TOK_DESCTABLE $parttype $descOptions) | parttype=partTypeExpr -> ^(TOK_DESCTABLE $parttype) ) ; {noformat} Possibly support KW_MACRO, too. > Show macro definition for desc function > > > Key: HIVE-11774 > URL: https://issues.apache.org/jira/browse/HIVE-11774 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-11774.1.patch.txt, HIVE-11774.2.patch.txt > > > Currently, desc function shows nothing for macro. It would be helpful if it > shows the definition of it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11774) Show macro definition for desc function
[ https://issues.apache.org/jira/browse/HIVE-11774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-11774: - Attachment: HIVE-11774.2.patch.txt Fixed test fails > Show macro definition for desc function > > > Key: HIVE-11774 > URL: https://issues.apache.org/jira/browse/HIVE-11774 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-11774.1.patch.txt, HIVE-11774.2.patch.txt > > > Currently, desc function shows nothing for macro. It would be helpful if it > shows the definition of it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11768) java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances
[ https://issues.apache.org/jira/browse/HIVE-11768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-11768: - Attachment: HIVE-11768.2.patch.txt Changed to a synchronized set and minimized the diff. java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances -- Key: HIVE-11768 URL: https://issues.apache.org/jira/browse/HIVE-11768 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 1.2.1 Reporter: Nemon Lou Assignee: Navis Attachments: HIVE-11768.1.patch.txt, HIVE-11768.2.patch.txt More than 490,000 paths were added to java.io.DeleteOnExitHook on one of our long-running HiveServer2 instances, taking up more than 100MB of heap. Most of the paths have a ".pipeout" suffix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11768) java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances
[ https://issues.apache.org/jira/browse/HIVE-11768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14740251#comment-14740251 ] Navis commented on HIVE-11768: -- [~nemon] Thanks for the report! java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances -- Key: HIVE-11768 URL: https://issues.apache.org/jira/browse/HIVE-11768 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 1.2.1 Reporter: Nemon Lou Assignee: Navis Attachments: HIVE-11768.1.patch.txt, HIVE-11768.2.patch.txt More than 490,000 paths were added to java.io.DeleteOnExitHook on one of our long-running HiveServer2 instances, taking up more than 100MB of heap. Most of the paths have a ".pipeout" suffix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11679) SemanticAnalysis of "a=1" can result in a new Configuration() object
[ https://issues.apache.org/jira/browse/HIVE-11679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-11679: - Attachment: HIVE-11679.1.patch.txt Attaching patch for preliminary test > SemanticAnalysis of "a=1" can result in a new Configuration() object > > > Key: HIVE-11679 > URL: https://issues.apache.org/jira/browse/HIVE-11679 > Project: Hive > Issue Type: Bug > Components: Query Planning >Affects Versions: 1.3.0, 2.0.0 >Reporter: Gopal V > Attachments: HIVE-11679.1.patch.txt > > > {code} > public static ExprNodeGenericFuncDesc newInstance(GenericUDF genericUDF, > String funcText, > List children) throws UDFArgumentException { > ... > if (genericUDF instanceof GenericUDFBaseCompare && children.size() == 2) { > TypeInfo oiTypeInfo0 = children.get(0).getTypeInfo(); > TypeInfo oiTypeInfo1 = children.get(1).getTypeInfo(); > SessionState ss = SessionState.get(); > Configuration conf = (ss != null) ? ss.getConf() : new Configuration(); > {code} > This is both a SessionState.get() which is a threadlocal lookup or worse, a > new Configuration() which means XML parsing of multiple files for each > equality expression in the query. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11752) Pre-materializing complex CTE queries
[ https://issues.apache.org/jira/browse/HIVE-11752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-11752: - Attachment: HIVE-11752.2.patch.txt Fixed missing read/write entities. Pre-materializing complex CTE queries -- Key: HIVE-11752 URL: https://issues.apache.org/jira/browse/HIVE-11752 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-11752.1.patch.txt, HIVE-11752.2.patch.txt Currently, Hive treats a CTE clause as a simple alias for the query block, which causes redundant work if the CTE is used multiple times in a query. This introduces a reference threshold for pre-materializing the CTE clause as a volatile table (which does not exist in any form in the metastore and is accessible only from the QB). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11706) Implement "show create database"
[ https://issues.apache.org/jira/browse/HIVE-11706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-11706: - Attachment: HIVE-11706.3.patch.txt Fixed test fails > Implement "show create database" > > > Key: HIVE-11706 > URL: https://issues.apache.org/jira/browse/HIVE-11706 > Project: Hive > Issue Type: New Feature > Components: Metastore >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-11706.1.patch.txt, HIVE-11706.2.patch.txt, > HIVE-11706.3.patch.txt > > > HIVE-967 introduced "show create table". How about "show create database"? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11752) Pre-materializing complex CTE queries
[ https://issues.apache.org/jira/browse/HIVE-11752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-11752: - Assignee: Jesus Camacho Rodriguez (was: Navis) Pre-materializing complex CTE queries -- Key: HIVE-11752 URL: https://issues.apache.org/jira/browse/HIVE-11752 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 2.1.0 Reporter: Navis Assignee: Jesus Camacho Rodriguez Priority: Minor Attachments: HIVE-11752.03.patch, HIVE-11752.04.patch, HIVE-11752.1.patch.txt, HIVE-11752.2.patch.txt Currently, Hive treats a CTE clause as a simple alias for the query block, which causes redundant work if the CTE is used multiple times in a query. This introduces a reference threshold for pre-materializing the CTE clause as a volatile table (which does not exist in any form in the metastore and is accessible only from the QB). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11752) Pre-materializing complex CTE queries
[ https://issues.apache.org/jira/browse/HIVE-11752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15127375#comment-15127375 ] Navis commented on HIVE-11752: -- [~jcamachorodriguez] Good to hear that someone is interested in this. It might have been views that made this hard for me to complete, but I can't remember the exact reason. Hopefully you can get this into trunk, because it can be a major speed-up factor for complex DW queries. Pre-materializing complex CTE queries -- Key: HIVE-11752 URL: https://issues.apache.org/jira/browse/HIVE-11752 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 2.1.0 Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-11752.03.patch, HIVE-11752.04.patch, HIVE-11752.1.patch.txt, HIVE-11752.2.patch.txt Currently, Hive treats a CTE clause as a simple alias for the query block, which causes redundant work if the CTE is used multiple times in a query. This introduces a reference threshold for pre-materializing the CTE clause as a volatile table (which does not exist in any form in the metastore and is accessible only from the QB). -- This message was sent by Atlassian JIRA (v6.3.4#6332)