[jira] [Updated] (HIVE-9495) Map Side aggregation affecting map performance

2015-02-23 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9495:

Attachment: HIVE-9495.2.patch.txt

Best effort to estimate ndv for hash aggregation

 Map Side aggregation affecting map performance
 --

 Key: HIVE-9495
 URL: https://issues.apache.org/jira/browse/HIVE-9495
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.14.0
 Environment: RHEL 6.4
 Hortonworks Hadoop 2.2
Reporter: Anand Sridharan
 Attachments: HIVE-9495.1.patch.txt, HIVE-9495.2.patch.txt, 
 profiler_screenshot.PNG


 When running a simple aggregation query with hive.map.aggr=true, map 
 tasks take much longer in Hive 0.14 than with hive.map.aggr=false.
 For example, consider the query:
 {code}
 INSERT OVERWRITE TABLE lineitem_tgt_agg
 select alias.a0 as a0,
   alias.a2 as a1,
   alias.a1 as a2,
   alias.a3 as a3,
   alias.a4 as a4
 from (
   select alias.a0 as a0,
     SUM(alias.a1) as a1,
     SUM(alias.a2) as a2,
     SUM(alias.a3) as a3,
     SUM(alias.a4) as a4
   from (
     select lineitem_sf500.l_orderkey as a0,
       CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * (1 - lineitem_sf500.l_discount) * (1 + lineitem_sf500.l_tax) as double) as a1,
       lineitem_sf500.l_quantity as a2,
       CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * lineitem_sf500.l_discount as double) as a3,
       CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * lineitem_sf500.l_tax as double) as a4
     from lineitem_sf500
   ) alias
   group by alias.a0
 ) alias;
 {code}
 The above query was run with ~376GB of data / ~3 billion records in the source.
 It takes ~10 minutes with hive.map.aggr=false.
 With map side aggregation set to true, the map tasks don't complete even 
 after an hour.
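 One plausible reason map-side hash aggregation can hurt is that the grouping key has many distinct values, so the in-memory hash table grows without collapsing many rows. The sketch below only illustrates that idea: it samples grouping keys and decides whether hashing is worthwhile. The class name, method names, and the 0.5 threshold are assumptions made for illustration; this is not the logic in the attached patch.

```java
import java.util.HashSet;
import java.util.Set;

public class MapAggrReductionSketch {

  // Ratio of distinct grouping keys to rows seen. A value close to 1.0 means
  // a map-side hash table would barely reduce the number of rows.
  static double distinctRatio(long[] keys) {
    if (keys.length == 0) {
      return 0.0;
    }
    Set<Long> distinct = new HashSet<>();
    for (long k : keys) {
      distinct.add(k);
    }
    return (double) distinct.size() / keys.length;
  }

  // Assumed rule: keep hashing only when the sampled ratio is at or below maxRatio.
  static boolean worthHashing(long[] sampleKeys, double maxRatio) {
    return distinctRatio(sampleKeys) <= maxRatio;
  }

  public static void main(String[] args) {
    long[] fewKeys = {1, 1, 1, 2, 2, 2, 3, 3};   // 3 distinct keys in 8 rows
    long[] manyKeys = {1, 2, 3, 4, 5, 6, 7, 7};  // 7 distinct keys in 8 rows
    System.out.println("few keys:  " + worthHashing(fewKeys, 0.5));
    System.out.println("many keys: " + worthHashing(manyKeys, 0.5));
  }
}
```

 With a grouping key that is nearly unique per row, a check of this kind would favor skipping map-side hashing and sending rows straight to the reducers.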



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10816) NPE in ExecDriver::handleSampling when submitted via child JVM

2015-06-09 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14580013#comment-14580013
 ] 

Navis commented on HIVE-10816:
--

[~lirui] I don't know why I've not been notified but here is my late +1

 NPE in ExecDriver::handleSampling when submitted via child JVM
 --

 Key: HIVE-10816
 URL: https://issues.apache.org/jira/browse/HIVE-10816
 Project: Hive
  Issue Type: Bug
Reporter: Rui Li
Assignee: Rui Li
 Fix For: 1.3.0, 2.0.0

 Attachments: HIVE-10816.1.patch, HIVE-10816.1.patch


 When {{hive.exec.submitviachild = true}}, parallel order by fails with NPE 
 and falls back to single-reducer mode. Stack trace:
 {noformat}
 2015-05-25 08:41:04,446 ERROR [main]: mr.ExecDriver (ExecDriver.java:execute(386)) - Sampling error
 java.lang.NullPointerException
     at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.handleSampling(ExecDriver.java:513)
     at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:379)
     at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:750)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     at java.lang.reflect.Method.invoke(Method.java:497)
     at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
     at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
 {noformat}





[jira] [Commented] (HIVE-10890) Provide implementable engine selector

2015-06-09 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14580028#comment-14580028
 ] 

Navis commented on HIVE-10890:
--

Right, that should also be checked. The included implementation was only meant 
to show the intention. I'll think of a way to verify that the engine is configured 
properly. Anyway, I don't know why I'm not getting notifications from the Hive 
community these days.

 Provide implementable engine selector
 -

 Key: HIVE-10890
 URL: https://issues.apache.org/jira/browse/HIVE-10890
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial

 Hive now supports three kinds of engines. It would be good to have an 
 automatic engine selector so that the execution engine need not be set explicitly.





[jira] [Resolved] (HIVE-11002) Memory leakage on unsafe aggregation path with empty input

2015-06-13 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis resolved HIVE-11002.
--
Resolution: Invalid

Sorry, this was meant to be filed in Spark.

 Memory leakage on unsafe aggregation path with empty input
 --

 Key: HIVE-11002
 URL: https://issues.apache.org/jira/browse/HIVE-11002
 Project: Hive
  Issue Type: Bug
  Components: SQL
Reporter: Navis
Assignee: Navis
Priority: Minor

 Currently, the unsafe-based hash is released on a 'next' call, but if the input is 
 empty, 'next' is never called.





[jira] [Commented] (HIVE-11518) Provide interface to adjust required resource for tez tasks

2015-08-21 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14706695#comment-14706695
 ] 

Navis commented on HIVE-11518:
--

[~hagleitn] Any interest in this? With it, I could have assigned 4G to the 20+ join map 
tasks while assigning just 1G to the other simple tasks.

 Provide interface to adjust required resource for tez tasks
 ---

 Key: HIVE-11518
 URL: https://issues.apache.org/jira/browse/HIVE-11518
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-11518.1.patch.txt


 Resource requirements vary from task to task, but currently they are fixed to 
 one value (via hive.tez.container.size). It would be good to customize 
 resource requirements according to the expected work.
 The suggested interface is quite simple.
 {code}
 public interface ResourceCalculator {
   Resource adjust(Resource resource, MapWork mapWork);
   Resource adjust(Resource resource, ReduceWork reduceWork);
 }
 {code}
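 To make the intent of the interface concrete, here is a minimal sketch of one possible implementation. The `Resource`, `MapWork`, and `ReduceWork` types below are simplified stand-ins rather than the real YARN and Hive classes, and the `joinCount` field, the threshold of 20, and the 4096/1024 MB sizes are assumptions chosen only for illustration.

```java
public class ResourceCalculatorSketch {

  // Simplified stand-ins for the real YARN/Hive types (hypothetical, illustration only).
  static final class Resource {
    final int memoryMb;
    Resource(int memoryMb) { this.memoryMb = memoryMb; }
  }

  static final class MapWork {
    final int joinCount;  // assumed field: number of joins handled by this map work
    MapWork(int joinCount) { this.joinCount = joinCount; }
  }

  static final class ReduceWork { }

  // Same shape as the interface suggested in the issue.
  interface ResourceCalculator {
    Resource adjust(Resource resource, MapWork mapWork);
    Resource adjust(Resource resource, ReduceWork reduceWork);
  }

  // Assumed policy: join-heavy map work gets 4 GB, everything else gets 1 GB.
  static final class JoinAwareCalculator implements ResourceCalculator {
    private static final int JOIN_THRESHOLD = 20;

    @Override
    public Resource adjust(Resource resource, MapWork mapWork) {
      return new Resource(mapWork.joinCount >= JOIN_THRESHOLD ? 4096 : 1024);
    }

    @Override
    public Resource adjust(Resource resource, ReduceWork reduceWork) {
      return new Resource(1024);
    }
  }

  public static void main(String[] args) {
    ResourceCalculator calc = new JoinAwareCalculator();
    Resource base = new Resource(2048);
    System.out.println("join-heavy map: " + calc.adjust(base, new MapWork(25)).memoryMb + " MB");
    System.out.println("simple map:     " + calc.adjust(base, new MapWork(1)).memoryMb + " MB");
    System.out.println("reduce:         " + calc.adjust(base, new ReduceWork()).memoryMb + " MB");
  }
}
```

 The calculator receives the default resource and returns an adjusted one, so a policy like this could in principle be swapped without touching the code that requests containers.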





[jira] [Commented] (HIVE-11515) Still some possible race condition in DynamicPartitionPruner

2015-08-21 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14706694#comment-14706694
 ] 

Navis commented on HIVE-11515:
--

[~sseth] Sorry for the delay. 
I saw this occasionally a month ago in a PoC scenario and made this 
patch in a hurry. After applying it, the problem went away and I forgot about it 
(there were so many issues). So I cannot remember exactly what the problem was, 
but I believe it was a query hang. Sorry for the vague description.

 Still some possible race condition in DynamicPartitionPruner
 

 Key: HIVE-11515
 URL: https://issues.apache.org/jira/browse/HIVE-11515
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, Tez
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-11515.1.patch.txt


 Even after HIVE-9976, I could still see a race condition in DPP sometimes. It is hard to 
 reproduce, but it seemed related to the fact that prune() is called by a 
 thread pool. With some delay in the queue, events from fast tasks arrive 
 before prune() is called.
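 The ordering problem described above can be illustrated with a small sketch in which events that arrive before prune() has run are buffered and drained once it starts. This is not the attached patch and not Hive's actual DynamicPartitionPruner code; the `Pruner` class, its methods, and the use of plain strings as events are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class EarlyEventBufferSketch {

  // Hypothetical pruner: events may be delivered before prune() has started.
  static final class Pruner {
    private final List<String> pending = new ArrayList<>();
    private final List<String> processed = new ArrayList<>();
    private boolean started = false;

    // Called from event-delivery threads.
    synchronized void addEvent(String event) {
      if (started) {
        processed.add(event);
      } else {
        pending.add(event);  // arrived too early: hold it until prune() runs
      }
    }

    // Called later from a thread-pool thread; drains anything that arrived early.
    synchronized void prune() {
      started = true;
      processed.addAll(pending);
      pending.clear();
    }

    // Returns a copy so callers cannot modify internal state.
    synchronized List<String> processed() {
      return new ArrayList<>(processed);
    }
  }

  public static void main(String[] args) {
    Pruner pruner = new Pruner();
    pruner.addEvent("event-from-fast-task");  // delivered before prune()
    pruner.prune();
    pruner.addEvent("event-from-slow-task");  // delivered after prune()
    System.out.println(pruner.processed());
  }
}
```

 Guarding both methods with the same lock means an early event is either buffered or processed, never lost, regardless of which thread runs first.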





[jira] [Commented] (HIVE-10890) Provide implementable engine selector

2015-08-23 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14708627#comment-14708627
 ] 

Navis commented on HIVE-10890:
--

[~nemon] It's named a selector, but you can also implement a more sophisticated 
strategy in it.

 Provide implementable engine selector
 -

 Key: HIVE-10890
 URL: https://issues.apache.org/jira/browse/HIVE-10890
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial

 Hive now supports three kinds of engines. It would be good to have an 
 automatic engine selector so that the execution engine need not be set explicitly.





[jira] [Updated] (HIVE-10890) Provide implementable engine selector

2015-08-23 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-10890:
-
Attachment: HIVE-10890.1.patch.txt

 Provide implementable engine selector
 -

 Key: HIVE-10890
 URL: https://issues.apache.org/jira/browse/HIVE-10890
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-10890.1.patch.txt


 Hive now supports three kinds of engines. It would be good to have an 
 automatic engine selector so that the execution engine need not be set explicitly.





[jira] [Updated] (HIVE-8319) Add configuration for custom services in hiveserver2

2015-08-23 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-8319:

Attachment: HIVE-8319.4.patch.txt

 Add configuration for custom services in hiveserver2
 

 Key: HIVE-8319
 URL: https://issues.apache.org/jira/browse/HIVE-8319
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-8319.1.patch.txt, HIVE-8319.2.patch.txt, 
 HIVE-8319.3.patch.txt, HIVE-8319.4.patch.txt


 NO PRECOMMIT TESTS
 Register services to hiveserver2, for example, 
 {noformat}
 <property>
   <name>hive.server2.service.classes</name>
   <value>com.nexr.hive.service.HiveStatus,com.nexr.hive.service.AzkabanService</value>
 </property>
 <property>
   <name>azkaban.ssl.port</name>
   <name>...</name>
 </property>
 {noformat}





[jira] [Commented] (HIVE-8319) Add configuration for custom services in hiveserver2

2015-08-23 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14708669#comment-14708669
 ] 

Navis commented on HIVE-8319:
-

[~thejas] Are you still interested in getting this into Hive? 

 Add configuration for custom services in hiveserver2
 

 Key: HIVE-8319
 URL: https://issues.apache.org/jira/browse/HIVE-8319
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-8319.1.patch.txt, HIVE-8319.2.patch.txt, 
 HIVE-8319.3.patch.txt, HIVE-8319.4.patch.txt


 NO PRECOMMIT TESTS
 Register services to hiveserver2, for example, 
 {noformat}
 <property>
   <name>hive.server2.service.classes</name>
   <value>com.nexr.hive.service.HiveStatus,com.nexr.hive.service.AzkabanService</value>
 </property>
 <property>
   <name>azkaban.ssl.port</name>
   <name>...</name>
 </property>
 {noformat}





[jira] [Updated] (HIVE-11518) Provide interface to adjust required resource for tez tasks

2015-08-11 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11518:
-
Attachment: HIVE-11518.1.patch.txt

 Provide interface to adjust required resource for tez tasks
 ---

 Key: HIVE-11518
 URL: https://issues.apache.org/jira/browse/HIVE-11518
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-11518.1.patch.txt


 Resource requirements vary from task to task, but currently they are fixed to 
 one value (via hive.tez.container.size). It would be good to customize 
 resource requirements according to the expected work.
 The suggested interface is quite simple.
 {code}
 public interface ResourceCalculator {
   Resource adjust(Resource resource, MapWork mapWork);
   Resource adjust(Resource resource, ReduceWork reduceWork);
 }
 {code}





[jira] [Updated] (HIVE-11515) Still some possible race condition in DynamicPartitionPruner

2015-08-10 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11515:
-
Attachment: HIVE-11515.1.patch.txt

 Still some possible race condition in DynamicPartitionPruner
 

 Key: HIVE-11515
 URL: https://issues.apache.org/jira/browse/HIVE-11515
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-11515.1.patch.txt


 Even after HIVE-9976, I could still see a race condition in DPP sometimes. It is hard to 
 reproduce, but it seemed related to the fact that prune() is called by a 
 thread pool. With some delay in the queue, events from fast tasks arrive 
 before prune() is called.





[jira] [Updated] (HIVE-11515) Still some possible race condition in DynamicPartitionPruner

2015-08-10 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11515:
-
Description: Even after HIVE-9976, I could still see a race condition in DPP 
sometimes. It is hard to reproduce, but it seemed related to the fact that prune() is 
called by a thread pool. With some delay in the queue, events from fast tasks 
arrive before prune() is called.  (was: Even after HIVE-9976, I could see race 
condition in DPP sometimes. Hard to reproduce but it seemed related to the fact 
that init() is called by thread-pool. With some delay in queue, events from 
fast tasks are arrived before init() is called.)

 Still some possible race condition in DynamicPartitionPruner
 

 Key: HIVE-11515
 URL: https://issues.apache.org/jira/browse/HIVE-11515
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-11515.1.patch.txt


 Even after HIVE-9976, I could still see a race condition in DPP sometimes. It is hard to 
 reproduce, but it seemed related to the fact that prune() is called by a 
 thread pool. With some delay in the queue, events from fast tasks arrive 
 before prune() is called.





[jira] [Updated] (HIVE-11506) Casting varchar/char type to string cannot be vectorized

2015-08-10 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11506:
-
Attachment: HIVE-11506.2.patch.txt

Updated golden files

 Casting varchar/char type to string cannot be vectorized
 

 Key: HIVE-11506
 URL: https://issues.apache.org/jira/browse/HIVE-11506
 Project: Hive
  Issue Type: Improvement
  Components: Vectorization
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-11506.1.patch.txt, HIVE-11506.2.patch.txt


 It's not defined in the vectorization context.
 {code}
 explain 
 select cast(cast(cstring1 as varchar(10)) as string) x from alltypesorc order by x;
 {code}
 The mapper is not vectorized because of this exception:
 {noformat}
 2015-08-10 17:02:08,003 INFO  [main]: physical.Vectorizer (Vectorizer.java:validateExprNodeDesc(1299)) - Failed to vectorize
 org.apache.hadoop.hive.ql.metadata.HiveException: Unhandled cast input type: varchar(10)
     at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getCastToString(VectorizationContext.java:1543)
     at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUDFBridgeVectorExpression(VectorizationContext.java:1379)
     at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1177)
     at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:440)
     at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateExprNodeDesc(Vectorizer.java:1293)
     at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateExprNodeDesc(Vectorizer.java:1284)
     at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateSelectOperator(Vectorizer.java:1116)
     at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateMapWorkOperator(Vectorizer.java:906)
 {noformat}





[jira] [Resolved] (HIVE-10931) Wrong columns selected on multiple joins

2015-08-11 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis resolved HIVE-10931.
--
Resolution: Cannot Reproduce

Feel free to reopen this if it happens again.

 Wrong columns selected on multiple joins
 

 Key: HIVE-10931
 URL: https://issues.apache.org/jira/browse/HIVE-10931
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.1.0
 Environment: Cloudera cdh5.4.2
Reporter: Furcy Pin
 Fix For: 1.2.1


 The following set of queries :
 {code:sql}
 DROP TABLE IF EXISTS test1 ;
 DROP TABLE IF EXISTS test2 ;
 DROP TABLE IF EXISTS test3 ;
 CREATE TABLE test1 (col1 INT, col2 STRING, col3 STRING, coL4 STRING, coL5 STRING, col6 STRING) ;
 INSERT INTO TABLE test1 VALUES (1,NULL,NULL,NULL,NULL,"A") ;
 CREATE TABLE test2 (col1 INT, col2 STRING, col3 STRING, coL4 STRING, coL5 STRING, col6 STRING) ;
 INSERT INTO TABLE test2 VALUES (1,NULL,NULL,NULL,NULL,"X") ;
 CREATE TABLE test3 (coL1 STRING) ;
 INSERT INTO TABLE test3 VALUES ("A") ;
 SELECT
   T2.val
 FROM test1 T1
 LEFT JOIN (SELECT col1, col2, col3, col4, col5, COALESCE(col6,"") as val FROM test2) T2
 ON T2.col1 = T1.col1
 LEFT JOIN test3 T3  
 ON T3.col1 = T1.col6 
 ;
 {code}
 will return this :
 {noformat}
 +---------+--+
 | t2.val  |
 +---------+--+
 | A       |
 +---------+--+
 {noformat}
 Obviously, this result is wrong, as table `test2` contains an "X" and no "A".
 This is the smallest example of the issue we found. In particular,
 with fewer than 6 columns the query works correctly, for instance:
 {code:sql}
 SELECT
   T2.val
 FROM test1 T1
 LEFT JOIN (SELECT col1, col2, col3, col4, COALESCE(col6,"") as val FROM test2) T2
 ON T2.col1 = T1.col1
 LEFT JOIN test3 T3  
 ON T3.col1 = T1.col6 
 ;
 {code}
 (same query as before, but `col5` was removed from the select)
 will return :
 {noformat}
 +---------+--+
 | t2.val  |
 +---------+--+
 | X       |
 +---------+--+
 {noformat}
 Removing the `COALESCE` also removes the bug...





[jira] [Updated] (HIVE-11176) Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct cannot be cast to [Ljava.lang.Object;

2015-08-11 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11176:
-
Attachment: HIVE-11176.1.patch.txt

Trivial fix

 Caused by: java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct cannot be cast to 
 [Ljava.lang.Object;
 ---

 Key: HIVE-11176
 URL: https://issues.apache.org/jira/browse/HIVE-11176
 Project: Hive
  Issue Type: Bug
  Components: Hive, Tez
Affects Versions: 1.0.0, 1.2.0
 Environment: Hive 1.2 and TEz 0.7
Reporter: Soundararajan Velu
Priority: Critical
 Attachments: HIVE-11176.1.patch.txt


 Unreachable code in 
 hive/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/StandardStructObjectInspector.java:
   // With Data
   @Override
   @SuppressWarnings("unchecked")
   public Object getStructFieldData(Object data, StructField fieldRef) {
     if (data == null) {
       return null;
     }
     // We support both List<Object> and Object[]
     // so we have to do differently.
     boolean isArray = ! (data instanceof List);
     if (!isArray && !(data instanceof List)) {
       return data;
     }
 The if condition above translates to 
 if (!true && true), so the code section cannot be reached.
 This causes a lot of class cast exceptions while using Tez or ORC file 
 formats or a custom JSON SerDe. Strangely, this happens only while using Tez. 
 I changed the code to 
     boolean isArray = data.getClass().isArray();
     if (!isArray && !(data instanceof List)) {
       return data;
     }
 Even then, lazy structs get passed as fields, causing downstream cast 
 exceptions such as "LazyStruct cannot be cast to Text".
 So I changed the method to something like this:
   // With Data
   @Override
   @SuppressWarnings("unchecked")
   public Object getStructFieldData(Object data, StructField fieldRef) {
     if (data == null) {
       return null;
     }
     if (data instanceof LazyBinaryStruct) {
       data = ((LazyBinaryStruct) data).getFieldsAsList();
     }
     // We support both List<Object> and Object[]
     // so we have to do differently.
     boolean isArray = data.getClass().isArray();
     if (!isArray && !(data instanceof List)) {
       return data;
     }
 This causes ArrayIndexOutOfBoundsException and other type cast 
 exceptions in the object inspectors.
 Please help,
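 The reachability argument in the report can be checked in isolation. The sketch below copies only the two boolean conditions into standalone helper methods; `originalGuard` and `proposedGuard` are hypothetical names introduced here, and nothing else from StandardStructObjectInspector is reproduced.

```java
import java.util.Arrays;
import java.util.List;

public class StructFieldCheckDemo {

  // Original check from the report: isArray is defined as "not a List",
  // so (!isArray && !(data instanceof List)) can never be true.
  static boolean originalGuard(Object data) {
    boolean isArray = !(data instanceof List);
    return !isArray && !(data instanceof List);
  }

  // Reporter's proposed check: ask the runtime class whether it is an array.
  static boolean proposedGuard(Object data) {
    boolean isArray = data.getClass().isArray();
    return !isArray && !(data instanceof List);
  }

  public static void main(String[] args) {
    Object list = Arrays.asList("a", "b");
    Object array = new Object[] {"a", "b"};
    Object other = "neither a List nor an array";

    for (Object data : new Object[] {list, array, other}) {
      System.out.println(data.getClass().getSimpleName()
          + " original=" + originalGuard(data)
          + " proposed=" + proposedGuard(data));
    }
  }
}
```

 The original guard returns false for all three inputs, while the proposed guard returns true only for the value that is neither a List nor an array, which is the case the early return appears to be meant for.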





[jira] [Updated] (HIVE-11506) Casting varchar/char type to string cannot be vectorized

2015-08-10 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11506:
-
Attachment: HIVE-11506.1.patch.txt

 Casting varchar/char type to string cannot be vectorized
 

 Key: HIVE-11506
 URL: https://issues.apache.org/jira/browse/HIVE-11506
 Project: Hive
  Issue Type: Improvement
  Components: Vectorization
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-11506.1.patch.txt


 It's not defined in the vectorization context.
 {code}
 explain 
 select cast(cast(cstring1 as varchar(10)) as string) x from alltypesorc order by x;
 {code}
 The mapper is not vectorized because of this exception:
 {noformat}
 2015-08-10 17:02:08,003 INFO  [main]: physical.Vectorizer (Vectorizer.java:validateExprNodeDesc(1299)) - Failed to vectorize
 org.apache.hadoop.hive.ql.metadata.HiveException: Unhandled cast input type: varchar(10)
     at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getCastToString(VectorizationContext.java:1543)
     at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUDFBridgeVectorExpression(VectorizationContext.java:1379)
     at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1177)
     at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:440)
     at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateExprNodeDesc(Vectorizer.java:1293)
     at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateExprNodeDesc(Vectorizer.java:1284)
     at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateSelectOperator(Vectorizer.java:1116)
     at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateMapWorkOperator(Vectorizer.java:906)
 {noformat}





[jira] [Updated] (HIVE-11506) Casting varchar/char type to string cannot be vectorized

2015-08-10 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11506:
-
Description: 
It's not defined in vectorization context.
{code}
explain 
select cast(cast(cstring1 as varchar(10)) as string) x from alltypesorc order by x;
{code}

Mapper is not vectorized by exception,
{noformat}
2015-08-10 17:02:08,003 INFO  [main]: physical.Vectorizer (Vectorizer.java:validateExprNodeDesc(1299)) - Failed to vectorize
org.apache.hadoop.hive.ql.metadata.HiveException: Unhandled cast input type: varchar(10)
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getCastToString(VectorizationContext.java:1543)
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUDFBridgeVectorExpression(VectorizationContext.java:1379)
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1177)
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:440)
    at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateExprNodeDesc(Vectorizer.java:1293)
    at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateExprNodeDesc(Vectorizer.java:1284)
    at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateSelectOperator(Vectorizer.java:1116)
    at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateMapWorkOperator(Vectorizer.java:906)
{noformat}

  was:
It's not defined in vectorization context.
{code}
explain 
select cast(cast(cstring1 as varchar(10)) as string) x from alltypesorc order by x;
{code}

Mapper 
{noformat}
2015-08-10 17:02:08,003 INFO  [main]: physical.Vectorizer (Vectorizer.java:validateExprNodeDesc(1299)) - Failed to vectorize
org.apache.hadoop.hive.ql.metadata.HiveException: Unhandled cast input type: varchar(10)
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getCastToString(VectorizationContext.java:1543)
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUDFBridgeVectorExpression(VectorizationContext.java:1379)
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1177)
    at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:440)
    at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateExprNodeDesc(Vectorizer.java:1293)
    at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateExprNodeDesc(Vectorizer.java:1284)
    at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateSelectOperator(Vectorizer.java:1116)
    at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateMapWorkOperator(Vectorizer.java:906)
{noformat}


 Casting varchar/char type to string cannot be vectorized
 

 Key: HIVE-11506
 URL: https://issues.apache.org/jira/browse/HIVE-11506
 Project: Hive
  Issue Type: Improvement
  Components: Vectorization
Reporter: Navis
Assignee: Navis
Priority: Trivial

 It's not defined in the vectorization context.
 {code}
 explain 
 select cast(cast(cstring1 as varchar(10)) as string) x from alltypesorc order by x;
 {code}
 The mapper is not vectorized because of this exception:
 {noformat}
 2015-08-10 17:02:08,003 INFO  [main]: physical.Vectorizer (Vectorizer.java:validateExprNodeDesc(1299)) - Failed to vectorize
 org.apache.hadoop.hive.ql.metadata.HiveException: Unhandled cast input type: varchar(10)
     at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getCastToString(VectorizationContext.java:1543)
     at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUDFBridgeVectorExpression(VectorizationContext.java:1379)
     at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1177)
     at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:440)
     at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateExprNodeDesc(Vectorizer.java:1293)
     at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateExprNodeDesc(Vectorizer.java:1284)
     at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateSelectOperator(Vectorizer.java:1116)
     at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateMapWorkOperator(Vectorizer.java:906)
 {noformat}





[jira] [Commented] (HIVE-11515) Still some possible race condition in DynamicPartitionPruner

2015-08-24 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14710436#comment-14710436
 ] 

Navis commented on HIVE-11515:
--

[~sseth] If it's already fixed, there seems to be no need to commit this. Thanks!

 Still some possible race condition in DynamicPartitionPruner
 

 Key: HIVE-11515
 URL: https://issues.apache.org/jira/browse/HIVE-11515
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, Tez
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-11515.1.patch.txt


 Even after HIVE-9976, I could still see a race condition in DPP sometimes. It is hard to 
 reproduce, but it seemed related to the fact that prune() is called by a 
 thread pool. With some delay in the queue, events from fast tasks arrive 
 before prune() is called.





[jira] [Updated] (HIVE-7575) GetTables thrift call is very slow

2015-10-24 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7575:

Attachment: HIVE-7575.2.patch.txt

> GetTables thrift call is very slow
> --
>
> Key: HIVE-7575
> URL: https://issues.apache.org/jira/browse/HIVE-7575
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.12.0, 0.13.0
>Reporter: Ashu Pachauri
>Assignee: Navis
> Attachments: HIVE-7575.1.patch.txt, HIVE-7575.2.patch.txt
>
>
> The GetTables thrift call takes a long time when the number of tables is large.
> With around 5000 tables, the call takes around 80 seconds compared to a "Show 
> Tables" query on the same HiveServer2 instance which takes 3-7 seconds.





[jira] [Commented] (HIVE-7575) GetTables thrift call is very slow

2015-10-23 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14971197#comment-14971197
 ] 

Navis commented on HIVE-7575:
-

getTables() is the first call from most BI tools, but it takes a very long time with 
100+ databases. I think it's worth making a dedicated API in the metastore for 
this.

> GetTables thrift call is very slow
> --
>
> Key: HIVE-7575
> URL: https://issues.apache.org/jira/browse/HIVE-7575
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.12.0, 0.13.0
>Reporter: Ashu Pachauri
>Assignee: Navis
> Attachments: HIVE-7575.1.patch.txt
>
>
> The GetTables thrift call takes a long time when the number of tables is large.
> With around 5000 tables, the call takes around 80 seconds compared to a "Show 
> Tables" query on the same HiveServer2 instance which takes 3-7 seconds.





[jira] [Updated] (HIVE-7575) GetTables thrift call is very slow

2015-10-23 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7575:

Attachment: HIVE-7575.1.patch.txt

> GetTables thrift call is very slow
> --
>
> Key: HIVE-7575
> URL: https://issues.apache.org/jira/browse/HIVE-7575
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.12.0, 0.13.0
>Reporter: Ashu Pachauri
> Attachments: HIVE-7575.1.patch.txt
>
>
> The GetTables thrift call takes a long time when the number of tables is large.
> With around 5000 tables, the call takes around 80 seconds compared to a "Show 
> Tables" query on the same HiveServer2 instance which takes 3-7 seconds.





[jira] [Updated] (HIVE-7575) GetTables thrift call is very slow

2015-10-24 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7575:

Attachment: HIVE-7575.3.patch.txt

Fixed test failures.

> GetTables thrift call is very slow
> --
>
> Key: HIVE-7575
> URL: https://issues.apache.org/jira/browse/HIVE-7575
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.12.0, 0.13.0
>Reporter: Ashu Pachauri
>Assignee: Navis
> Attachments: HIVE-7575.1.patch.txt, HIVE-7575.2.patch.txt, 
> HIVE-7575.3.patch.txt
>
>
> The GetTables thrift call takes a long time when the number of tables is large.
> With around 5000 tables, the call takes around 80 seconds compared to a "Show 
> Tables" query on the same HiveServer2 instance which takes 3-7 seconds.





[jira] [Commented] (HIVE-7575) GetTables thrift call is very slow

2015-10-27 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977451#comment-14977451
 ] 

Navis commented on HIVE-7575:
-

[~ychena] Sorry for my unclear description. Retrieving all the Table instances
from the metastore is the root cause of this, not the databases. In my case, I
have 3000+ tables in 100+ databases.

[~szehon] I thought at first that "table_types" was also a pattern like the
other params, but actually it was not. Would it be better as String[] or List?

[~aihuaxu] There are already some test cases (TestJdbcDriver2, for example) but 
I think I can add some more. Thanks.

> GetTables thrift call is very slow
> --
>
> Key: HIVE-7575
> URL: https://issues.apache.org/jira/browse/HIVE-7575
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.12.0, 0.13.0
>Reporter: Ashu Pachauri
>Assignee: Navis
> Attachments: HIVE-7575.1.patch.txt, HIVE-7575.2.patch.txt, 
> HIVE-7575.3.patch.txt
>
>
> The GetTables thrift call takes a long time when the number of tables is large.
> With around 5000 tables, the call takes around 80 seconds compared to a "Show 
> Tables" query on the same HiveServer2 instance which takes 3-7 seconds.





[jira] [Updated] (HIVE-7575) GetTables thrift call is very slow

2015-10-27 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7575:

Attachment: HIVE-7575.4.patch.txt

Addressed comments. Let's see the test results.

> GetTables thrift call is very slow
> --
>
> Key: HIVE-7575
> URL: https://issues.apache.org/jira/browse/HIVE-7575
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.12.0, 0.13.0
>Reporter: Ashu Pachauri
>Assignee: Navis
> Attachments: HIVE-7575.1.patch.txt, HIVE-7575.2.patch.txt, 
> HIVE-7575.3.patch.txt, HIVE-7575.4.patch.txt
>
>
> The GetTables thrift call takes a long time when the number of tables is large.
> With around 5000 tables, the call takes around 80 seconds compared to a "Show 
> Tables" query on the same HiveServer2 instance which takes 3-7 seconds.





[jira] [Commented] (HIVE-11702) GetSchemas thrift call is slow on scale of 1000+ databases

2015-10-28 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977861#comment-14977861
 ] 

Navis commented on HIVE-11702:
--

[~erickt] Added a short path for getSchemas(null) in the most recent patch for
HIVE-7575. I haven't confirmed the effect yet.
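
For readers of this thread, a rough sketch of what a short path for
getSchemas(null) could look like. This is an illustration only and not the code
in the HIVE-7575 patch: the class SchemaLookupSketch, its in-memory database
list, the patternMatches counter and the toRegex helper are all hypothetical.
The idea is simply that a null (or match-all) pattern returns the list directly
and never enters the pattern-matching path.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical sketch; not a Hive class. */
final class SchemaLookupSketch {
    // Stand-in for the database names held by the metastore (assumption).
    private final List<String> databases;
    // Counts how many names went through regex matching, for illustration.
    int patternMatches = 0;

    SchemaLookupSketch(List<String> databases) {
        this.databases = databases;
    }

    /** Returns names matching a JDBC-style pattern; null or "%" means "all". */
    List<String> getSchemas(String jdbcPattern) {
        if (jdbcPattern == null || jdbcPattern.equals("%")) {
            // Short path: nothing to filter, so skip pattern matching entirely.
            return new ArrayList<>(databases);
        }
        Pattern regex = Pattern.compile(toRegex(jdbcPattern));
        List<String> result = new ArrayList<>();
        for (String db : databases) {
            patternMatches++;
            if (regex.matcher(db).matches()) {
                result.add(db);
            }
        }
        return result;
    }

    /** Translates JDBC wildcards: '%' to ".*", '_' to ".", other chars quoted. */
    static String toRegex(String jdbcPattern) {
        StringBuilder sb = new StringBuilder();
        for (char c : jdbcPattern.toCharArray()) {
            if (c == '%') {
                sb.append(".*");
            } else if (c == '_') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        SchemaLookupSketch s =
            new SchemaLookupSketch(Arrays.asList("default", "sales", "sales_eu"));
        System.out.println(s.getSchemas(null) + " matches=" + s.patternMatches);
        System.out.println(s.getSchemas("sales%") + " matches=" + s.patternMatches);
    }
}
```

Whether the real cost sits in the matching itself or in the metastore query
behind it is not confirmed in this thread, so the sketch only shows the shape of
the shortcut.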

> GetSchemas thrift call is slow on scale of 1000+ databases
> --
>
> Key: HIVE-11702
> URL: https://issues.apache.org/jira/browse/HIVE-11702
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.1.1
>Reporter: Jenny Kim
> Attachments: HIVE-11702.1.patch.txt
>
>
> Similar to https://issues.apache.org/jira/browse/HIVE-7575, GetSchemas latency
> also starts to degrade at the order of 1000+ databases, where the call
> returned in about 30 seconds.
> However, SHOW DATABASES on the same Hive instance returns within a few 
> seconds.





[jira] [Updated] (HIVE-7575) GetTables thrift call is very slow

2015-10-28 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7575:

Attachment: HIVE-7575.5.patch.txt

Added a short path for getSchemas(null). See HIVE-11702.

> GetTables thrift call is very slow
> --
>
> Key: HIVE-7575
> URL: https://issues.apache.org/jira/browse/HIVE-7575
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.12.0, 0.13.0
>Reporter: Ashu Pachauri
>Assignee: Navis
> Attachments: HIVE-7575.1.patch.txt, HIVE-7575.2.patch.txt, 
> HIVE-7575.3.patch.txt, HIVE-7575.4.patch.txt, HIVE-7575.5.patch.txt
>
>
> The GetTables thrift call takes a long time when the number of tables is large.
> With around 5000 tables, the call takes around 80 seconds compared to a "Show 
> Tables" query on the same HiveServer2 instance which takes 3-7 seconds.





[jira] [Commented] (HIVE-11756) Avoid redundant key serialization in RS for distinct query

2015-10-28 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977866#comment-14977866
 ] 

Navis commented on HIVE-11756:
--

Cannot reproduce the failure of index_bitmap_auto. The others do not seem related.

> Avoid redundant key serialization in RS for distinct query
> --
>
> Key: HIVE-11756
> URL: https://issues.apache.org/jira/browse/HIVE-11756
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-11756.1.patch.txt, HIVE-11756.2.patch.txt, 
> HIVE-11756.3.patch.txt, HIVE-11756.4.patch.txt
>
>
> Currently Hive serializes twice to learn the length of the distribution key
> for distinct queries. This patch introduces IndexedSerializer to avoid that.
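
The general technique described above (serialize once, remember where the
distribution key ends) can be sketched as below. This is not the
IndexedSerializer from the patch, whose API is not shown in this thread. The
class SingleSerializeSketch, the Serialized holder, the string columns and the
zero-byte separator are all assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch; not a Hive class. */
final class SingleSerializeSketch {

    /** Output of one pass: all bytes plus the length of the leading key part. */
    static final class Serialized {
        final byte[] bytes;
        final int distKeyLength;

        Serialized(byte[] bytes, int distKeyLength) {
            this.bytes = bytes;
            this.distKeyLength = distKeyLength;
        }
    }

    /**
     * Writes every column once. While writing, records the buffer size right
     * after the first distKeyColumns columns, so no second pass is needed to
     * learn the distribution key length.
     */
    static Serialized serialize(String[] columns, int distKeyColumns) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int distKeyLength = 0;
        for (int i = 0; i < columns.length; i++) {
            byte[] b = columns[i].getBytes(StandardCharsets.UTF_8);
            out.write(b, 0, b.length);
            out.write(0); // column separator (assumed format)
            if (i + 1 == distKeyColumns) {
                distKeyLength = out.size();
            }
        }
        return new Serialized(out.toByteArray(), distKeyLength);
    }

    public static void main(String[] args) {
        Serialized s = serialize(new String[] {"k1", "k2", "distinctCol"}, 2);
        System.out.println("total=" + s.bytes.length + " distKey=" + s.distKeyLength);
    }
}
```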





[jira] [Updated] (HIVE-7575) GetTables thrift call is very slow

2015-10-27 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7575:

Attachment: HIVE-7575.4.patch.txt

> GetTables thrift call is very slow
> --
>
> Key: HIVE-7575
> URL: https://issues.apache.org/jira/browse/HIVE-7575
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.12.0, 0.13.0
>Reporter: Ashu Pachauri
>Assignee: Navis
> Attachments: HIVE-7575.1.patch.txt, HIVE-7575.2.patch.txt, 
> HIVE-7575.3.patch.txt, HIVE-7575.4.patch.txt
>
>
> The GetTables thrift call takes a long time when the number of tables is large.
> With around 5000 tables, the call takes around 80 seconds compared to a "Show 
> Tables" query on the same HiveServer2 instance which takes 3-7 seconds.





[jira] [Updated] (HIVE-7575) GetTables thrift call is very slow

2015-10-27 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7575:

Attachment: (was: HIVE-7575.4.patch.txt)

> GetTables thrift call is very slow
> --
>
> Key: HIVE-7575
> URL: https://issues.apache.org/jira/browse/HIVE-7575
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.12.0, 0.13.0
>Reporter: Ashu Pachauri
>Assignee: Navis
> Attachments: HIVE-7575.1.patch.txt, HIVE-7575.2.patch.txt, 
> HIVE-7575.3.patch.txt, HIVE-7575.4.patch.txt
>
>
> The GetTables thrift call takes a long time when the number of tables is large.
> With around 5000 tables, the call takes around 80 seconds compared to a "Show 
> Tables" query on the same HiveServer2 instance which takes 3-7 seconds.





[jira] [Updated] (HIVE-7575) GetTables thrift call is very slow

2015-10-28 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7575:

Attachment: HIVE-7575.4.patch.txt

> GetTables thrift call is very slow
> --
>
> Key: HIVE-7575
> URL: https://issues.apache.org/jira/browse/HIVE-7575
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.12.0, 0.13.0
>Reporter: Ashu Pachauri
>Assignee: Navis
> Attachments: HIVE-7575.1.patch.txt, HIVE-7575.2.patch.txt, 
> HIVE-7575.3.patch.txt, HIVE-7575.4.patch.txt
>
>
> The GetTables thrift call takes a long time when the number of tables is large.
> With around 5000 tables, the call takes around 80 seconds compared to a "Show 
> Tables" query on the same HiveServer2 instance which takes 3-7 seconds.





[jira] [Updated] (HIVE-7575) GetTables thrift call is very slow

2015-10-28 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7575:

Attachment: (was: HIVE-7575.4.patch.txt)

> GetTables thrift call is very slow
> --
>
> Key: HIVE-7575
> URL: https://issues.apache.org/jira/browse/HIVE-7575
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.12.0, 0.13.0
>Reporter: Ashu Pachauri
>Assignee: Navis
> Attachments: HIVE-7575.1.patch.txt, HIVE-7575.2.patch.txt, 
> HIVE-7575.3.patch.txt, HIVE-7575.4.patch.txt
>
>
> The GetTables thrift call takes a long time when the number of tables is large.
> With around 5000 tables, the call takes around 80 seconds compared to a "Show 
> Tables" query on the same HiveServer2 instance which takes 3-7 seconds.





[jira] [Updated] (HIVE-7575) GetTables thrift call is very slow

2015-10-28 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7575:

Attachment: HIVE-7575.6.patch.txt

Rebased to trunk & addressed comment

> GetTables thrift call is very slow
> --
>
> Key: HIVE-7575
> URL: https://issues.apache.org/jira/browse/HIVE-7575
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.12.0, 0.13.0
>Reporter: Ashu Pachauri
>Assignee: Navis
> Attachments: HIVE-7575.1.patch.txt, HIVE-7575.2.patch.txt, 
> HIVE-7575.3.patch.txt, HIVE-7575.4.patch.txt, HIVE-7575.5.patch.txt, 
> HIVE-7575.6.patch.txt
>
>
> The GetTables thrift call takes a long time when the number of tables is large.
> With around 5000 tables, the call takes around 80 seconds compared to a "Show 
> Tables" query on the same HiveServer2 instance which takes 3-7 seconds.





[jira] [Commented] (HIVE-7575) GetTables thrift call is very slow

2015-10-28 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979848#comment-14979848
 ] 

Navis commented on HIVE-7575:
-

[~aihuaxu] I wanted the method signature to be simple, but so be it. (I used
TableMetaData by mistake; I'll change it to TableMeta in the next patch.)

bq. To Yongzhi's question: when we have many databases, the performance of the 
original getTables could be bad since we are making at least one trip for each 
database. Is that right?

That would be one of the root causes. But as HIVE-11702 shows, it takes a long
time even though getSchemas(null) uses just one call to the metastore. The
pattern-matching query seems much more expensive than expected (even with a
simple * pattern).
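
To make the "one trip per database" point concrete, here is a small simulation.
Everything in it is hypothetical: FakeMetastore is an in-memory stand-in that
only counts calls, and getTableMeta is an assumed name for a dedicated
single-call API, not the signature used in the patch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; not Hive code. */
final class RoundTripSketch {

    /** In-memory stand-in for a metastore that counts simulated remote calls. */
    static final class FakeMetastore {
        final Map<String, List<String>> tablesByDb = new LinkedHashMap<>();
        int calls = 0;

        List<String> getAllDatabases() {
            calls++;
            return new ArrayList<>(tablesByDb.keySet());
        }

        List<String> getTables(String db) {
            calls++;
            return new ArrayList<>(tablesByDb.get(db));
        }

        /** Assumed dedicated call: one trip returns "db.table" for every table. */
        List<String> getTableMeta() {
            calls++;
            List<String> out = new ArrayList<>();
            for (Map.Entry<String, List<String>> e : tablesByDb.entrySet()) {
                for (String t : e.getValue()) {
                    out.add(e.getKey() + "." + t);
                }
            }
            return out;
        }
    }

    /** Original style: one call for the database list, then one per database. */
    static List<String> listPerDatabase(FakeMetastore ms) {
        List<String> out = new ArrayList<>();
        for (String db : ms.getAllDatabases()) {
            for (String t : ms.getTables(db)) {
                out.add(db + "." + t);
            }
        }
        return out;
    }

    /** Builds a fake metastore with the given number of databases and tables. */
    static FakeMetastore sample(int databases, int tablesPerDb) {
        FakeMetastore ms = new FakeMetastore();
        for (int d = 0; d < databases; d++) {
            List<String> tables = new ArrayList<>();
            for (int t = 0; t < tablesPerDb; t++) {
                tables.add("t" + t);
            }
            ms.tablesByDb.put("db" + d, tables);
        }
        return ms;
    }

    public static void main(String[] args) {
        FakeMetastore ms = sample(100, 30);
        listPerDatabase(ms);
        System.out.println("per-database calls: " + ms.calls);
        ms.calls = 0;
        ms.getTableMeta();
        System.out.println("dedicated API calls: " + ms.calls);
    }
}
```

The simulation only counts trips. As noted above, the number of trips is
probably not the whole story, since a single call can still be slow.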

> GetTables thrift call is very slow
> --
>
> Key: HIVE-7575
> URL: https://issues.apache.org/jira/browse/HIVE-7575
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.12.0, 0.13.0
>Reporter: Ashu Pachauri
>Assignee: Navis
> Attachments: HIVE-7575.1.patch.txt, HIVE-7575.2.patch.txt, 
> HIVE-7575.3.patch.txt, HIVE-7575.4.patch.txt, HIVE-7575.5.patch.txt, 
> HIVE-7575.6.patch.txt
>
>
> The GetTables thrift call takes a long time when the number of tables is large.
> With around 5000 tables, the call takes around 80 seconds compared to a "Show 
> Tables" query on the same HiveServer2 instance which takes 3-7 seconds.





[jira] [Updated] (HIVE-12388) GetTables cannot get external tables when TABLE type argument is given

2015-11-11 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-12388:
-
Attachment: HIVE-12388.1.patch.txt

> GetTables cannot get external tables when TABLE type argument is given
> --
>
> Key: HIVE-12388
> URL: https://issues.apache.org/jira/browse/HIVE-12388
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Reporter: Navis
>Assignee: Navis
>Priority: Critical
> Attachments: HIVE-12388.1.patch.txt
>
>
> Due to a regression introduced by HIVE-7575, external tables are not shown
> when the "TABLE" type is specified as an argument. I'm working on this. Sorry.
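
A minimal sketch of the expected behaviour, assuming the client-facing "TABLE"
type is meant to cover both managed and external tables. The class
TableTypeMappingSketch and its methods are hypothetical and are not taken from
the patch. The type names MANAGED_TABLE, EXTERNAL_TABLE and VIRTUAL_VIEW are
Hive's own table type names.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch; not the fix attached to this issue. */
final class TableTypeMappingSketch {

    /** Expands a client-facing type into the Hive table types it should cover. */
    static List<String> toHiveTypes(String clientType) {
        switch (clientType) {
            case "TABLE":
                // Assumption: "TABLE" must include external tables as well.
                return Arrays.asList("MANAGED_TABLE", "EXTERNAL_TABLE");
            case "VIEW":
                return Collections.singletonList("VIRTUAL_VIEW");
            default:
                return Collections.singletonList(clientType);
        }
    }

    /** True if a table of hiveType should be returned for the requested types. */
    static boolean matches(String hiveType, String[] clientTypes) {
        if (clientTypes == null) {
            return true; // no type filter: every table type qualifies
        }
        for (String t : clientTypes) {
            if (toHiveTypes(t).contains(hiveType)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] requested = {"TABLE"};
        System.out.println(matches("EXTERNAL_TABLE", requested));
        System.out.println(matches("VIRTUAL_VIEW", requested));
    }
}
```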





[jira] [Updated] (HIVE-12373) Interner should return identical map or list

2015-11-09 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-12373:
-
Attachment: HIVE-12373.1.patch.txt

> Interner should return identical map or list
> 
>
> Key: HIVE-12373
> URL: https://issues.apache.org/jira/browse/HIVE-12373
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-12373.1.patch.txt
>
>
> Currently, HiveStringUtils.intern(map/list) returns a new instance of the map
> or list. That breaks code written in the style below (this is Spark code in
> HiveMetastoreCatalog):
> {code}
> val serdeParameters = new java.util.HashMap[String, String]()
> serdeInfo.setParameters(serdeParameters)
> // these properties will be gone
> table.serdeProperties.foreach { case (k, v) => serdeParameters.put(k, v) }
> p.storage.serdeProperties.foreach { case (k, v) => serdeParameters.put(k, v) }
> {code}
> Luckily for Spark, the interner was, by mistake, not applied in the released
> versions of Hive (1.2.0, 1.2.1). But it would cause problems someday.
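
The difference between the two behaviours can be shown with plain Java
collections. This is a sketch under assumptions, not the HiveStringUtils code:
InternInPlaceSketch and SerdeInfoLike are hypothetical names, and the in-place
variant is just one possible way to keep the caller's map instance.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; not HiveStringUtils. */
final class InternInPlaceSketch {

    /** Copying style: returns a NEW map, so the caller's reference is detached. */
    static Map<String, String> internCopy(Map<String, String> map) {
        Map<String, String> copy = new HashMap<>();
        for (Map.Entry<String, String> e : map.entrySet()) {
            copy.put(e.getKey().intern(), e.getValue().intern());
        }
        return copy;
    }

    /** In-place style: interns the entries but returns the SAME map instance. */
    static Map<String, String> internInPlace(Map<String, String> map) {
        Map<String, String> interned = internCopy(map);
        map.clear();
        map.putAll(interned);
        return map;
    }

    /** Mimics a setter that interns its argument before storing it. */
    static final class SerdeInfoLike {
        private final boolean inPlace;
        private Map<String, String> parameters;

        SerdeInfoLike(boolean inPlace) {
            this.inPlace = inPlace;
        }

        void setParameters(Map<String, String> p) {
            parameters = inPlace ? internInPlace(p) : internCopy(p);
        }

        Map<String, String> getParameters() {
            return parameters;
        }
    }

    public static void main(String[] args) {
        for (boolean inPlace : new boolean[] {false, true}) {
            Map<String, String> params = new HashMap<>();
            SerdeInfoLike info = new SerdeInfoLike(inPlace);
            info.setParameters(params);
            params.put("path", "/tmp/t"); // added after the setter, as in the Spark code
            System.out.println("inPlace=" + inPlace + " -> " + info.getParameters());
        }
    }
}
```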





[jira] [Updated] (HIVE-7575) GetTables thrift call is very slow

2015-11-04 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7575:

Attachment: HIVE-7575.7.patch.txt

Rebased to trunk

> GetTables thrift call is very slow
> --
>
> Key: HIVE-7575
> URL: https://issues.apache.org/jira/browse/HIVE-7575
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.12.0, 0.13.0
>Reporter: Ashu Pachauri
>Assignee: Navis
> Attachments: HIVE-7575.1.patch.txt, HIVE-7575.2.patch.txt, 
> HIVE-7575.3.patch.txt, HIVE-7575.4.patch.txt, HIVE-7575.5.patch.txt, 
> HIVE-7575.6.patch.txt, HIVE-7575.7.patch.txt
>
>
> The GetTables thrift call takes a long time when the number of tables is large.
> With around 5000 tables, the call takes around 80 seconds compared to a "Show 
> Tables" query on the same HiveServer2 instance which takes 3-7 seconds.





[jira] [Updated] (HIVE-12182) ALTER TABLE PARTITION COLUMN does not set partition column comments

2015-10-15 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-12182:
-
Attachment: HIVE-12182.1.patch.txt

> ALTER TABLE PARTITION COLUMN does not set partition column comments
> ---
>
> Key: HIVE-12182
> URL: https://issues.apache.org/jira/browse/HIVE-12182
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.2.1
>Reporter: Lenni Kuff
>Assignee: Navis
> Attachments: HIVE-12182.1.patch.txt
>
>
> ALTER TABLE PARTITION COLUMN does not set partition column comments. The 
> syntax is accepted, but the COMMENT for the column is ignored.
> {code}
> 0: jdbc:hive2://localhost:1/default> create table part_test(i int comment 
> 'HELLO') partitioned by (j int comment 'WORLD');
> No rows affected (0.104 seconds)
> 0: jdbc:hive2://localhost:1/default> describe part_test;
> +--------------------------+------------+----------+--+
> | col_name                 | data_type  | comment  |
> +--------------------------+------------+----------+--+
> | i                        | int        | HELLO    |
> | j                        | int        | WORLD    |
> |                          | NULL       | NULL     |
> | # Partition Information  | NULL       | NULL     |
> | # col_name               | data_type  | comment  |
> |                          | NULL       | NULL     |
> | j                        | int        | WORLD    |
> +--------------------------+------------+----------+--+
> 7 rows selected (0.109 seconds)
> 0: jdbc:hive2://localhost:1/default> alter table part_test partition 
> column (j int comment 'WIDE');
> No rows affected (0.121 seconds)
> 0: jdbc:hive2://localhost:1/default> describe part_test;
> +--------------------------+------------+----------+--+
> | col_name                 | data_type  | comment  |
> +--------------------------+------------+----------+--+
> | i                        | int        | HELLO    |
> | j                        | int        |          |
> |                          | NULL       | NULL     |
> | # Partition Information  | NULL       | NULL     |
> | # col_name               | data_type  | comment  |
> |                          | NULL       | NULL     |
> | j                        | int        |          |
> +--------------------------+------------+----------+--+
> 7 rows selected (0.108 seconds)
> {code}





[jira] [Assigned] (HIVE-12182) ALTER TABLE PARTITION COLUMN does not set partition column comments

2015-10-15 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis reassigned HIVE-12182:


Assignee: Navis  (was: Naveen Gangam)

> ALTER TABLE PARTITION COLUMN does not set partition column comments
> ---
>
> Key: HIVE-12182
> URL: https://issues.apache.org/jira/browse/HIVE-12182
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.2.1
>Reporter: Lenni Kuff
>Assignee: Navis
> Attachments: HIVE-12182.1.patch.txt
>
>
> ALTER TABLE PARTITION COLUMN does not set partition column comments. The 
> syntax is accepted, but the COMMENT for the column is ignored.
> {code}
> 0: jdbc:hive2://localhost:1/default> create table part_test(i int comment 
> 'HELLO') partitioned by (j int comment 'WORLD');
> No rows affected (0.104 seconds)
> 0: jdbc:hive2://localhost:1/default> describe part_test;
> +--------------------------+------------+----------+--+
> | col_name                 | data_type  | comment  |
> +--------------------------+------------+----------+--+
> | i                        | int        | HELLO    |
> | j                        | int        | WORLD    |
> |                          | NULL       | NULL     |
> | # Partition Information  | NULL       | NULL     |
> | # col_name               | data_type  | comment  |
> |                          | NULL       | NULL     |
> | j                        | int        | WORLD    |
> +--------------------------+------------+----------+--+
> 7 rows selected (0.109 seconds)
> 0: jdbc:hive2://localhost:1/default> alter table part_test partition 
> column (j int comment 'WIDE');
> No rows affected (0.121 seconds)
> 0: jdbc:hive2://localhost:1/default> describe part_test;
> +--------------------------+------------+----------+--+
> | col_name                 | data_type  | comment  |
> +--------------------------+------------+----------+--+
> | i                        | int        | HELLO    |
> | j                        | int        |          |
> |                          | NULL       | NULL     |
> | # Partition Information  | NULL       | NULL     |
> | # col_name               | data_type  | comment  |
> |                          | NULL       | NULL     |
> | j                        | int        |          |
> +--------------------------+------------+----------+--+
> 7 rows selected (0.108 seconds)
> {code}





[jira] [Updated] (HIVE-12182) ALTER TABLE PARTITION COLUMN does not set partition column comments

2015-10-15 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-12182:
-
Assignee: Naveen Gangam  (was: Navis)

> ALTER TABLE PARTITION COLUMN does not set partition column comments
> ---
>
> Key: HIVE-12182
> URL: https://issues.apache.org/jira/browse/HIVE-12182
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.2.1
>Reporter: Lenni Kuff
>Assignee: Naveen Gangam
>
> ALTER TABLE PARTITION COLUMN does not set partition column comments. The 
> syntax is accepted, but the COMMENT for the column is ignored.
> {code}
> 0: jdbc:hive2://localhost:1/default> create table part_test(i int comment 
> 'HELLO') partitioned by (j int comment 'WORLD');
> No rows affected (0.104 seconds)
> 0: jdbc:hive2://localhost:1/default> describe part_test;
> +--------------------------+------------+----------+--+
> | col_name                 | data_type  | comment  |
> +--------------------------+------------+----------+--+
> | i                        | int        | HELLO    |
> | j                        | int        | WORLD    |
> |                          | NULL       | NULL     |
> | # Partition Information  | NULL       | NULL     |
> | # col_name               | data_type  | comment  |
> |                          | NULL       | NULL     |
> | j                        | int        | WORLD    |
> +--------------------------+------------+----------+--+
> 7 rows selected (0.109 seconds)
> 0: jdbc:hive2://localhost:1/default> alter table part_test partition 
> column (j int comment 'WIDE');
> No rows affected (0.121 seconds)
> 0: jdbc:hive2://localhost:1/default> describe part_test;
> +--------------------------+------------+----------+--+
> | col_name                 | data_type  | comment  |
> +--------------------------+------------+----------+--+
> | i                        | int        | HELLO    |
> | j                        | int        |          |
> |                          | NULL       | NULL     |
> | # Partition Information  | NULL       | NULL     |
> | # col_name               | data_type  | comment  |
> |                          | NULL       | NULL     |
> | j                        | int        |          |
> +--------------------------+------------+----------+--+
> 7 rows selected (0.108 seconds)
> {code}





[jira] [Updated] (HIVE-12182) ALTER TABLE PARTITION COLUMN does not set partition column comments

2015-10-15 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-12182:
-
Attachment: (was: HIVE-12182.1.patch.txt)

> ALTER TABLE PARTITION COLUMN does not set partition column comments
> ---
>
> Key: HIVE-12182
> URL: https://issues.apache.org/jira/browse/HIVE-12182
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.2.1
>Reporter: Lenni Kuff
>Assignee: Naveen Gangam
>
> ALTER TABLE PARTITION COLUMN does not set partition column comments. The 
> syntax is accepted, but the COMMENT for the column is ignored.
> {code}
> 0: jdbc:hive2://localhost:1/default> create table part_test(i int comment 
> 'HELLO') partitioned by (j int comment 'WORLD');
> No rows affected (0.104 seconds)
> 0: jdbc:hive2://localhost:1/default> describe part_test;
> +--------------------------+------------+----------+--+
> | col_name                 | data_type  | comment  |
> +--------------------------+------------+----------+--+
> | i                        | int        | HELLO    |
> | j                        | int        | WORLD    |
> |                          | NULL       | NULL     |
> | # Partition Information  | NULL       | NULL     |
> | # col_name               | data_type  | comment  |
> |                          | NULL       | NULL     |
> | j                        | int        | WORLD    |
> +--------------------------+------------+----------+--+
> 7 rows selected (0.109 seconds)
> 0: jdbc:hive2://localhost:1/default> alter table part_test partition 
> column (j int comment 'WIDE');
> No rows affected (0.121 seconds)
> 0: jdbc:hive2://localhost:1/default> describe part_test;
> +--------------------------+------------+----------+--+
> | col_name                 | data_type  | comment  |
> +--------------------------+------------+----------+--+
> | i                        | int        | HELLO    |
> | j                        | int        |          |
> |                          | NULL       | NULL     |
> | # Partition Information  | NULL       | NULL     |
> | # col_name               | data_type  | comment  |
> |                          | NULL       | NULL     |
> | j                        | int        |          |
> +--------------------------+------------+----------+--+
> 7 rows selected (0.108 seconds)
> {code}





[jira] [Updated] (HIVE-11768) java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances

2015-10-14 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11768:
-
Attachment: HIVE-11768.6.patch.txt

> java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances
> 
>
> Key: HIVE-11768
> URL: https://issues.apache.org/jira/browse/HIVE-11768
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.2.1
>Reporter: Nemon Lou
>Assignee: Navis
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HIVE-11768.1.patch.txt, HIVE-11768.2.patch.txt, 
> HIVE-11768.3.patch.txt, HIVE-11768.4.patch.txt, HIVE-11768.5.patch.txt, 
> HIVE-11768.6.patch.txt
>
>
>   More than 490,000 paths were added to java.io.DeleteOnExitHook on one of our
> long-running HiveServer2 instances, taking up more than 100MB of heap.
>   Most of the paths contain a ".pipeout" suffix.
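
For context: File.deleteOnExit() registers the path in a JVM-wide set that is
only processed at shutdown, so in a long-running server the set keeps growing
even after the files themselves are gone. One common alternative is sketched
below, assuming temp files can be tied to a session lifetime. SessionTempFiles
is a hypothetical class and not the approach confirmed in the attached patches.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical per-session tracker; not a Hive class. */
final class SessionTempFiles implements AutoCloseable {
    private final List<File> files = new ArrayList<>();

    /** Creates a temp file and tracks it here instead of calling deleteOnExit(). */
    File create(String suffix) {
        try {
            File f = File.createTempFile("session", suffix);
            files.add(f);
            return f;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Number of files currently tracked by this session. */
    int tracked() {
        return files.size();
    }

    /** Deletes every tracked file and forgets it, so nothing accumulates. */
    @Override
    public void close() {
        for (File f : files) {
            f.delete();
        }
        files.clear();
    }

    public static void main(String[] args) {
        try (SessionTempFiles session = new SessionTempFiles()) {
            File f = session.create(".pipeout");
            System.out.println("created " + f + ", tracked=" + session.tracked());
        }
    }
}
```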





[jira] [Updated] (HIVE-11518) Provide interface to adjust required resource for tez tasks

2015-10-12 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11518:
-
Attachment: HIVE-11518.2.patch.txt

Rebased to trunk

> Provide interface to adjust required resource for tez tasks
> ---
>
> Key: HIVE-11518
> URL: https://issues.apache.org/jira/browse/HIVE-11518
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-11518.1.patch.txt, HIVE-11518.2.patch.txt
>
>
> Resource requirements vary from task to task, but currently they are fixed to
> one value (via hive.tez.container.size). It would be good to tailor resource
> requirements to the expected work.
> The suggested interface is quite simple.
> {code}
> public interface ResourceCalculator {
>   Resource adjust(Resource resource, MapWork mapWork);
>   Resource adjust(Resource resource, ReduceWork reduceWork);
> }
> {code}
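
To illustrate how such an interface might be used, here is a self-contained
sketch. Resource, MapWork and ReduceWork below are simplified stand-ins declared
locally, not the real YARN/Hive classes, and InputSizeCalculator with its
thresholds is a made-up policy for illustration only.

```java
/** Hypothetical sketch; the nested types are stand-ins, not real Hive classes. */
final class ResourceCalculatorSketch {

    static final class Resource {
        final int memoryMb;
        final int vcores;

        Resource(int memoryMb, int vcores) {
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }
    }

    static final class MapWork {
        final long inputBytes;

        MapWork(long inputBytes) {
            this.inputBytes = inputBytes;
        }
    }

    static final class ReduceWork {
        final int numReducers;

        ReduceWork(int numReducers) {
            this.numReducers = numReducers;
        }
    }

    /** Same shape as the interface suggested in the issue description. */
    interface ResourceCalculator {
        Resource adjust(Resource resource, MapWork mapWork);

        Resource adjust(Resource resource, ReduceWork reduceWork);
    }

    /** Example policy: halve memory for small map inputs, with a lower bound. */
    static final class InputSizeCalculator implements ResourceCalculator {
        static final long SMALL_INPUT_BYTES = 64L * 1024 * 1024;
        static final int MIN_MEMORY_MB = 512;

        @Override
        public Resource adjust(Resource resource, MapWork mapWork) {
            if (mapWork.inputBytes < SMALL_INPUT_BYTES) {
                int memory = Math.max(MIN_MEMORY_MB, resource.memoryMb / 2);
                return new Resource(memory, resource.vcores);
            }
            return resource;
        }

        @Override
        public Resource adjust(Resource resource, ReduceWork reduceWork) {
            return resource; // reducers keep the configured default in this sketch
        }
    }

    public static void main(String[] args) {
        ResourceCalculator calc = new InputSizeCalculator();
        Resource base = new Resource(4096, 1);
        System.out.println(calc.adjust(base, new MapWork(1024L)).memoryMb);
    }
}
```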





[jira] [Updated] (HIVE-11768) java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances

2015-10-12 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11768:
-
Attachment: HIVE-11768.4.patch.txt

> java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances
> 
>
> Key: HIVE-11768
> URL: https://issues.apache.org/jira/browse/HIVE-11768
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.2.1
>Reporter: Nemon Lou
>Assignee: Navis
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HIVE-11768.1.patch.txt, HIVE-11768.2.patch.txt, 
> HIVE-11768.3.patch.txt, HIVE-11768.4.patch.txt
>
>
>   More than 490,000 paths were added to java.io.DeleteOnExitHook on one of our
> long-running HiveServer2 instances, taking up more than 100MB of heap.
>   Most of the paths contain a ".pipeout" suffix.





[jira] [Commented] (HIVE-11679) SemanticAnalysis of "a=1" can result in a new Configuration() object

2015-10-12 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14954210#comment-14954210
 ] 

Navis commented on HIVE-11679:
--

[~ashutoshc] Created an RB entry, but the testNegativeCliDriver_compare_*_bigint
failures seem related. I'll look into this.

> SemanticAnalysis of "a=1" can result in a new Configuration() object
> 
>
> Key: HIVE-11679
> URL: https://issues.apache.org/jira/browse/HIVE-11679
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Gopal V
>Assignee: Navis
> Attachments: HIVE-11679.1.patch.txt, HIVE-11679.2.patch.txt
>
>
> {code}
> public static ExprNodeGenericFuncDesc newInstance(GenericUDF genericUDF,
>   String funcText,
>   List<ExprNodeDesc> children) throws UDFArgumentException {
> ...
>  if (genericUDF instanceof GenericUDFBaseCompare && children.size() == 2) {
>   TypeInfo oiTypeInfo0 = children.get(0).getTypeInfo();
>   TypeInfo oiTypeInfo1 = children.get(1).getTypeInfo();
>   SessionState ss = SessionState.get();
>   Configuration conf = (ss != null) ? ss.getConf() : new Configuration();
> {code}
> This is either a SessionState.get(), which is a thread-local lookup, or worse,
> a new Configuration(), which means XML parsing of multiple files for each
> equality expression in the query.
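
One way to avoid the repeated construction is to build the fallback once and
share it. The sketch below shows that idea with a local stand-in,
FakeConfiguration, that only counts how often it is constructed. It does not use
Hadoop's Configuration, and it is not necessarily what the attached patches do.

```java
/** Hypothetical sketch; FakeConfiguration stands in for an expensive object. */
final class SharedFallbackConfSketch {

    static final class FakeConfiguration {
        static int constructed = 0; // how many instances were ever built

        FakeConfiguration() {
            constructed++; // the real cost would be parsing several XML files
        }
    }

    /** Lazy holder: the fallback instance is built once, on first use. */
    private static final class Fallback {
        static final FakeConfiguration INSTANCE = new FakeConfiguration();
    }

    /** Prefers the session's configuration; otherwise reuses one shared fallback. */
    static FakeConfiguration confFor(FakeConfiguration sessionConf) {
        return sessionConf != null ? sessionConf : Fallback.INSTANCE;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            confFor(null);
        }
        System.out.println("constructed=" + FakeConfiguration.constructed);
    }
}
```

A shared fallback is only safe if callers treat it as read-only, which is an
assumption of this sketch.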





[jira] [Updated] (HIVE-10890) Provide implementable engine selector

2015-10-12 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-10890:
-
Attachment: HIVE-10890.3.patch.txt

> Provide implementable engine selector
> -
>
> Key: HIVE-10890
> URL: https://issues.apache.org/jira/browse/HIVE-10890
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-10890.1.patch.txt, HIVE-10890.2.patch.txt, 
> HIVE-10890.3.patch.txt
>
>
> Hive now supports three kinds of engines. It would be good to have an
> automatic engine selector so the execution engine need not be set explicitly.





[jira] [Updated] (HIVE-11679) SemanticAnalysis of "a=1" can result in a new Configuration() object

2015-10-12 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11679:
-
Attachment: HIVE-11679.2.patch.txt

> SemanticAnalysis of "a=1" can result in a new Configuration() object
> 
>
> Key: HIVE-11679
> URL: https://issues.apache.org/jira/browse/HIVE-11679
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Gopal V
>Assignee: Navis
> Attachments: HIVE-11679.1.patch.txt, HIVE-11679.2.patch.txt
>
>
> {code}
> public static ExprNodeGenericFuncDesc newInstance(GenericUDF genericUDF,
>   String funcText,
>   List<ExprNodeDesc> children) throws UDFArgumentException {
> ...
>  if (genericUDF instanceof GenericUDFBaseCompare && children.size() == 2) {
>   TypeInfo oiTypeInfo0 = children.get(0).getTypeInfo();
>   TypeInfo oiTypeInfo1 = children.get(1).getTypeInfo();
>   SessionState ss = SessionState.get();
>   Configuration conf = (ss != null) ? ss.getConf() : new Configuration();
> {code}
> This is either a SessionState.get(), which is a thread-local lookup, or worse,
> a new Configuration(), which means XML parsing of multiple files for each
> equality expression in the query.





[jira] [Updated] (HIVE-11768) java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances

2015-10-13 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11768:
-
Attachment: HIVE-11768.5.patch.txt

Addressed comments & fixed test failure

> java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances
> 
>
> Key: HIVE-11768
> URL: https://issues.apache.org/jira/browse/HIVE-11768
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.2.1
>Reporter: Nemon Lou
>Assignee: Navis
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HIVE-11768.1.patch.txt, HIVE-11768.2.patch.txt, 
> HIVE-11768.3.patch.txt, HIVE-11768.4.patch.txt, HIVE-11768.5.patch.txt
>
>
>   More than 490,000 paths were added to java.io.DeleteOnExitHook on one of our 
> long-running HiveServer2 instances, taking up more than 100MB of heap.
>   Most of the paths have the suffix ".pipeout".
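
The leak pattern above comes from the JDK itself: File.deleteOnExit() registers the path for the lifetime of the JVM and offers no way to unregister it. One possible alternative, sketched here with a hypothetical SessionTempFiles class (not Hive's actual code or the attached patch), is to track temporary files per session and delete them when the session closes.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical per-session tracker for temporary files.
final class SessionTempFiles implements AutoCloseable {
    private final Set<File> files = Collections.synchronizedSet(new HashSet<>());

    // Creates a temp file and remembers it, without calling deleteOnExit().
    File create(String prefix, String suffix) {
        try {
            File f = File.createTempFile(prefix, suffix);
            files.add(f);
            return f;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    int tracked() {
        return files.size();
    }

    // Deletes every tracked file and forgets it, so nothing accumulates
    // across sessions in a long-running server.
    @Override
    public void close() {
        synchronized (files) { // iteration over a synchronized set needs the lock
            for (File f : files) {
                f.delete();
            }
            files.clear();
        }
    }
}
```

The trade-off is that files are not cleaned up if the process dies before close() runs, so a real implementation would likely still need a sweep at startup.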





[jira] [Commented] (HIVE-11768) java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances

2015-10-13 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14956063#comment-14956063
 ] 

Navis commented on HIVE-11768:
--

[~thejas] I've left the "FileSystem.deleteOnExit" problem for another issue because 
"FileSystem.close" is never called. Could it be safely removed from the code? 
I'm not sure about that.

> java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances
> 
>
> Key: HIVE-11768
> URL: https://issues.apache.org/jira/browse/HIVE-11768
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.2.1
>Reporter: Nemon Lou
>Assignee: Navis
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HIVE-11768.1.patch.txt, HIVE-11768.2.patch.txt, 
> HIVE-11768.3.patch.txt, HIVE-11768.4.patch.txt, HIVE-11768.5.patch.txt
>
>
>   More than 490,000 paths were added to java.io.DeleteOnExitHook on one of our 
> long-running HiveServer2 instances, taking up more than 100MB of heap.
>   Most of the paths have the suffix ".pipeout".





[jira] [Commented] (HIVE-11679) SemanticAnalysis of "a=1" can result in a new Configuration() object

2015-10-13 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14956079#comment-14956079
 ] 

Navis commented on HIVE-11679:
--

Strange. I cannot reproduce the failure of udaf_histogram_numeric.

> SemanticAnalysis of "a=1" can result in a new Configuration() object
> 
>
> Key: HIVE-11679
> URL: https://issues.apache.org/jira/browse/HIVE-11679
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Gopal V
>Assignee: Navis
> Attachments: HIVE-11679.1.patch.txt, HIVE-11679.2.patch.txt, 
> HIVE-11679.3.patch.txt
>
>
> {code}
> public static ExprNodeGenericFuncDesc newInstance(GenericUDF genericUDF,
>   String funcText,
>   List<ExprNodeDesc> children) throws UDFArgumentException {
> ...
>  if (genericUDF instanceof GenericUDFBaseCompare && children.size() == 2) {
>   TypeInfo oiTypeInfo0 = children.get(0).getTypeInfo();
>   TypeInfo oiTypeInfo1 = children.get(1).getTypeInfo();
>   SessionState ss = SessionState.get();
>   Configuration conf = (ss != null) ? ss.getConf() : new Configuration();
> {code}
> This is either a SessionState.get(), which is a thread-local lookup, or worse, a 
> new Configuration(), which means XML parsing of multiple files for each 
> equality expression in the query.





[jira] [Updated] (HIVE-11679) SemanticAnalysis of "a=1" can result in a new Configuration() object

2015-10-12 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11679:
-
Attachment: HIVE-11679.3.patch.txt

Just updated negative test results.

> SemanticAnalysis of "a=1" can result in a new Configuration() object
> 
>
> Key: HIVE-11679
> URL: https://issues.apache.org/jira/browse/HIVE-11679
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Gopal V
>Assignee: Navis
> Attachments: HIVE-11679.1.patch.txt, HIVE-11679.2.patch.txt, 
> HIVE-11679.3.patch.txt
>
>
> {code}
> public static ExprNodeGenericFuncDesc newInstance(GenericUDF genericUDF,
>   String funcText,
>   List<ExprNodeDesc> children) throws UDFArgumentException {
> ...
>  if (genericUDF instanceof GenericUDFBaseCompare && children.size() == 2) {
>   TypeInfo oiTypeInfo0 = children.get(0).getTypeInfo();
>   TypeInfo oiTypeInfo1 = children.get(1).getTypeInfo();
>   SessionState ss = SessionState.get();
>   Configuration conf = (ss != null) ? ss.getConf() : new Configuration();
> {code}
> This is either a SessionState.get(), which is a thread-local lookup, or worse, a 
> new Configuration(), which means XML parsing of multiple files for each 
> equality expression in the query.





[jira] [Commented] (HIVE-11662) DP cannot be applied to external table which contains part-spec like directory

2015-08-31 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14723042#comment-14723042
 ] 

Navis commented on HIVE-11662:
--

The failures do not seem related to this. I'll add some test cases.

> DP cannot be applied to external table which contains part-spec like directory
> --
>
> Key: HIVE-11662
> URL: https://issues.apache.org/jira/browse/HIVE-11662
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-11662.1.patch.txt
>
>
> Some users want to use part-spec-like directory names in their partitioned 
> table locations, for example:
> {noformat}
> /something/warehouse/some_key=some_value
> {noformat}
> Dynamic partitioning (DP) derives additional partitions from the full path and 
> fails with an exception like:
> {noformat}
> Failed with exception Partition spec {some_key=some_value, 
> part_key=part_value} contains non-partition columns
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.MoveTask
> {noformat}
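
The failure above can be illustrated with a small sketch. It assumes the fix is to read key=value directory names only from the part of the path below the table location; PartitionPaths and specBelow are hypothetical names, and escaping of special characters in partition values is ignored.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: derives a partition spec from directory names that
// appear below the table location, ignoring key=value names in the location.
final class PartitionPaths {
    static Map<String, String> specBelow(String tableLocation, String fullPath) {
        if (!fullPath.startsWith(tableLocation)) {
            throw new IllegalArgumentException(fullPath + " is not under " + tableLocation);
        }
        Map<String, String> spec = new LinkedHashMap<>();
        String relative = fullPath.substring(tableLocation.length());
        for (String dir : relative.split("/")) {
            int eq = dir.indexOf('=');
            if (eq > 0) { // skip empty segments and names without a key
                spec.put(dir.substring(0, eq), dir.substring(eq + 1));
            }
        }
        return spec;
    }
}
```

For a table located at /something/warehouse/some_key=some_value, a partition directory part_key=part_value below it would yield only part_key, so some_key would no longer be reported as a non-partition column.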





[jira] [Updated] (HIVE-11662) DP cannot be applied to external table which contains part-spec like directory

2015-08-26 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11662:
-
Attachment: HIVE-11662.1.patch.txt

For preliminary test

 DP cannot be applied to external table which contains part-spec like directory
 --

 Key: HIVE-11662
 URL: https://issues.apache.org/jira/browse/HIVE-11662
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-11662.1.patch.txt


 Some users want to use part-spec-like directory names in their partitioned 
 table locations, for example:
 {noformat}
 /something/warehouse/some_key=some_value
 {noformat}
 Dynamic partitioning (DP) derives additional partitions from the full path and 
 fails with an exception like:
 {noformat}
 Failed with exception Partition spec {some_key=some_value, 
 part_key=part_value} contains non-partition columns
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.MoveTask
 {noformat}





[jira] [Updated] (HIVE-11662) DP cannot be applied to external table which contains part-spec like directory

2015-08-31 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11662:
-
Attachment: HIVE-11662.2.patch.txt

> DP cannot be applied to external table which contains part-spec like directory
> --
>
> Key: HIVE-11662
> URL: https://issues.apache.org/jira/browse/HIVE-11662
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-11662.1.patch.txt, HIVE-11662.2.patch.txt
>
>
> Some users want to use part-spec-like directory names in their partitioned 
> table locations, for example:
> {noformat}
> /something/warehouse/some_key=some_value
> {noformat}
> Dynamic partitioning (DP) derives additional partitions from the full path and 
> fails with an exception like:
> {noformat}
> Failed with exception Partition spec {some_key=some_value, 
> part_key=part_value} contains non-partition columns
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.MoveTask
> {noformat}





[jira] [Updated] (HIVE-11706) Implement "show create database"

2015-08-31 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11706:
-
Attachment: HIVE-11706.1.patch.txt

> Implement "show create database"
> 
>
> Key: HIVE-11706
> URL: https://issues.apache.org/jira/browse/HIVE-11706
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-11706.1.patch.txt
>
>
> HIVE-967 introduced "show create table". How about "show create database"?
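
As a rough idea of what such a command might print, the sketch below renders a CREATE DATABASE statement from a database's name, comment, location and properties. The class and method names are hypothetical, and the output layout is an assumption rather than the format produced by the attached patch.

```java
import java.util.Map;
import java.util.StringJoiner;
import java.util.TreeMap;

// Hypothetical renderer for a "show create database"-style statement.
final class CreateDatabaseDdl {
    static String render(String name, String comment, String location,
                         Map<String, String> props) {
        StringBuilder sb = new StringBuilder("CREATE DATABASE `").append(name).append("`");
        if (comment != null) {
            sb.append("\nCOMMENT '").append(escape(comment)).append("'");
        }
        if (location != null) {
            sb.append("\nLOCATION '").append(escape(location)).append("'");
        }
        if (props != null && !props.isEmpty()) {
            StringJoiner joiner = new StringJoiner(", ", "\nWITH DBPROPERTIES (", ")");
            // Sort keys so the output is stable across runs.
            for (Map.Entry<String, String> e : new TreeMap<>(props).entrySet()) {
                joiner.add("'" + escape(e.getKey()) + "'='" + escape(e.getValue()) + "'");
            }
            sb.append(joiner);
        }
        return sb.toString();
    }

    // Escapes single quotes so that values stay inside their string literal.
    private static String escape(String s) {
        return s.replace("'", "\\'");
    }
}
```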





[jira] [Commented] (HIVE-11657) HIVE-2573 introduces some issues during metastore init (and CLI init)

2015-09-01 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14725024#comment-14725024
 ] 

Navis commented on HIVE-11657:
--

[~sershe] Making that static call was one of my worst decisions in Hive. 
I was so tired of rebasing the patch for years, and new things like permanent 
functions bothered me so much, that I couldn't think of a better idea at 
the time. 

> HIVE-2573 introduces some issues during metastore init (and CLI init)
> -
>
> Key: HIVE-11657
> URL: https://issues.apache.org/jira/browse/HIVE-11657
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Critical
> Attachments: HIVE-11657.patch
>
>
> HIVE-2573 introduced static reload functions call.
> It has a few problems:
> 1) When metastore client is initialized using an externally supplied config 
> (i.e. Hive.get(HiveConf)), it still gets called during static init using the 
> main service config. In my case, even though I have uris in the supplied 
> config to connect to remote MS (which eventually happens), the static call 
> creates objectstore, which is undesirable.
> 2) It breaks compat - old metastores do not support this call so new clients 
> will fail, and there's no workaround like not using a new feature because the 
> static call is always made





[jira] [Updated] (HIVE-11706) Implement "show create database"

2015-09-07 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11706:
-
Attachment: HIVE-11706.2.patch.txt

Fixed test failures

> Implement "show create database"
> 
>
> Key: HIVE-11706
> URL: https://issues.apache.org/jira/browse/HIVE-11706
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-11706.1.patch.txt, HIVE-11706.2.patch.txt
>
>
> HIVE-967 introduced "show create table". How about "show create database"?





[jira] [Updated] (HIVE-11662) Dynamic partitioning cannot be applied to external table which contains part-spec like directory name

2015-09-07 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11662:
-
Summary: Dynamic partitioning cannot be applied to external table which 
contains part-spec like directory name  (was: DP cannot be applied to external 
table which contains part-spec like directory)

> Dynamic partitioning cannot be applied to external table which contains 
> part-spec like directory name
> -
>
> Key: HIVE-11662
> URL: https://issues.apache.org/jira/browse/HIVE-11662
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-11662.1.patch.txt, HIVE-11662.2.patch.txt
>
>
> Some users want to use part-spec-like directory names in their partitioned 
> table locations, for example:
> {noformat}
> /something/warehouse/some_key=some_value
> {noformat}
> Dynamic partitioning (DP) derives additional partitions from the full path and 
> fails with an exception like:
> {noformat}
> Failed with exception Partition spec {some_key=some_value, 
> part_key=part_value} contains non-partition columns
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.MoveTask
> {noformat}





[jira] [Commented] (HIVE-11662) DP cannot be applied to external table which contains part-spec like directory

2015-09-07 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14734152#comment-14734152
 ] 

Navis commented on HIVE-11662:
--

[~leftylev] Right. I'll rename the issue description.

> DP cannot be applied to external table which contains part-spec like directory
> --
>
> Key: HIVE-11662
> URL: https://issues.apache.org/jira/browse/HIVE-11662
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-11662.1.patch.txt, HIVE-11662.2.patch.txt
>
>
> Some users want to use part-spec-like directory names in their partitioned 
> table locations, for example:
> {noformat}
> /something/warehouse/some_key=some_value
> {noformat}
> Dynamic partitioning (DP) derives additional partitions from the full path and 
> fails with an exception like:
> {noformat}
> Failed with exception Partition spec {some_key=some_value, 
> part_key=part_value} contains non-partition columns
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.MoveTask
> {noformat}





[jira] [Updated] (HIVE-11752) Pre-materializing complex CTE queries

2015-09-07 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11752:
-
Attachment: HIVE-11752.1.patch.txt

Patch for preliminary test.

> Pre-materializing complex CTE queries
> -
>
> Key: HIVE-11752
> URL: https://issues.apache.org/jira/browse/HIVE-11752
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-11752.1.patch.txt
>
>
> Currently, Hive treats a CTE clause as a simple alias for its query block, 
> which causes redundant work if the CTE is used multiple times in a query. This 
> introduces a reference threshold for pre-materializing the CTE clause as a 
> volatile table (which does not exist in the metastore in any form and is only 
> accessible from the QB).
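
The reference threshold mentioned above can be pictured as a simple counting pass. The sketch below assumes that the planner sees a flat list of CTE names, one entry per reference in the query; CteMaterializationPlanner and toMaterialize are hypothetical names, not the patch's API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical planner step: decides which CTEs are referenced often enough
// to be worth computing once and reading back, instead of being inlined.
final class CteMaterializationPlanner {
    // Returns the CTE names referenced at least `threshold` times,
    // in the order in which they were first referenced.
    static List<String> toMaterialize(List<String> references, int threshold) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String name : references) {
            counts.merge(name, 1, Integer::sum);
        }
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() >= threshold) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```

A CTE below the threshold stays an inline alias, which avoids the cost of writing a temporary table for a query block that is read only once.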





[jira] [Updated] (HIVE-11754) Not reachable code parts in StatsUtils

2015-09-07 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11754:
-
Attachment: HIVE-11754.1.patch.txt

> Not reachable code parts in StatsUtils
> --
>
> Key: HIVE-11754
> URL: https://issues.apache.org/jira/browse/HIVE-11754
> Project: Hive
>  Issue Type: Task
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-11754.1.patch.txt
>
>
> No need to check "oi instanceof WritableConstantHiveCharObjectInspector" 
> after "oi instanceof ConstantObjectInspector".
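
Why the second check is dead code can be shown with stand-in types. The sketch assumes, as the issue implies, that the char inspector is a subtype of the constant inspector; the type names below are simplified placeholders, not Hive's classes.

```java
// Hypothetical stand-ins for the inspector hierarchy; only the subtype
// relation between the last two types matters here.
interface Inspector {}

interface ConstantInspector extends Inspector {}

final class ConstantCharInspector implements ConstantInspector {}

final class BranchOrder {
    static String classify(Inspector oi) {
        if (oi instanceof ConstantInspector) {
            return "constant";
        } else if (oi instanceof ConstantCharInspector) {
            // Never reached: every ConstantCharInspector is also a
            // ConstantInspector, so the first branch always wins.
            return "constant-char";
        }
        return "other";
    }
}
```

The compiler accepts this code without a warning, which is why such branches tend to survive until someone reads the checks in order.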





[jira] [Updated] (HIVE-10890) Provide implementable engine selector

2015-09-07 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-10890:
-
Attachment: HIVE-10890.2.patch.txt

rebased to trunk

> Provide implementable engine selector
> -
>
> Key: HIVE-10890
> URL: https://issues.apache.org/jira/browse/HIVE-10890
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-10890.1.patch.txt, HIVE-10890.2.patch.txt
>
>
> Hive now supports three kinds of execution engines. It would be good to have an 
> automatic engine selector so that the engine need not be set explicitly.
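
One way to read "implementable" is a small plug-in interface that users can implement with their own policy. Everything in the sketch below is hypothetical: the Engine enum, the EngineSelector interface, and the size-based rule are illustrations only and say nothing about which engine is actually better for which query.

```java
// Hypothetical set of engines to choose from.
enum Engine { MR, TEZ, SPARK }

// Hypothetical plug-in point: an implementation picks an engine per query.
interface EngineSelector {
    Engine select(long estimatedInputBytes);
}

// Illustrative policy only: small inputs go to one engine, large to another.
final class SizeBasedSelector implements EngineSelector {
    private final long smallInputBytes;

    SizeBasedSelector(long smallInputBytes) {
        this.smallInputBytes = smallInputBytes;
    }

    @Override
    public Engine select(long estimatedInputBytes) {
        return estimatedInputBytes <= smallInputBytes ? Engine.TEZ : Engine.MR;
    }
}
```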





[jira] [Updated] (HIVE-11756) Avoid redundant key serialization in RS for distinct query

2015-09-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11756:
-
Attachment: HIVE-11756.1.patch.txt

Attaching patch for preliminary test

> Avoid redundant key serialization in RS for distinct query
> --
>
> Key: HIVE-11756
> URL: https://issues.apache.org/jira/browse/HIVE-11756
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-11756.1.patch.txt
>
>
> Currently, Hive serializes the key twice to learn the length of the distribution 
> key for distinct queries. This introduces IndexedSerializer to avoid that.
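
The idea of serializing once and remembering where each field ends can be sketched as follows. This is not the IndexedSerializer from the patch; IndexedKeyWriter is a hypothetical class that writes string fields as UTF-8 with a zero-byte terminator and records the end offset of each field, so the length of any key prefix is a lookup rather than a second serialization.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical writer that serializes all key fields once and indexes them.
final class IndexedKeyWriter {
    final byte[] bytes;     // all fields, serialized once
    final int[] endOffsets; // endOffsets[i] = byte length of fields 0..i

    IndexedKeyWriter(String... fields) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        endOffsets = new int[fields.length];
        for (int i = 0; i < fields.length; i++) {
            byte[] b = fields[i].getBytes(StandardCharsets.UTF_8);
            out.write(b, 0, b.length);
            out.write(0); // field terminator
            endOffsets[i] = out.size();
        }
        bytes = out.toByteArray();
    }

    // Length in bytes of the prefix made of the first fieldCount fields,
    // for example the distribution key within a longer sort key.
    int prefixLength(int fieldCount) {
        return fieldCount == 0 ? 0 : endOffsets[fieldCount - 1];
    }
}
```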





[jira] [Updated] (HIVE-11756) Avoid redundant key serialization in RS for distinct query

2015-09-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11756:
-
Attachment: HIVE-11756.2.patch.txt

> Avoid redundant key serialization in RS for distinct query
> --
>
> Key: HIVE-11756
> URL: https://issues.apache.org/jira/browse/HIVE-11756
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-11756.1.patch.txt, HIVE-11756.2.patch.txt
>
>
> Currently, Hive serializes the key twice to learn the length of the distribution 
> key for distinct queries. This introduces IndexedSerializer to avoid that.





[jira] [Updated] (HIVE-11754) Not reachable code parts in StatsUtils

2015-09-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11754:
-
Attachment: HIVE-11754.2.patch.txt

> Not reachable code parts in StatsUtils
> --
>
> Key: HIVE-11754
> URL: https://issues.apache.org/jira/browse/HIVE-11754
> Project: Hive
>  Issue Type: Task
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-11754.1.patch.txt, HIVE-11754.2.patch.txt
>
>
> No need to check "oi instanceof WritableConstantHiveCharObjectInspector" 
> after "oi instanceof ConstantObjectInspector".





[jira] [Updated] (HIVE-11768) java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances

2015-09-09 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11768:
-
Attachment: HIVE-11768.1.patch.txt

> java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances
> 
>
> Key: HIVE-11768
> URL: https://issues.apache.org/jira/browse/HIVE-11768
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.2.1
>Reporter: Nemon Lou
> Attachments: HIVE-11768.1.patch.txt
>
>
>   More than 490,000 paths were added to java.io.DeleteOnExitHook on one of our 
> long-running HiveServer2 instances, taking up more than 100MB of heap.
>   Most of the paths have the suffix ".pipeout".





[jira] [Assigned] (HIVE-11768) java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances

2015-09-09 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis reassigned HIVE-11768:


Assignee: Navis

> java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances
> 
>
> Key: HIVE-11768
> URL: https://issues.apache.org/jira/browse/HIVE-11768
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.2.1
>Reporter: Nemon Lou
>Assignee: Navis
> Attachments: HIVE-11768.1.patch.txt
>
>
>   More than 490,000 paths were added to java.io.DeleteOnExitHook on one of our 
> long-running HiveServer2 instances, taking up more than 100MB of heap.
>   Most of the paths have the suffix ".pipeout".





[jira] [Updated] (HIVE-11756) Avoid redundant key serialization in RS for distinct query

2015-09-11 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11756:
-
Attachment: HIVE-11756.3.patch.txt

Fixed test failures

> Avoid redundant key serialization in RS for distinct query
> --
>
> Key: HIVE-11756
> URL: https://issues.apache.org/jira/browse/HIVE-11756
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-11756.1.patch.txt, HIVE-11756.2.patch.txt, 
> HIVE-11756.3.patch.txt
>
>
> Currently, Hive serializes the key twice to learn the length of the distribution 
> key for distinct queries. This introduces IndexedSerializer to avoid that.





[jira] [Commented] (HIVE-11774) Show macro definition for desc function

2015-09-10 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14740230#comment-14740230
 ] 

Navis commented on HIVE-11774:
--

[~damien.carol] It seems not. 
{noformat}
descStatement
@init { pushMsg("describe statement", state); }
@after { popMsg(state); }
:
(KW_DESCRIBE|KW_DESC)
(
(KW_DATABASE|KW_SCHEMA) => (KW_DATABASE|KW_SCHEMA) KW_EXTENDED? 
(dbName=identifier) -> ^(TOK_DESCDATABASE $dbName KW_EXTENDED?)
|
(KW_FUNCTION) => KW_FUNCTION KW_EXTENDED? (name=descFuncNames) -> 
^(TOK_DESCFUNCTION $name KW_EXTENDED?)
|
(KW_FORMATTED|KW_EXTENDED|KW_PRETTY) => 
((descOptions=KW_FORMATTED|descOptions=KW_EXTENDED|descOptions=KW_PRETTY) 
parttype=partTypeExpr) -> ^(TOK_DESCTABLE $parttype $descOptions)
|
parttype=partTypeExpr -> ^(TOK_DESCTABLE $parttype)
)
;
{noformat}

Possibly support KW_MACRO, too. 

> Show macro definition for desc function 
> 
>
> Key: HIVE-11774
> URL: https://issues.apache.org/jira/browse/HIVE-11774
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-11774.1.patch.txt, HIVE-11774.2.patch.txt
>
>
> Currently, desc function shows nothing for a macro. It would be helpful if it 
> showed the macro's definition.





[jira] [Updated] (HIVE-11774) Show macro definition for desc function

2015-09-10 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11774:
-
Attachment: HIVE-11774.2.patch.txt

Fixed test failures

> Show macro definition for desc function 
> 
>
> Key: HIVE-11774
> URL: https://issues.apache.org/jira/browse/HIVE-11774
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-11774.1.patch.txt, HIVE-11774.2.patch.txt
>
>
> Currently, desc function shows nothing for a macro. It would be helpful if it 
> showed the macro's definition.





[jira] [Updated] (HIVE-11768) java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances

2015-09-10 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11768:
-
Attachment: HIVE-11768.2.patch.txt

Changed to a synchronized set and minimized the diff

> java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances
> 
>
> Key: HIVE-11768
> URL: https://issues.apache.org/jira/browse/HIVE-11768
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.2.1
>Reporter: Nemon Lou
>Assignee: Navis
> Attachments: HIVE-11768.1.patch.txt, HIVE-11768.2.patch.txt
>
>
>   More than 490,000 paths were added to java.io.DeleteOnExitHook on one of our 
> long-running HiveServer2 instances, taking up more than 100MB of heap.
>   Most of the paths have the suffix ".pipeout".





[jira] [Commented] (HIVE-11768) java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances

2015-09-11 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14740251#comment-14740251
 ] 

Navis commented on HIVE-11768:
--

[~nemon] Thanks for the report!

> java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances
> 
>
> Key: HIVE-11768
> URL: https://issues.apache.org/jira/browse/HIVE-11768
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.2.1
>Reporter: Nemon Lou
>Assignee: Navis
> Attachments: HIVE-11768.1.patch.txt, HIVE-11768.2.patch.txt
>
>
>   More than 490,000 paths were added to java.io.DeleteOnExitHook on one of our 
> long-running HiveServer2 instances, taking up more than 100MB of heap.
>   Most of the paths have the suffix ".pipeout".





[jira] [Updated] (HIVE-11679) SemanticAnalysis of "a=1" can result in a new Configuration() object

2015-09-09 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11679:
-
Attachment: HIVE-11679.1.patch.txt

Attaching patch for preliminary test

> SemanticAnalysis of "a=1" can result in a new Configuration() object
> 
>
> Key: HIVE-11679
> URL: https://issues.apache.org/jira/browse/HIVE-11679
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Gopal V
> Attachments: HIVE-11679.1.patch.txt
>
>
> {code}
> public static ExprNodeGenericFuncDesc newInstance(GenericUDF genericUDF,
>   String funcText,
>   List<ExprNodeDesc> children) throws UDFArgumentException {
> ...
>  if (genericUDF instanceof GenericUDFBaseCompare && children.size() == 2) {
>   TypeInfo oiTypeInfo0 = children.get(0).getTypeInfo();
>   TypeInfo oiTypeInfo1 = children.get(1).getTypeInfo();
>   SessionState ss = SessionState.get();
>   Configuration conf = (ss != null) ? ss.getConf() : new Configuration();
> {code}
> This is either a SessionState.get(), which is a thread-local lookup, or worse, a 
> new Configuration(), which means XML parsing of multiple files for each 
> equality expression in the query.





[jira] [Updated] (HIVE-11752) Pre-materializing complex CTE queries

2015-09-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11752:
-
Attachment: HIVE-11752.2.patch.txt

Fixed missing read/write entities

> Pre-materializing complex CTE queries
> -
>
> Key: HIVE-11752
> URL: https://issues.apache.org/jira/browse/HIVE-11752
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-11752.1.patch.txt, HIVE-11752.2.patch.txt
>
>
> Currently, Hive treats a CTE clause as a simple alias for its query block, 
> which causes redundant work if the CTE is used multiple times in a query. This 
> introduces a reference threshold for pre-materializing the CTE clause as a 
> volatile table (which does not exist in the metastore in any form and is only 
> accessible from the QB).





[jira] [Updated] (HIVE-11706) Implement "show create database"

2015-09-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11706:
-
Attachment: HIVE-11706.3.patch.txt

Fixed test failures

> Implement "show create database"
> 
>
> Key: HIVE-11706
> URL: https://issues.apache.org/jira/browse/HIVE-11706
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-11706.1.patch.txt, HIVE-11706.2.patch.txt, 
> HIVE-11706.3.patch.txt
>
>
> HIVE-967 introduced "show create table". How about "show create database"?





[jira] [Updated] (HIVE-11752) Pre-materializing complex CTE queries

2016-02-01 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11752:
-
Assignee: Jesus Camacho Rodriguez  (was: Navis)

> Pre-materializing complex CTE queries
> -
>
> Key: HIVE-11752
> URL: https://issues.apache.org/jira/browse/HIVE-11752
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 2.1.0
>Reporter: Navis
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
> Attachments: HIVE-11752.03.patch, HIVE-11752.04.patch, 
> HIVE-11752.1.patch.txt, HIVE-11752.2.patch.txt
>
>
> Currently, Hive treats a CTE clause as a simple alias for its query block, 
> which causes redundant work if the CTE is used multiple times in a query. This 
> introduces a reference threshold for pre-materializing the CTE clause as a 
> volatile table (which does not exist in the metastore in any form and is only 
> accessible from the QB).





[jira] [Commented] (HIVE-11752) Pre-materializing complex CTE queries

2016-02-01 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15127375#comment-15127375
 ] 

Navis commented on HIVE-11752:
--

[~jcamachorodriguez] Good to hear that someone is interested in this. It might 
have been views that made this hard for me to complete, but I can't remember the 
exact reason. Hopefully you can get this into trunk, because it can be a major 
speed-up factor for complex DW queries.

> Pre-materializing complex CTE queries
> -
>
> Key: HIVE-11752
> URL: https://issues.apache.org/jira/browse/HIVE-11752
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 2.1.0
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-11752.03.patch, HIVE-11752.04.patch, 
> HIVE-11752.1.patch.txt, HIVE-11752.2.patch.txt
>
>
> Currently, Hive treats a CTE clause as a simple alias for its query block, 
> which causes redundant work if the CTE is used multiple times in a query. This 
> introduces a reference threshold for pre-materializing the CTE clause as a 
> volatile table (which does not exist in the metastore in any form and is only 
> accessible from the QB).


