[jira] [Created] (HIVE-25833) Inconsistent date type behavior between hive2 and hive3 for ORC files

2021-12-22 Thread Nemon Lou (Jira)
Nemon Lou created HIVE-25833:


 Summary: Inconsistent date type behavior between hive2 and hive3 
for ORC files 
 Key: HIVE-25833
 URL: https://issues.apache.org/jira/browse/HIVE-25833
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.1.2
Reporter: Nemon Lou


In hive2:
{code:sql}
create table hive2_orc(id date);
insert into hive2_orc values('0001-01-01');
select * from hive2_orc;
-- returns '0001-01-01'
{code}
In hive3, querying the same ORC file returns '0001-12-30'.

The same inconsistency exists between hive3 and the master branch: writing '0001-01-01' in hive3 and reading the same file on the master branch returns '0001-01-03'.
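A minimal round-trip check that isolates the storage layer (a sketch: the stored-as-orc clause and the check table name are assumptions, since the DDL above does not spell out the file format):

{code:sql}
-- Sketch: force the ORC format explicitly and compare the stored value
-- against the same literal evaluated at query time.
create table hive2_orc_check(id date) stored as orc;
insert into hive2_orc_check values ('0001-01-01');
-- On the writing version both columns match; on a mismatched reader the
-- stored column shifts while the literal does not.
select id, cast('0001-01-01' as date) as literal from hive2_orc_check;
{code}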



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (HIVE-25671) Hybrid Grace Hash Join NullPointer When query RCFile

2021-11-04 Thread Nemon Lou (Jira)
Nemon Lou created HIVE-25671:


 Summary: Hybrid Grace Hash Join NullPointer When query RCFile
 Key: HIVE-25671
 URL: https://issues.apache.org/jira/browse/HIVE-25671
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.1.2
Reporter: Nemon Lou



{noformat}
2021-11-04 10:02:47,553 [INFO] [TezChild] |exec.MapJoinOperator|: Hybrid Grace 
Hash Join: Deserializing spilled hash partition...
2021-11-04 10:02:47,553 [INFO] [TezChild] |exec.MapJoinOperator|: Hybrid Grace 
Hash Join: Number of rows in hashmap: 1
2021-11-04 10:02:47,554 [INFO] [TezChild] |exec.MapJoinOperator|: Hybrid Grace 
Hash Join: Going to process spilled big table rows in partition 5. Number of 
rows: 1
2021-11-04 10:02:47,561 [ERROR] [TezChild] |exec.MapJoinOperator|: Unexpected 
exception from MapJoinOperator : null
java.lang.NullPointerException
at 
org.apache.hadoop.hive.serde2.columnar.ColumnarStructBase$FieldInfo.uncheckedGetField(ColumnarStructBase.java:114)
at 
org.apache.hadoop.hive.serde2.columnar.ColumnarStructBase.getField(ColumnarStructBase.java:172)
at 
org.apache.hadoop.hive.serde2.objectinspector.ColumnarStructObjectInspector.getStructFieldData(ColumnarStructObjectInspector.java:67)
at 
org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:95)
at 
org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
at 
org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:68)
at 
org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$GetAdaptor.setFromRow(MapJoinBytesTableContainer.java:552)
at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.setMapJoinKey(MapJoinOperator.java:415)
at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:466)
at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.reProcessBigTable(MapJoinOperator.java:755)
at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.continueProcess(MapJoinOperator.java:671)
at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.closeOp(MapJoinOperator.java:604)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:733)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:757)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:477)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:284)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at 
com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
at 
com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
at 
com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{noformat}
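A hedged workaround sketch (untested for this case): disable the hybrid grace hash table via hive.mapjoin.hybridgrace.hashtable, so the spill-and-reprocess path in the stack above is never taken.

{noformat}
set hive.mapjoin.hybridgrace.hashtable=false;
{noformat}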



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24902) Incorrect result due to ReduceExpressionsRule

2021-03-17 Thread Nemon Lou (Jira)
Nemon Lou created HIVE-24902:


 Summary: Incorrect result due to ReduceExpressionsRule
 Key: HIVE-24902
 URL: https://issues.apache.org/jira/browse/HIVE-24902
 Project: Hive
  Issue Type: Bug
  Components: CBO
Affects Versions: 3.1.2, 4.0.0
Reporter: Nemon Lou


The following SQL returns only one record (20210308), but we expect two (20210308 and 20210309).

{code:sql}
select * from (
  select
    case when b.a = 1
      then cast(from_unixtime(unix_timestamp(cast(20210309 as string), 'yyyyMMdd') - 86400, 'yyyyMMdd') as bigint)
      else 20210309
    end as col
  from (select stack(2, 1, 2) as (a)) as b
) t
where t.col is not null;
{code}

After debugging, I find that ReduceExpressionsRule changes the expression in the wrong way.
Original expression:

{code:sql}
IS NOT NULL(CASE(=($0, 1),
  CAST(FROM_UNIXTIME(-(UNIX_TIMESTAMP(CAST(_UTF-16LE'20210309'):VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary", _UTF-16LE'yyyyMMdd'), CAST(86400):BIGINT), _UTF-16LE'yyyyMMdd')):BIGINT,
  20210309))
{code}

After reducing expressions:
{code:sql}
CASE(=($0, 1),
  IS NOT NULL(CAST(FROM_UNIXTIME(-(UNIX_TIMESTAMP(CAST(_UTF-16LE'20210309'):VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary", _UTF-16LE'yyyyMMdd'), CAST(86400):BIGINT), _UTF-16LE'yyyyMMdd')):BIGINT),
  true)
{code}

For a row with a=2 the original predicate reduces to IS NOT NULL(20210309), i.e. true, so the row should pass the filter; yet the final plan below filters on COALESCE((col0 = 1), false), which drops that row and explains the single-record result.

The query plan on the main branch:
{code:sql}
STAGE DEPENDENCIES:
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-0
Fetch Operator
  limit: -1
  Processor Tree:
TableScan
  alias: _dummy_table
  Row Limit Per Split: 1
  Statistics: Num rows: 1 Data size: 10 Basic stats: COMPLETE Column 
stats: COMPLETE
  Select Operator
expressions: 2 (type: int), 1 (type: int), 2 (type: int)
outputColumnNames: _col0, _col1, _col2
Statistics: Num rows: 1 Data size: 12 Basic stats: COMPLETE Column 
stats: COMPLETE
UDTF Operator
  Statistics: Num rows: 1 Data size: 12 Basic stats: COMPLETE 
Column stats: COMPLETE
  function name: stack
  Filter Operator
predicate: COALESCE((col0 = 1),false) (type: boolean)
Statistics: Num rows: 1 Data size: 12 Basic stats: COMPLETE 
Column stats: COMPLETE
Select Operator
  expressions: CASE WHEN ((col0 = 1)) THEN (20210308L) ELSE 
(20210309L) END (type: bigint)
  outputColumnNames: _col0
  Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE 
Column stats: COMPLETE
  ListSink

Time taken: 0.155 seconds, Fetched: 28 row(s)

{code}




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24579) Incorrect Result For Groupby With Limit

2021-01-03 Thread Nemon Lou (Jira)
Nemon Lou created HIVE-24579:


 Summary: Incorrect Result For Groupby With Limit
 Key: HIVE-24579
 URL: https://issues.apache.org/jira/browse/HIVE-24579
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.1.2, 2.3.7, 4.0.0
Reporter: Nemon Lou




{code:sql}
create table test(id int);
explain extended select id,count(*) from test group by id limit 10;
{code}

There is an unexpected TopN in the map phase, which causes the incorrect result.
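If the TopN indeed comes from the limit-pushdown optimization (it is controlled by hive.limit.pushdown.memory.usage, which matches the "TopN Hash Memory Usage: 0.1" in the plan below), a hedged workaround sketch is to disable the pushdown:

{noformat}
set hive.limit.pushdown.memory.usage=0.0;
{noformat}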


{code:sql}
STAGE PLANS:
 Stage: Stage-1
 Map Reduce
 Map Operator Tree:
 TableScan
 alias: test
 Statistics: Num rows: 337 Data size: 1350 Basic stats: COMPLETE Column stats: 
NONE
 GatherStats: false
 Select Operator
 expressions: id (type: int)
 outputColumnNames: id
 Statistics: Num rows: 337 Data size: 1350 Basic stats: COMPLETE Column stats: 
NONE
 Group By Operator
 aggregations: count()
 keys: id (type: int)
 mode: hash
 outputColumnNames: _col0, _col1
 Statistics: Num rows: 337 Data size: 1350 Basic stats: COMPLETE Column stats: 
NONE
 Reduce Output Operator
 key expressions: _col0 (type: int)
 null sort order: a
 sort order: +
 Map-reduce partition columns: _col0 (type: int)
 Statistics: Num rows: 337 Data size: 1350 Basic stats: COMPLETE Column stats: 
NONE
 tag: -1
 TopN: 10
 TopN Hash Memory Usage: 0.1
 value expressions: _col1 (type: bigint)
 auto parallelism: false
 Path -> Alias:
 file:/user/hive/warehouse/test [test]
 Path -> Partition:
 file:/user/hive/warehouse/test 
 Partition
 base file name: test
 input format: org.apache.hadoop.mapred.TextInputFormat
 output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
 properties:
 COLUMN_STATS_ACCURATE \{"BASIC_STATS":"true"}
 bucket_count -1
 column.name.delimiter ,
 columns id
 columns.comments 
 columns.types int
 file.inputformat org.apache.hadoop.mapred.TextInputFormat
 file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
 location file:/user/hive/warehouse/test
 name default.test
 numFiles 0
 numRows 0
 rawDataSize 0
 serialization.ddl struct test \{ i32 id}
 serialization.format 1
 serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
 totalSize 0
 transient_lastDdlTime 1609730036
 serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
 
 input format: org.apache.hadoop.mapred.TextInputFormat
 output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
 properties:
 COLUMN_STATS_ACCURATE \{"BASIC_STATS":"true"}
 bucket_count -1
 column.name.delimiter ,
 columns id
 columns.comments 
 columns.types int
 file.inputformat org.apache.hadoop.mapred.TextInputFormat
 file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
 location file:/user/hive/warehouse/test
 name default.test
 numFiles 0
 numRows 0
 rawDataSize 0
 serialization.ddl struct test \{ i32 id}
 serialization.format 1
 serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
 totalSize 0
 transient_lastDdlTime 1609730036
 serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
 name: default.test
 name: default.test
 Truncated Path -> Alias:
 /test [test]
 Needs Tagging: false
 Reduce Operator Tree:
 Group By Operator
 aggregations: count(VALUE._col0)
 keys: KEY._col0 (type: int)
 mode: mergepartial
 outputColumnNames: _col0, _col1
 Statistics: Num rows: 168 Data size: 672 Basic stats: COMPLETE Column stats: 
NONE
 Limit
 Number of rows: 10
 Statistics: Num rows: 10 Data size: 40 Basic stats: COMPLETE Column stats: NONE
 File Output Operator
 compressed: false
 GlobalTableId: 0
 directory: 
file:/tmp/root/bd08973b-b58c-4185-9072-c1891f67878d/hive_2021-01-04_11-14-01_745_4475755683092435506-1/-mr-10001/.hive-staging_hive_2021-01-04_11-14-01_745_4475755683092435506-1/-ext-10002
 NumFilesPerFileSink: 1
 Statistics: Num rows: 10 Data size: 40 Basic stats: COMPLETE Column stats: NONE
 Stats Publishing Key Prefix: 
file:/tmp/root/bd08973b-b58c-4185-9072-c1891f67878d/hive_2021-01-04_11-14-01_745_4475755683092435506-1/-mr-10001/.hive-staging_hive_2021-01-04_11-14-01_745_4475755683092435506-1/-ext-10002/
 table:
 input format: org.apache.hadoop.mapred.SequenceFileInputFormat
 output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
 properties:
 columns _col0,_col1
 columns.types int:bigint
 escape.delim \
 hive.serialization.extend.additional.nesting.levels true
 serialization.escape.crlf true
 serialization.format 1
 serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
 serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
 TotalFiles: 1
 GatherStats: false
 MultiFileSpray: false

Stage: Stage-0
 Fetch Operator
 limit: 10
 Processor Tree:
 ListSink

Time taken: 1.877 seconds, Fetched: 128 row(s)
{code}






 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24165) CBO: Query fails after multiple count distinct rewrite

2020-09-14 Thread Nemon Lou (Jira)
Nemon Lou created HIVE-24165:


 Summary: CBO: Query fails after multiple count distinct rewrite 
 Key: HIVE-24165
 URL: https://issues.apache.org/jira/browse/HIVE-24165
 Project: Hive
  Issue Type: Bug
  Components: CBO
Affects Versions: 4.0.0
Reporter: Nemon Lou


One way to reproduce:

{code:sql}
drop table test;
CREATE TABLE test(
  `device_id` string,
  `level` string,
  `site_id` string,
  `user_id` string,
  `first_date` string,
  `last_date` string,
  `dt` string);
set hive.execution.engine=tez;
set hive.optimize.distinct.rewrite=true;
set hive.cli.print.header=true;
select
  dt,
  site_id,
  count(DISTINCT t1.device_id) as device_tol_cnt,
  count(DISTINCT case when t1.first_date='2020-09-15' then t1.device_id else null end) as device_add_cnt
from test t1 where dt='2020-09-15'
group by
  dt,
  site_id
;
{code}

 

Error log:

{noformat}
Exception in thread "main" java.lang.AssertionError: Cannot add expression of different type to set:
set type is RecordType(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" $f2, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" $f3, BIGINT $f2_0, BIGINT $f3_0) NOT NULL
expression type is RecordType(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" $f2, BIGINT $f3, BIGINT $f2_0, BIGINT $f3_0) NOT NULL
set is rel#85:HiveAggregate.HIVE.[](input=HepRelVertex#84,group=\{2, 3},agg#0=count($0),agg#1=count($1))
expression is HiveProject#95
    at org.apache.calcite.plan.RelOptUtil.verifyTypeEquivalence(RelOptUtil.java:411)
    at org.apache.calcite.plan.hep.HepRuleCall.transformTo(HepRuleCall.java:57)
    at org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:234)
    at org.apache.calcite.rel.rules.AggregateProjectPullUpConstantsRule.onMatch(AggregateProjectPullUpConstantsRule.java:186)
    at org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:317)
    at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556)
    at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:415)
    at org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:280)
    at org.apache.calcite.plan.hep.HepInstruction$RuleCollection.execute(HepInstruction.java:74)
    at org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:211)
    at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:198)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:2273)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:2002)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1709)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1609)
    at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118)
    at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1052)
    at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154)
    at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1414)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:1430)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:450)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12164)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:330)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:285)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:659)
    at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1826)
    at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1773)
    at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1768)
    at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
    at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:214)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at
{noformat}
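Since the failure appears only when the rewrite is enabled, a hedged workaround sketch (untested) is to flip the flag from the repro script back off:

{noformat}
set hive.optimize.distinct.rewrite=false;
{noformat}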

[jira] [Created] (HIVE-16907) "INSERT INTO" overwrite old data when destination table encapsulated by backquote

2017-06-15 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-16907:


 Summary:  "INSERT INTO"  overwrite old data when destination table 
encapsulated by backquote 
 Key: HIVE-16907
 URL: https://issues.apache.org/jira/browse/HIVE-16907
 Project: Hive
  Issue Type: Bug
  Components: Parser
Affects Versions: 2.1.1, 1.1.0
Reporter: Nemon Lou


A way to reproduce:
{noformat}
create database tdb;
use tdb;
create table t1(id int);
create table t2(id int);
explain insert into `tdb.t1` select * from t2;
{noformat}
{noformat}
+---+
|  Explain  |
+---+
| STAGE DEPENDENCIES: |
|   Stage-1 is a root stage |
|   Stage-6 depends on stages: Stage-1 , consists of Stage-3, Stage-2, Stage-4 |
|   Stage-3 |
|   Stage-0 depends on stages: Stage-3, Stage-2, Stage-5 |
|   Stage-2 |
|   Stage-4 |
|   Stage-5 depends on stages: Stage-4 |
| |
| STAGE PLANS: |
|   Stage: Stage-1 |
|     Map Reduce |
|       Map Operator Tree: |
|           TableScan |
|             alias: t2 |
|             Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE |
|             Select Operator |
|               expressions: id (type: int) |
|               outputColumnNames: _col0 |
|               Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE |
|               File Output Operator |
|                 compressed: false |
|                 Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE |
|                 table: |
|                     input format: org.apache.hadoop.hive.ql.io.RCFileInputFormat |
|                     output format: org.apache.hadoop.hive.ql.io.RCFileOutputFormat |
{noformat}
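A hedged workaround sketch (assuming the parser resolves separately quoted identifiers as database and table, which the standard syntax supports): quote the two name parts individually instead of backquoting the dotted whole.

{noformat}
explain insert into `tdb`.`t1` select * from t2;
{noformat}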

[jira] [Created] (HIVE-16839) Unbalanced calls to openTransaction/commitTransaction when alter the same partition concurrently

2017-06-06 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-16839:


 Summary: Unbalanced calls to openTransaction/commitTransaction 
when alter the same partition concurrently
 Key: HIVE-16839
 URL: https://issues.apache.org/jira/browse/HIVE-16839
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.1.0
Reporter: Nemon Lou


SQL to reproduce:
prepare:
{noformat}
hdfs dfs -mkdir -p /hzsrc/external/writing_dc/ltgsm/16e7a9b2-21a1-3f4f-8061-bc3395281627
1, create external table tb_ltgsm_external (id int) PARTITIONED by (cp string, ld string);
{noformat}
Open one beeline and run these two SQL statements many times:
{noformat}
2, ALTER TABLE tb_ltgsm_external ADD IF NOT EXISTS PARTITION (cp=2017060513,ld=2017060610);
3, ALTER TABLE tb_ltgsm_external PARTITION (cp=2017060513,ld=2017060610) SET LOCATION 'hdfs://hacluster/hzsrc/external/writing_dc/ltgsm/16e7a9b2-21a1-3f4f-8061-bc3395281627';
{noformat}
Open another beeline and run this SQL many times at the same time:
{noformat}
4, ALTER TABLE tb_ltgsm_external DROP PARTITION (cp=2017060513,ld=2017060610);
{noformat}

MetaStore logs:
{noformat}
2017-06-06 21:58:34,213 | ERROR | pool-6-thread-197 | Retrying HMSHandler after 
2000 ms (attempt 1 of 10) with error: javax.jdo.JDOObjectNotFoundException: No 
such database row
FailedObject:49[OID]org.apache.hadoop.hive.metastore.model.MStorageDescriptor
at 
org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:475)
at 
org.datanucleus.api.jdo.JDOAdapter.getApiExceptionForNucleusException(JDOAdapter.java:1158)
at 
org.datanucleus.state.JDOStateManager.isLoaded(JDOStateManager.java:3231)
at 
org.apache.hadoop.hive.metastore.model.MStorageDescriptor.jdoGetcd(MStorageDescriptor.java)
at 
org.apache.hadoop.hive.metastore.model.MStorageDescriptor.getCD(MStorageDescriptor.java:184)
at 
org.apache.hadoop.hive.metastore.ObjectStore.convertToStorageDescriptor(ObjectStore.java:1282)
at 
org.apache.hadoop.hive.metastore.ObjectStore.convertToStorageDescriptor(ObjectStore.java:1299)
at 
org.apache.hadoop.hive.metastore.ObjectStore.convertToPart(ObjectStore.java:1680)
at 
org.apache.hadoop.hive.metastore.ObjectStore.getPartition(ObjectStore.java:1586)
at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at 
org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:98)
at com.sun.proxy.$Proxy0.getPartition(Unknown Source)
at 
org.apache.hadoop.hive.metastore.HiveAlterHandler.alterPartitions(HiveAlterHandler.java:538)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_partitions(HiveMetaStore.java:3317)
at sun.reflect.GeneratedMethodAccessor37.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102)
at com.sun.proxy.$Proxy12.alter_partitions(Unknown Source)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_partitions.getResult(ThriftHiveMetastore.java:9963)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_partitions.getResult(ThriftHiveMetastore.java:9947)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at 
org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
at 
org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1673)
at 
org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
NestedThrowablesStackTrace:
No such database row
org.datanucleus.exceptions.NucleusObjectNotFoundException: No such database row
at 
org.datanucleus.store.rdbms.request.FetchRequest.execute(FetchRequest.java:357)
at 
org.datanucleus.store.rdbms.RDBMSPersistenceHandler.fetchObject(RDBMSPersistenceHandler.java:324)
at 
org.datanucleus.state.AbstractStateManager.loadFieldsFromDatastore(AbstractStateManager.java:1120)

[jira] [Created] (HIVE-15638) ArrayIndexOutOfBoundsException when output Columns for UDTF are pruned

2017-01-16 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-15638:


 Summary: ArrayIndexOutOfBoundsException when output Columns for 
UDTF are pruned 
 Key: HIVE-15638
 URL: https://issues.apache.org/jira/browse/HIVE-15638
 Project: Hive
  Issue Type: Bug
  Components: Query Planning
Affects Versions: 2.1.0, 1.3.0
Reporter: Nemon Lou


{noformat}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error 
while processing row [Error getting row data with exception 
java.lang.ArrayIndexOutOfBoundsException: 151
at 
org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.readVInt(LazyBinaryUtils.java:314)
at 
org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.checkObjectByteInfo(LazyBinaryUtils.java:183)
at 
org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.parse(LazyBinaryStruct.java:142)
at 
org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:202)
at 
org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
at 
org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:364)
at 
org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:200)
at 
org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:186)
at 
org.apache.hadoop.hive.ql.exec.MapOperator.toErrorMessage(MapOperator.java:525)
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:494)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:160)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:180)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1710)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:174)
 ]
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:499)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:160)
... 8 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.lang.ArrayIndexOutOfBoundsException: 151
at 
org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.process(ReduceSinkOperator.java:416)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:878)
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130)
at 
org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:149)
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:489)
... 9 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 151
at 
org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.readVInt(LazyBinaryUtils.java:314)
at 
org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.checkObjectByteInfo(LazyBinaryUtils.java:183)
at 
org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.parse(LazyBinaryStruct.java:142)
at 
org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:202)
at 
org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
at 
org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:94)
at 
org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
at 
org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
at 
org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.populateCachedDistributionKeys(ReduceSinkOperator.java:443)
at 
org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.process(ReduceSinkOperator.java:350)
... 13 more
{noformat}

The way to reproduce:
DDL:
{noformat}
create table tb_a(data_dt string,key string,src string,data_id string,tag_id 
string, entity_src string);
create table tb_b(pos_tagging string,src string,data_id string);
create table tb_c(key string,start_time string,data_dt string);
insert into tb_a values('20160901','CPI','04','data_id','tag_id','entity_src');
insert into tb_b values('pos_tagging','04','data_id');
insert into tb_c values('data_id','start_time_','20160901');
create function hwrl as 'HotwordRelationUDTF' using jar 
'hdfs:///tmp/nemon/udf/hotword.jar';

{noformat}

UDF File:
{code}
import java.util.ArrayList;

import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import 

[jira] [Created] (HIVE-14662) Wrong Class Instance When Using Custom SERDE

2016-08-29 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-14662:


 Summary: Wrong Class Instance When Using Custom SERDE
 Key: HIVE-14662
 URL: https://issues.apache.org/jira/browse/HIVE-14662
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Nemon Lou
Assignee: Nemon Lou


Using the [SerDe for MongoDB|https://github.com/mongodb/mongo-hadoop/blob/master/hive/src/main/java/com/mongodb/hadoop/hive/BSONSerDe.java].
DDL:
{noformat}
create external table mytable (ID STRING..) 
ROW FORMAT SERDE  'com.mongodb.hadoop.hive.BSONSerDe' 
WITH SERDEPROPERTIES('mongo.columns.mapping'='{"ID":"_id",.. }')
STORED AS INPUTFORMAT 'com.mongodb.hadoop.mapred.BSONFileInputFormat'
OUTPUTFORMAT 'com.mongodb.hadoop.hive.output.HiveBSONFileOutputFormat'
LOCATION 'hdfs:///mypath'; 
{noformat}
Open beeline and run the following queries; then open another beeline and run them again. The second run fails.
{noformat}
add jar hdfs:///tmp/mongo-hadoop-hive-1.4.2_new.jar;
add jar hdfs:///tmp/mongo-java-driver-3.0.4.jar;
add jar hdfs:///tmp/mongo-hadoop-core-1.4.2_new.jar;
select * from mytable limit 1;
{noformat}

Error log :
{noformat}
2016-08-25 09:30:34,475 | WARN  | HiveServer2-Handler-Pool: Thread-11972 | 
Error fetching results:  | 
org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:1058)
org.apache.hive.service.cli.HiveSQLException: java.io.IOException: org.apache.hadoop.hive.serde2.SerDeException: class com.mongodb.hadoop.hive.BSONSerDe requires a BSONWritable object, not class com.mongodb.hadoop.io.BSONWritable
at 
org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:366)
at 
org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:251)
at 
org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:710)
at sun.reflect.GeneratedMethodAccessor45.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at 
org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
at 
org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
at 
org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1673)
at 
org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
at com.sun.proxy.$Proxy20.fetchResults(Unknown Source)
at 
org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:451)
at 
org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:1049)
at 
org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1553)
at 
org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1538)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at 
org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:692)
at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: org.apache.hadoop.hive.serde2.SerDeException: class com.mongodb.hadoop.hive.BSONSerDe requires a BSONWritable object, not class com.mongodb.hadoop.io.BSONWritable
at 
org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:507)
at 
org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:140)
at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1756)
at 
org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:361)
... 24 more
Caused by: org.apache.hadoop.hive.serde2.SerDeException: class com.mongodb.hadoop.hive.BSONSerDe requires a BSONWritable object, not class com.mongodb.hadoop.io.BSONWritable
at com.mongodb.hadoop.hive.BSONSerDe.deserialize(BSONSerDe.java:196)
at 
org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:488)
... 28 more
{noformat}
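A hedged mitigation sketch (an assumption, not verified here): load the SerDe jars once through hive.aux.jars.path at server startup instead of per-session add jar, so all sessions share one classloader for BSONWritable.

{noformat}
-- hive-site.xml / startup option (hypothetical; paths copied from the session above):
-- hive.aux.jars.path=hdfs:///tmp/mongo-hadoop-hive-1.4.2_new.jar,hdfs:///tmp/mongo-java-driver-3.0.4.jar,hdfs:///tmp/mongo-hadoop-core-1.4.2_new.jar
{noformat}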

Note: must make sure the table is not

[jira] [Created] (HIVE-14557) Nullpointer When both SkewJoin and Mapjoin Enabled

2016-08-17 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-14557:


 Summary: Nullpointer When both SkewJoin  and Mapjoin Enabled
 Key: HIVE-14557
 URL: https://issues.apache.org/jira/browse/HIVE-14557
 Project: Hive
  Issue Type: Bug
  Components: Physical Optimizer
Affects Versions: 2.1.0, 1.1.0
Reporter: Nemon Lou


The following SQL fails with return code 2 on MR:
{noformat}
create table a(id int,id1 int);
create table b(id int,id1 int);
create table c(id int,id1 int);
set hive.optimize.skewjoin=true;
select a.id,b.id,c.id1 from a,b,c where a.id=b.id and a.id1=c.id1;
{noformat}
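A hedged workaround sketch (untested): since the failure needs both features at once, disable one of them for this query.

{noformat}
set hive.optimize.skewjoin=false;
-- or: set hive.auto.convert.join=false;
{noformat}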
Error log as follows:
{noformat}
2016-08-17 21:13:42,081 INFO [main] 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper: 
Id =0
  
Id =21
  
Id =28
  
Id =16
  
  <\Children>
  Id = 28 null<\Parent>
<\FS>
  <\Children>
  Id = 21 nullId = 33 
Id =33
  null
  <\Children>
  <\Parent>
<\HASHTABLEDUMMY><\Parent>
<\MAPJOIN>
  <\Children>
  Id = 0 null<\Parent>
<\TS>
  <\Children>
  <\Parent>
<\MAP>
2016-08-17 21:13:42,084 INFO [main] 
org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing operator TS[21]
2016-08-17 21:13:42,084 INFO [main] 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper: Initializing dummy operator
2016-08-17 21:13:42,086 INFO [main] org.apache.hadoop.hive.ql.exec.MapOperator: 
DESERIALIZE_ERRORS:0, RECORDS_IN:0, 
2016-08-17 21:13:42,087 ERROR [main] 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper: Hit error while closing operators 
- failing tree
2016-08-17 21:13:42,088 WARN [main] org.apache.hadoop.mapred.YarnChild: 
Exception running child : java.lang.RuntimeException: Hive Runtime Error while 
closing operators
at 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.closeOp(MapJoinOperator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:682)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:696)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:696)
at 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:189)
... 8 more

{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-14390) Wrong Table alias when CBO is on

2016-07-30 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-14390:


 Summary: Wrong Table alias when CBO is on
 Key: HIVE-14390
 URL: https://issues.apache.org/jira/browse/HIVE-14390
 Project: Hive
  Issue Type: Bug
  Components: CBO
Affects Versions: 1.2.1
Reporter: Nemon Lou
Priority: Minor


There are 5 web_sales references in query95 of TPC-DS, with aliases ws1-ws5.
But the query plan only contains ws1 when CBO is on.
query95 :
{noformat}
SELECT count(distinct ws1.ws_order_number) as order_count,
   sum(ws1.ws_ext_ship_cost) as total_shipping_cost,
   sum(ws1.ws_net_profit) as total_net_profit
FROM web_sales ws1
JOIN customer_address ca ON (ws1.ws_ship_addr_sk = ca.ca_address_sk)
JOIN web_site s ON (ws1.ws_web_site_sk = s.web_site_sk)
JOIN date_dim d ON (ws1.ws_ship_date_sk = d.d_date_sk)
LEFT SEMI JOIN (SELECT ws2.ws_order_number as ws_order_number
   FROM web_sales ws2 JOIN web_sales ws3
   ON (ws2.ws_order_number = ws3.ws_order_number)
   WHERE ws2.ws_warehouse_sk <> ws3.ws_warehouse_sk
) ws_wh1
ON (ws1.ws_order_number = ws_wh1.ws_order_number)
LEFT SEMI JOIN (SELECT wr_order_number
   FROM web_returns wr
   JOIN (SELECT ws4.ws_order_number as 
ws_order_number
  FROM web_sales ws4 JOIN web_sales ws5
  ON (ws4.ws_order_number = 
ws5.ws_order_number)
 WHERE ws4.ws_warehouse_sk <> 
ws5.ws_warehouse_sk
) ws_wh2
   ON (wr.wr_order_number = 
ws_wh2.ws_order_number)) tmp1
ON (ws1.ws_order_number = tmp1.wr_order_number)
WHERE d.d_date between '2002-05-01' and '2002-06-30' and
   ca.ca_state = 'GA' and
   s.web_company_name = 'pri';
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-14353) Performance degradation after Projection Pruning in CBO

2016-07-27 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-14353:


 Summary: Performance degradation  after Projection Pruning in CBO
 Key: HIVE-14353
 URL: https://issues.apache.org/jira/browse/HIVE-14353
 Project: Hive
  Issue Type: Bug
  Components: CBO, Logical Optimizer
Affects Versions: 1.2.1
Reporter: Nemon Lou


TPC-DS with factor 1024.
Hive on Spark.
With and without projection pruning, the times differ markedly: with pruning enabled, every query below runs slower.
The way to disable projection pruning: disable HiveRelFieldTrimmer in the code and compile a new jar.
||queries||CBO_no_projection_prune||CBO||
|q27|   160|251 | 
|q7 |   200|312 |
|q88|   701|1092|
|q68|   234|345 |
|q39|53|78  |
|q73|   160|228 |
|q31|   463|659 |
|q79|   242|343 |
|q46|   256|363 |
|q60|   271|382 |
|q66|   198|278 |
|q34|   155|217 |
|q19|   184|256 |
|q26|   154|214 |
|q56|   262|364 |
|q75|   942|1303|
|q71|   288|388 |
|q25|   329|442 |
|q52|   142|190 |
|q42|   142|189 |
|q3 |   139|185 |
|q98|   153|203 |
|q89|   187|248 |
|q58|   264|340 |
|q43|   127|162 |
|q32|   174|221 |
|q96|   156|197 |
|q70|   320|404 |
|q29|   499|629 |
|q18|   266|329 |
|q21|   76 |92  |
|q90|   139|165 |




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-14143) RawDataSize of RCFile is zero after analyze

2016-06-30 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-14143:


 Summary: RawDataSize of RCFile is zero after analyze 
 Key: HIVE-14143
 URL: https://issues.apache.org/jira/browse/HIVE-14143
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Affects Versions: 2.1.0, 1.2.1
Reporter: Nemon Lou
Assignee: Nemon Lou
Priority: Minor


After running the following analyze command, rawDataSize becomes zero for RCFile tables.
{noformat}
 analyze table RCFILE_TABLE compute statistics ;
{noformat}
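To observe the effect (a sketch), inspect the table parameters before and after the analyze:

{noformat}
describe formatted RCFILE_TABLE;
-- rawDataSize shows 0 in the table parameters after the analyze
{noformat}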




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-13791) Fix failure Unit Test TestHiveSessionImpl.testLeakOperationHandle

2016-05-19 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-13791:


 Summary: Fix  failure Unit Test 
TestHiveSessionImpl.testLeakOperationHandle
 Key: HIVE-13791
 URL: https://issues.apache.org/jira/browse/HIVE-13791
 Project: Hive
  Issue Type: Test
  Components: Test
Affects Versions: 2.1.0
Reporter: Nemon Lou
Assignee: Nemon Lou
Priority: Minor






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-13602) TPCH q16 return wrong result when CBO is on

2016-04-24 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-13602:


 Summary: TPCH q16 return wrong result when CBO is on
 Key: HIVE-13602
 URL: https://issues.apache.org/jira/browse/HIVE-13602
 Project: Hive
  Issue Type: Bug
  Components: CBO, Logical Optimizer
Affects Versions: 1.2.1
Reporter: Nemon Lou


Running TPC-H with factor 2,
q16 returns 1,160 rows when CBO is on,
while it returns 59,616 rows when CBO is off.
See attachment for details.
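For comparison, CBO can be toggled per session (the sketch below assumes hive.cbo.enable, the standard switch):

{noformat}
set hive.cbo.enable=false;  -- 59,616 rows
set hive.cbo.enable=true;   -- 1,160 rows
{noformat}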



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-13546) Patch for HIVE-12893 is broken in branch-1

2016-04-19 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-13546:


 Summary: Patch for HIVE-12893 is broken in branch-1 
 Key: HIVE-13546
 URL: https://issues.apache.org/jira/browse/HIVE-13546
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 1.3.0
Reporter: Nemon Lou


The following SQL fails:
{noformat}
set hive.map.aggr=true;
set mapreduce.reduce.speculative=false;
set hive.auto.convert.join=true;
set hive.optimize.reducededuplication = false;
set hive.optimize.reducededuplication.min.reducer=1;
set hive.optimize.mapjoin.mapreduce=true;
set hive.stats.autogather=true;

set mapred.reduce.parallel.copies=30;
set mapred.job.shuffle.input.buffer.percent=0.5;
set mapred.job.reduce.input.buffer.percent=0.2;
set mapred.map.child.java.opts=-server -Xmx2800m 
-Djava.net.preferIPv4Stack=true;
set mapred.reduce.child.java.opts=-server -Xmx3800m 
-Djava.net.preferIPv4Stack=true;
set mapreduce.map.memory.mb=3072;
set mapreduce.reduce.memory.mb=4096;
set hive.enforce.bucketing=true;
set hive.enforce.sorting=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.max.dynamic.partitions.pernode=10;
set hive.exec.max.dynamic.partitions=10;
set hive.exec.max.created.files=100;
set hive.exec.parallel=true;
set hive.exec.reducers.max=2000;
set hive.stats.autogather=true;
set hive.optimize.sort.dynamic.partition=true;

set mapred.job.reduce.input.buffer.percent=0.0;
set mapreduce.input.fileinputformat.split.minsizee=24000;
set mapreduce.input.fileinputformat.split.minsize.per.node=24000;
set mapreduce.input.fileinputformat.split.minsize.per.rack=24000;
set hive.optimize.sort.dynamic.partition=true;
use tpcds_bin_partitioned_orc_4;
insert overwrite table store_sales partition (ss_sold_date_sk)
select
ss.ss_sold_time_sk,
ss.ss_item_sk,
ss.ss_customer_sk,
ss.ss_cdemo_sk,
ss.ss_hdemo_sk,
ss.ss_addr_sk,
ss.ss_store_sk,
ss.ss_promo_sk,
ss.ss_ticket_number,
ss.ss_quantity,
ss.ss_wholesale_cost,
ss.ss_list_price,
ss.ss_sales_price,
ss.ss_ext_discount_amt,
ss.ss_ext_sales_price,
ss.ss_ext_wholesale_cost,
ss.ss_ext_list_price,
ss.ss_ext_tax,
ss.ss_coupon_amt,
ss.ss_net_paid,
ss.ss_net_paid_inc_tax,
ss.ss_net_profit,
ss.ss_sold_date_sk
  from tpcds_text_4.store_sales ss;
{noformat}

Error log is as follows
{noformat}
2016-04-19 15:15:35,252 FATAL [main] ExecReducer: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row (tag=0) 
{"key":{"reducesinkkey0":null},"value":{"_col0":null,"_col1":5588,"_col2":170300,"_col3":null,"_col4":756,"_col5":91384,"_col6":16,"_col7":null,"_col8":855582,"_col9":28,"_col10":null,"_col11":48.83,"_col12":null,"_col13":0.0,"_col14":null,"_col15":899.64,"_col16":null,"_col17":6.14,"_col18":0.0,"_col19":null,"_col20":null,"_col21":null,"_col22":null}}
at 
org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
at 
org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:180)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1732)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:174)
Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at java.util.ArrayList.rangeCheck(ArrayList.java:653)
at java.util.ArrayList.get(ArrayList.java:429)
at 
org.apache.hadoop.hive.common.FileUtils.makePartName(FileUtils.java:151)
at 
org.apache.hadoop.hive.common.FileUtils.makePartName(FileUtils.java:131)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.getDynPartDirectory(FileSinkOperator.java:1003)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.getDynOutPaths(FileSinkOperator.java:919)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:713)
at 
org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:235)
... 7 more
{noformat}





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-13141) Hive on Spark over HBase should accept parameters starting with "zookeeper.znode"

2016-02-23 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-13141:


 Summary: Hive on Spark over HBase should accept parameters 
starting with "zookeeper.znode"
 Key: HIVE-13141
 URL: https://issues.apache.org/jira/browse/HIVE-13141
 Project: Hive
  Issue Type: Bug
Affects Versions: 2.0.0, 1.2.0
Reporter: Nemon Lou
Assignee: Nemon Lou
Priority: Minor


HBase-related parameters were added by HIVE-12708.
Following the same approach, parameters starting with "zookeeper.znode" should be added too, as they are also HBase-related parameters.
Refer to http://blog.cloudera.com/blog/2013/10/what-are-hbase-znodes/

I have seen a failure with Hive on Spark over HBase due to a customized zookeeper.znode.parent.
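For example, a job against such a cluster would need a setting like this (the znode path is hypothetical):

{noformat}
set zookeeper.znode.parent=/hbase1;
{noformat}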




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12847) ORC file footer cache should be memory sensitive

2016-01-12 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-12847:


 Summary: ORC file footer cache should be memory sensitive
 Key: HIVE-12847
 URL: https://issues.apache.org/jira/browse/HIVE-12847
 Project: Hive
  Issue Type: Improvement
  Components: File Formats, ORC
Affects Versions: 1.2.1
Reporter: Nemon Lou


The size-based footer cache cannot control memory usage properly.
We have seen a HiveServer2 hang because the ORC file footer cache took up too much heap memory.
A simple query like "select * from orc_table limit 1" can make HiveServer2 hang.
The input table has about 1,000 ORC files and each ORC file contains about 2,500 stripes.
{noformat}
 num     #instances         #bytes  class name
----------------------------------------------
   1:     214653601    25758432120  org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics
   3:     122233301     8800797672  org.apache.hadoop.hive.ql.io.orc.OrcProto$StringStatistics
   5:      89439001     6439608072  org.apache.hadoop.hive.ql.io.orc.OrcProto$IntegerStatistics
   7:       2981300      262354400  org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeInformation
   9:       2981300      143102400  org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeStatistics
  12:       2983691       71608584  org.apache.hadoop.hive.ql.io.orc.ReaderImpl$StripeInformationImpl
  15:         80929        7121752  org.apache.hadoop.hive.ql.io.orc.OrcProto$Type
  17:        103282        5783792  org.apache.hadoop.mapreduce.lib.input.FileSplit
  20:         51641        3305024  org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit
  21:         51641        3305024  org.apache.hadoop.hive.ql.io.orc.OrcSplit
  31:             1         413152  [Lorg.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit;
 100:          1122          26928  org.apache.hadoop.hive.ql.io.orc.Metadata
{noformat}
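Until the cache is memory-sensitive, a hedged mitigation sketch is to shrink the entry count of the size-based cache (assuming hive.orc.cache.stripe.details.size is the knob behind it; it caps entries, not bytes):

{noformat}
set hive.orc.cache.stripe.details.size=1000;
{noformat}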



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12689) Support multiple spark sessions in one Hive Session

2015-12-15 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-12689:


 Summary: Support multiple spark sessions in one Hive Session
 Key: HIVE-12689
 URL: https://issues.apache.org/jira/browse/HIVE-12689
 Project: Hive
  Issue Type: Improvement
  Components: Spark
Reporter: Nemon Lou


As discussed in HIVE-12538, when one Hive connection is used concurrently, there should be more than one Spark session for that connection.
{quote}
A hive session may "own" more than one spark session in case of asynchronous queries. If a spark session is live (used to run a spark job), that spark session will not be used to run the next job. Therefore, whenever a spark configuration change is detected in the Hive session, we need to mark all the live Spark sessions as outdated. When we get a session from the pool and see that the flag is set, we destroy it and get a new one.
{quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12614) RESET command does not close spark session

2015-12-08 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-12614:


 Summary: RESET command does not close spark session
 Key: HIVE-12614
 URL: https://issues.apache.org/jira/browse/HIVE-12614
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: 1.3.0, 2.1.0
Reporter: Nemon Lou
Assignee: Nemon Lou
Priority: Minor






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12615) Do not start spark session when only explain

2015-12-08 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-12615:


 Summary: Do not start spark session when only explain 
 Key: HIVE-12615
 URL: https://issues.apache.org/jira/browse/HIVE-12615
 Project: Hive
  Issue Type: Improvement
Affects Versions: 1.3.0, 2.1.0
Reporter: Nemon Lou


When using beeline -e "set hive.execution.engine=spark;explain select count(*) from sometable", it is very slow because a Spark session is started on YARN even though the query is only explained.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12616) NullPointerException when spark session is reused to run a mapjoin

2015-12-08 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-12616:


 Summary: NullPointerException when spark session is reused to run 
a mapjoin
 Key: HIVE-12616
 URL: https://issues.apache.org/jira/browse/HIVE-12616
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: 1.3.0
Reporter: Nemon Lou
Assignee: Xuefu Zhang


The way to reproduce:
{noformat}
set hive.execution.engine=spark;
create table if not exists test(id int);
create table if not exists test1(id int);
insert into test values(1);
insert into test1 values(1);
select max(a.id) from test a ,test1 b
where a.id = b.id;
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12538) After set spark related config, SparkSession never get reused

2015-11-27 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-12538:


 Summary: After set spark related config, SparkSession never get 
reused
 Key: HIVE-12538
 URL: https://issues.apache.org/jira/browse/HIVE-12538
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: 1.3.0
Reporter: Nemon Lou


Hive on Spark, yarn-cluster mode.
After running "set spark.yarn.queue=QueueA;", run the query "select count(*) from test" 3 times and you will find 3 different YARN applications.
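In script form (a sketch of the same steps):

{noformat}
set spark.yarn.queue=QueueA;
select count(*) from test;  -- run 3 times; each run shows up as a new YARN application
{noformat}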




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12496) Open ServerTransport After MetaStore Initialization

2015-11-23 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-12496:


 Summary: Open ServerTransport After MetaStore Initialization 
 Key: HIVE-12496
 URL: https://issues.apache.org/jira/browse/HIVE-12496
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Affects Versions: 1.2.1
 Environment: Standalone MetaStore, cluster mode(multiple instances)
Reporter: Nemon Lou
Assignee: Nemon Lou
Priority: Minor


During HiveMetaStore startup, the following steps should be reordered:
1, Creation of TServerSocket
2, Creation of HMSHandler
3, Creation of TThreadPoolServer

Step 2 involves some initialization work including :
{noformat}
  createDefaultDB();
  createDefaultRoles();
  addAdminUsers();
{noformat}

The TServerSocket should be created after this initialization work, to prevent unnecessary waiting on the client side. And if there are errors during initialization (multiple metastores creating the default DB at the same time can cause errors), clients should not connect to this metastore, as it will shut down due to the error.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12480) Hive Counters "RECORDS_OUT" is wrong when using union all

2015-11-20 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-12480:


 Summary: Hive Counters "RECORDS_OUT"  is wrong when using union 
all 
 Key: HIVE-12480
 URL: https://issues.apache.org/jira/browse/HIVE-12480
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Affects Versions: 1.2.1
Reporter: Nemon Lou
Priority: Minor


1, prepare:
{noformat}
set hive.execution.engine=mr;
CREATE TABLE IF NOT EXISTS test(id INT);
insert into test values (1), (2);
{noformat}
2, the query that returns a wrong counter:
{noformat}
set hive.execution.engine=mr;
insert into test select * from test union all select * from test;
{noformat}
The counter "RECORDS_OUT_1_default.test" is expected to be 4, but is actually 8.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12464) Inconsistent behavior between MapReduce and Spark engine on bucketed mapjoin

2015-11-19 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-12464:


 Summary: Inconsistent behavior between MapReduce and Spark engine 
on bucketed mapjoin
 Key: HIVE-12464
 URL: https://issues.apache.org/jira/browse/HIVE-12464
 Project: Hive
  Issue Type: Bug
  Components: Query Planning, Spark
Affects Versions: 1.2.1
Reporter: Nemon Lou


Steps to reproduce:
1, prepare the table and data:
{noformat}
create table if not exists lxw_test(imei string,sndaid string,data_time string)
CLUSTERED BY(imei) SORTED BY(imei) INTO 10 BUCKETS;
create table if not exists lxw_test1(imei string,sndaid string,data_time string)
CLUSTERED BY(imei) SORTED BY(imei) INTO 5 BUCKETS;
set hive.enforce.bucketing = true;
set hive.enforce.sorting = true;
insert overwrite table lxw_test
values(1,1,1),(2,2,2),(3,3,3),(4,4,4),(5,5,5),(6,6,6),(7,7,7),(8,8,8),(9,9,9),(10,10,10);
insert overwrite table lxw_test1
values 
(1,1,1),(2,2,2),(3,3,3),(4,4,4),(5,5,5),(6,6,6),(7,7,7),(8,8,8),(9,9,9),(10,10,10);
set hive.enforce.bucketing;
insert into table lxw_test1 select * from lxw_test;
set hive.optimize.bucketmapjoin = true;
set hive.optimize.bucketmapjoin.sortedmerge = true;
set hive.input.format=org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
{noformat}
2, the following SQL will succeed:
{noformat}
set hive.execution.engine=mr;
select  count(1) 
from lxw_test1 a 
join lxw_test b 
on a.imei = b.imei ;
{noformat}
3, this one will fail:
{noformat}
set hive.execution.engine=spark;
select  count(1) 
from lxw_test1 a 
join lxw_test b 
on a.imei = b.imei ;
{noformat}
On Spark, the query returns this error:
{noformat}
Error: Error while compiling statement: FAILED: SemanticException [Error 
10141]: Bucketed table metadata is not correct. Fix the metadata or don't use 
bucketed mapjoin, by setting hive.enforce.bucketmapjoin to false. The number of 
buckets for table lxw_test1 is 5, whereas the number of files is 10 
(state=42000,code=10141)
{noformat}
After setting hive.ignore.mapjoin.hint=false and using the mapjoin hint, the MapReduce engine returns the same error.
{noformat}
set hive.execution.engine=mr;
set hive.ignore.mapjoin.hint=false;
explain
select /*+ mapjoin(b) */ count(1) 
from lxw_test1 a 
join lxw_test b 
on a.imei = b.imei ;
{noformat}




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12432) Hive on Spark Counter "RECORDS_OUT" always be zero

2015-11-17 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-12432:


 Summary: Hive on Spark Counter "RECORDS_OUT" always  be zero
 Key: HIVE-12432
 URL: https://issues.apache.org/jira/browse/HIVE-12432
 Project: Hive
  Issue Type: Bug
  Components: Spark, Statistics
Affects Versions: 1.2.1
Reporter: Nemon Lou
Assignee: Nemon Lou


A simple way to reproduce:
{noformat}
set hive.execution.engine=spark;
CREATE TABLE test(id INT);
insert into test values (1), (2);
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12382) return actual row count for JDBC executeUpdate

2015-11-10 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-12382:


 Summary: return actual row count for JDBC executeUpdate
 Key: HIVE-12382
 URL: https://issues.apache.org/jira/browse/HIVE-12382
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Reporter: Nemon Lou
Assignee: Nemon Lou
Priority: Minor


When running SQL like 'insert into/overwrite table',
users may want to know how many rows were inserted.
Returning the actual row count from HiveStatement.executeUpdate is useful in such cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12371) Adding a timeout connection parameter for JDBC

2015-11-09 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-12371:


 Summary: Adding a timeout connection parameter for JDBC
 Key: HIVE-12371
 URL: https://issues.apache.org/jira/browse/HIVE-12371
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Reporter: Nemon Lou
Assignee: Vaibhav Gumashta


There are some timeout settings on the server side:
HIVE-4766
HIVE-6679
Adding a timeout connection parameter for JDBC is useful in some scenarios:
1, Beeline (which cannot set a timeout manually)
2, customizing the timeout per connection (across Hive and RDBMSs, which cannot 
be done via DriverManager.setLoginTimeout())
Just like PostgreSQL,
{noformat}
jdbc:postgresql://localhost/test?user=fred&password=secret&ssl=true&loginTimeout=0
{noformat}
or MySQL:
{noformat}
jdbc:mysql://xxx.xx.xxx.xxx:3306/database?connectTimeout=6&socketTimeout=6
{noformat}
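
For context, the only client-side knob today, DriverManager.setLoginTimeout, is JVM-global, which is exactly why a per-connection URL parameter would help; a rough sketch (the Hive URL parameter shown in the comment is hypothetical, not an existing option):
{code}
import java.sql.Connection;
import java.sql.DriverManager;

public class TimeoutDemo {
    public static void main(String[] args) throws Exception {
        // JVM-global: affects every DriverManager connection in this
        // process, Hive and RDBMS alike.
        DriverManager.setLoginTimeout(60);
        Connection hive = DriverManager.getConnection(
                "jdbc:hive2://localhost:10000/default", "", "");
        // A per-connection parameter, as in PostgreSQL/MySQL, would instead
        // look something like this (parameter name is hypothetical):
        // jdbc:hive2://localhost:10000/default?loginTimeout=60
        hive.close();
    }
}
{code}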



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-11768) java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances

2015-09-08 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-11768:


 Summary: java.io.DeleteOnExitHook leaks memory on long running 
Hive Server2 Instances
 Key: HIVE-11768
 URL: https://issues.apache.org/jira/browse/HIVE-11768
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 1.2.1
Reporter: Nemon Lou


  More than 490,000 paths were added to java.io.DeleteOnExitHook on one of our 
long-running HiveServer2 instances, taking up more than 100MB of heap.
  Most of the paths carry a suffix of ".pipeout".
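
The JDK behavior behind the leak: File.deleteOnExit registers the path in the static java.io.DeleteOnExitHook set and never removes it before JVM shutdown, so a long-lived server accumulates entries even after the files themselves are deleted. A minimal sketch (prefix and suffix are placeholders):
{code}
import java.io.File;

public class DeleteOnExitLeak {
    public static void main(String[] args) throws Exception {
        for (int i = 0; i < 100_000; i++) {
            File f = File.createTempFile("session", ".pipeout");
            // Each call adds the absolute path to the static
            // DeleteOnExitHook set; the entry stays on the heap until the
            // JVM exits, even if the file is deleted right away.
            f.deleteOnExit();
            f.delete();
        }
    }
}
{code}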




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-11244) Beeline prompt info improvement for cluster mode

2015-07-13 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-11244:


 Summary: Beeline prompt info improvement for cluster mode
 Key: HIVE-11244
 URL: https://issues.apache.org/jira/browse/HIVE-11244
 Project: Hive
  Issue Type: Improvement
  Components: Beeline
Reporter: Nemon Lou
Assignee: Nemon Lou
Priority: Minor


Currently the Beeline prompt info for cluster mode is like this:
{noformat}
0: jdbc:hive2://192.168.115.1:24002,192.168.1
{noformat}
Showing the IP of the very HiveServer2 instance that this Beeline session connects 
to would be more helpful for users.
Like this:
{noformat}
0: jdbc:hive2://192.168.115.1:24002
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-11243) Changing log level in Utilities.getBaseWork

2015-07-13 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-11243:


 Summary: Changing log level in Utilities.getBaseWork
 Key: HIVE-11243
 URL: https://issues.apache.org/jira/browse/HIVE-11243
 Project: Hive
  Issue Type: Bug
  Components: Logging
Affects Versions: 1.2.0
Reporter: Nemon Lou
Assignee: Nemon Lou
Priority: Minor


Seeing a lot of this kind of log when running jobs without any reduce stage; changing 
this log to debug level should be OK.
{noformat}
2015-07-10 15:13:52,910 | INFO  | HiveServer2-Background-Pool: Thread-6074 | 
File not found: File does not exist: 
/tmp/hive-scratch/admin/3f70dbe7-96c0-41be-baac-72f4a2e45ea0/hive_2015-07-10_15-13-40_991_7379130813954010484-5/-mr-10008/ef20bbe4-9311-4633-9057-e018ce08cc00/reduce.xml
 at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)
 at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
 at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1834)
 at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1805)
 at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1718)
 at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:589)
 at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:367)
 at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
 at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:972)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2088)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2084)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1672)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2084)
 | org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:456)
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-10817) Blacklist For Bad MetaStore

2015-05-25 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-10817:


 Summary: Blacklist For Bad MetaStore
 Key: HIVE-10817
 URL: https://issues.apache.org/jira/browse/HIVE-10817
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2, Metastore
Affects Versions: 1.2.0
Reporter: Nemon Lou
Assignee: Nemon Lou


During a reliability test, when the machine of one MetaStore powered down, 
HiveServer2 never submitted jobs to YARN again.
There are 100 JDBC clients (Beeline) running concurrently, and all 100 
JDBC clients hang.
After checking HiveServer2's thread stacks, I found that most of the threads are 
waiting to lock AbstractService, while the one holding it is trying to connect to 
the bad MetaStore that has been powered down. When the thread holding this 
lock finally gets a SocketTimeoutException and releases it, another thread 
takes the lock and is again stuck until the socket times out.
Adding a new blacklist mechanism finally solved this issue. 
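
For illustration, a rough sketch of the kind of blacklist meant here (all names and the cool-down value are hypothetical): a URI that just failed is skipped for a cool-down period instead of being retried by every handler thread:
{code}
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class MetaStoreBlacklist {
    private static final long COOL_DOWN_MS = 60_000; // hypothetical cool-down
    private final Map<String, Long> failedAt = new ConcurrentHashMap<>();

    public void markBad(String uri) {
        failedAt.put(uri, System.currentTimeMillis());
    }

    /** Returns the first URI that is not currently blacklisted. */
    public String pick(List<String> uris) {
        long now = System.currentTimeMillis();
        for (String uri : uris) {
            Long t = failedAt.get(uri);
            if (t == null || now - t > COOL_DOWN_MS) {
                return uri;
            }
        }
        // Every URI is blacklisted; fall back to the first one.
        return uris.get(0);
    }
}
{code}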




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-10815) Let HiveMetaStoreClient Choose MetaStore Randomly

2015-05-25 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-10815:


 Summary: Let HiveMetaStoreClient Choose MetaStore Randomly
 Key: HIVE-10815
 URL: https://issues.apache.org/jira/browse/HIVE-10815
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2, Metastore
Affects Versions: 1.2.0
Reporter: Nemon Lou
Assignee: Nemon Lou


Currently HiveMetaStoreClient uses a fixed order to choose among MetaStore URIs when 
multiple metastores are configured.
 Choosing a MetaStore randomly would be good for load balancing.
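
A minimal sketch of the idea (names are illustrative, not the actual HiveMetaStoreClient code):
{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class RandomMetaStorePicker {
    /** uriList as in hive.metastore.uris, e.g. "thrift://ms1:9083,thrift://ms2:9083". */
    public static List<String> ordered(String uriList) {
        List<String> uris = new ArrayList<>(Arrays.asList(uriList.split(",")));
        // Shuffle instead of keeping the fixed configured order, so each
        // client spreads its connections across the metastores.
        Collections.shuffle(uris);
        return uris;
    }

    public static void main(String[] args) {
        System.out.println(ordered("thrift://ms1:9083,thrift://ms2:9083"));
    }
}
{code}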



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-10781) HadoopJobExecHelper Leaks RunningJobs

2015-05-21 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-10781:


 Summary: HadoopJobExecHelper Leaks RunningJobs
 Key: HIVE-10781
 URL: https://issues.apache.org/jira/browse/HIVE-10781
 Project: Hive
  Issue Type: Bug
  Components: Hive, HiveServer2
Affects Versions: 1.2.0, 0.13.1
Reporter: Nemon Lou


On one of our busy Hadoop clusters, HiveServer2 holds more than 4000 
org.apache.hadoop.mapred.JobClient$NetworkedJob instances, while it has fewer 
than 3 background handler threads at the same time.
All these instances are held in one LinkedList, the runningJobs property of 
org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper, which is static.
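
The leak pattern in miniature (class and method names below are illustrative, not the Hive source):
{code}
import java.util.LinkedList;
import java.util.List;

public class StaticListLeak {
    // Static, so it lives for the whole server lifetime; entries added per
    // query are never removed, mirroring HadoopJobExecHelper.runningJobs.
    private static final List<Object> runningJobs = new LinkedList<>();

    static void onQuery(Object networkedJob) {
        runningJobs.add(networkedJob);
        // Missing: runningJobs.remove(networkedJob) when the job finishes,
        // so the list grows without bound.
    }
}
{code}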



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-10625) Handle Authorization for 'select expr' hive queries in SQL Standard Authorization

2015-05-06 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-10625:


 Summary: Handle Authorization for  'select expr' hive queries in 
 SQL Standard Authorization
 Key: HIVE-10625
 URL: https://issues.apache.org/jira/browse/HIVE-10625
 Project: Hive
  Issue Type: Bug
  Components: Authorization, SQLStandardAuthorization
Affects Versions: 1.1.0
Reporter: Nemon Lou


Hive internally rewrites a 'select expression' query into 'select 
expression from _dummy_database._dummy_table', where the dummy db and table 
are temporary entities for the current query.
SQL Standard Authorization needs to handle these special objects.

Typing select reverse(123); in Beeline will give this error:
{code}
Error: Error while compiling statement: FAILED: HiveAuthzPluginException Error 
getting object from metastore for Object [type=TABLE_OR_VIEW, 
name=_dummy_database._dummy_table] (state=42000,code=4)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-10417) Parallel Order By return wrong results for partitioned tables

2015-04-21 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-10417:


 Summary: Parallel Order By return wrong results for partitioned 
tables
 Key: HIVE-10417
 URL: https://issues.apache.org/jira/browse/HIVE-10417
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.0.0, 0.13.1, 0.14.0
Reporter: Nemon Lou


Following is the script that reproduces this bug.
{noformat}
set hive.optimize.sampling.orderby=true;
set mapreduce.job.reduces=10;
select * from src order by key desc limit 10;
+----------+------------+
| src.key  | src.value  |
+----------+------------+
| 98       | val_98     |
| 98       | val_98     |
| 97       | val_97     |
| 97       | val_97     |
| 96       | val_96     |
| 95       | val_95     |
| 95       | val_95     |
| 92       | val_92     |
| 90       | val_90     |
| 90       | val_90     |
+----------+------------+
10 rows selected (47.916 seconds)
reset;
create table src_orc_p (key string, value string)
partitioned by (kp string)
stored as orc
tblproperties('orc.compress'='SNAPPY');
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.max.dynamic.partitions.pernode=1;
set hive.exec.max.dynamic.partitions=1;
insert into table src_orc_p partition(kp) select *,substring(key,1) from src 
distribute by substring(key,1);
set mapreduce.job.reduces=10;
set hive.optimize.sampling.orderby=true;
select * from src_orc_p order by key desc limit 10;
+----------------+------------------+-----------------+
| src_orc_p.key  | src_orc_p.value  | src_orc_p.kend  |
+----------------+------------------+-----------------+
| 0              | val_0            | 0               |
| 0              | val_0            | 0               |
| 0              | val_0            | 0               |
+----------------+------------------+-----------------+
3 rows selected (39.861 seconds)
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9839) HiveServer2 leaks OperationHandle on failed async queries

2015-03-03 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-9839:
---

 Summary: HiveServer2 leaks OperationHandle on failed async queries
 Key: HIVE-9839
 URL: https://issues.apache.org/jira/browse/HIVE-9839
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 1.0.0, 0.13.1, 0.14.0
Reporter: Nemon Lou


Using Beeline to connect to HiveServer2, type the following:
{noformat}
drop table if exists table_not_exists;
select * from table_not_exists;
{noformat}

There will be an OperationHandle object staying in HiveServer2's memory forever, 
even after quitting Beeline.
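
The same repro through raw JDBC, for reference (URL is a placeholder): each failed execute leaves an operation handle on the server, even after the connection is closed:
{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class LeakRepro {
    public static void main(String[] args) throws Exception {
        Connection conn = DriverManager.getConnection(
                "jdbc:hive2://localhost:10000/default", "", "");
        Statement stmt = conn.createStatement();
        stmt.execute("drop table if exists table_not_exists");
        try {
            stmt.execute("select * from table_not_exists");
        } catch (SQLException expected) {
            // Compilation fails, but the server-side OperationHandle remains.
        }
        stmt.close();
        conn.close(); // the handle is still retained in HiveServer2's memory
    }
}
{code}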





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9100) HiveServer2 fail to connect to MetaStore after MetaStore restarting

2015-01-26 Thread Nemon Lou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14291782#comment-14291782
 ] 

Nemon Lou commented on HIVE-9100:
-

Mariusz Strzelecki is right. After changing the metastore's TokenStore from memory 
to DB, the error disappears. Thanks, Mariusz Strzelecki.

 HiveServer2 fail to connect to MetaStore after MetaStore restarting 
 

 Key: HIVE-9100
 URL: https://issues.apache.org/jira/browse/HIVE-9100
 Project: Hive
  Issue Type: Bug
  Components: Authentication, HiveServer2, Security
Affects Versions: 0.14.0, 0.13.1
Reporter: Nemon Lou
 Attachments: hiveserver2.log, metastore.log


 Secure cluster with kerberos,remote metastore.
 How to reproduce :
 1,use beeline to connect to HiveServer2
 2,restart the MetaStore process
 3,type command like 'show tables' in beeline
 Client side will report this error:
 {quote}
 Error: Error while processing statement: FAILED: Execution Error, return code 
 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Could 
 not connect to meta store using any of the URIs provided. Most recent 
 failure: org.apache.thrift.transport.TTransportException: Peer indicated 
 failure: DIGEST-MD5: IO error acquiring password
   at 
 org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:190)
 {quote}
 HiveServer2's log and metastore's log are uploaded as attachments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7797) upgrade hive schema from 0.9.0 to 0.13.1 failed

2014-12-17 Thread Nemon Lou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14251006#comment-14251006
 ] 

Nemon Lou commented on HIVE-7797:
-

Review link: https://reviews.apache.org/r/29136/

 upgrade hive schema from 0.9.0 to 0.13.1 failed 
 

 Key: HIVE-7797
 URL: https://issues.apache.org/jira/browse/HIVE-7797
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.14.0, 0.13.1
Reporter: Nemon Lou
 Attachments: HIVE-7797.1.patch


 Using hive schema tool with the following command to upgrade hive schema 
 failed:
 schematool -dbType postgres -upgradeSchemaFrom 0.9.0
 ERROR: null value in column SCHEMA_VERSION violates not-null constraint
 Log shows that the upgrade sql file 014-HIVE-3764.postgres.sql failed.
 The sql in it is :
 INSERT INTO VERSION (VER_ID, SCHEMA_VERSION, VERSION_COMMENT) VALUES 
 (1, '', 'Initial value');
 And the result is:
 ERROR: null value in column SCHEMA_VERSION violates not-null constraint
 DETAIL: Failing row contains (1, null, Initial value).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7797) upgrade hive schema from 0.9.0 to 0.13.1 failed

2014-12-16 Thread Nemon Lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated HIVE-7797:

Summary: upgrade hive schema from 0.9.0 to 0.13.1 failed   (was: upgrade 
sql 014-HIVE-3764.postgres.sql failed)

 upgrade hive schema from 0.9.0 to 0.13.1 failed 
 

 Key: HIVE-7797
 URL: https://issues.apache.org/jira/browse/HIVE-7797
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.13.1
Reporter: Nemon Lou

 The sql is :
 INSERT INTO VERSION (VER_ID, SCHEMA_VERSION, VERSION_COMMENT) VALUES 
 (1, '', 'Initial value');
 And the result is:
 ERROR:  null value in column SCHEMA_VERSION violates not-null constraint
 DETAIL:  Failing row contains (1, null, Initial value).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7797) upgrade hive schema from 0.9.0 to 0.13.1 failed

2014-12-16 Thread Nemon Lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated HIVE-7797:

Description: 
Using hive schema tool with the following command to upgrade hive schema failed:
schematool -dbType postgres -upgradeSchemaFrom 0.9.0

ERROR: null value in column SCHEMA_VERSION violates not-null constraint

  was:
The sql is :
INSERT INTO VERSION (VER_ID, SCHEMA_VERSION, VERSION_COMMENT) VALUES 
(1, '', 'Initial value');

And the result is:
ERROR:  null value in column SCHEMA_VERSION violates not-null constraint
DETAIL:  Failing row contains (1, null, Initial value).


 upgrade hive schema from 0.9.0 to 0.13.1 failed 
 

 Key: HIVE-7797
 URL: https://issues.apache.org/jira/browse/HIVE-7797
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.13.1
Reporter: Nemon Lou

 Using hive schema tool with the following command to upgrade hive schema 
 failed:
 schematool -dbType postgres -upgradeSchemaFrom 0.9.0
 ERROR: null value in column SCHEMA_VERSION violates not-null constraint



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7797) upgrade hive schema from 0.9.0 to 0.13.1 failed

2014-12-16 Thread Nemon Lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated HIVE-7797:

Affects Version/s: 0.14.0

 upgrade hive schema from 0.9.0 to 0.13.1 failed 
 

 Key: HIVE-7797
 URL: https://issues.apache.org/jira/browse/HIVE-7797
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.14.0, 0.13.1
Reporter: Nemon Lou

 Using hive schema tool with the following command to upgrade hive schema 
 failed:
 schematool -dbType postgres -upgradeSchemaFrom 0.9.0
 ERROR: null value in column SCHEMA_VERSION violates not-null constraint
 Log shows that the upgrade sql file 014-HIVE-3764.postgres.sql failed.
 The sql in it is :
 INSERT INTO VERSION (VER_ID, SCHEMA_VERSION, VERSION_COMMENT) VALUES 
 (1, '', 'Initial value');
 And the result is:
 ERROR: null value in column SCHEMA_VERSION violates not-null constraint
 DETAIL: Failing row contains (1, null, Initial value).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7797) upgrade hive schema from 0.9.0 to 0.13.1 failed

2014-12-16 Thread Nemon Lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated HIVE-7797:

Description: 
Using hive schema tool with the following command to upgrade hive schema failed:
schematool -dbType postgres -upgradeSchemaFrom 0.9.0

ERROR: null value in column SCHEMA_VERSION violates not-null constraint

Log shows that the upgrade sql file 014-HIVE-3764.postgres.sql failed.
The sql in it is :
INSERT INTO VERSION (VER_ID, SCHEMA_VERSION, VERSION_COMMENT) VALUES 
(1, '', 'Initial value');
And the result is:
ERROR: null value in column SCHEMA_VERSION violates not-null constraint
DETAIL: Failing row contains (1, null, Initial value).

  was:
Using hive schema tool with the following command to upgrade hive schema failed:
schematool -dbType postgres -upgradeSchemaFrom 0.9.0

ERROR: null value in column SCHEMA_VERSION violates not-null constraint


 upgrade hive schema from 0.9.0 to 0.13.1 failed 
 

 Key: HIVE-7797
 URL: https://issues.apache.org/jira/browse/HIVE-7797
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.14.0, 0.13.1
Reporter: Nemon Lou

 Using hive schema tool with the following command to upgrade hive schema 
 failed:
 schematool -dbType postgres -upgradeSchemaFrom 0.9.0
 ERROR: null value in column SCHEMA_VERSION violates not-null constraint
 Log shows that the upgrade sql file 014-HIVE-3764.postgres.sql failed.
 The sql in it is :
 INSERT INTO VERSION (VER_ID, SCHEMA_VERSION, VERSION_COMMENT) VALUES 
 (1, '', 'Initial value');
 And the result is:
 ERROR: null value in column SCHEMA_VERSION violates not-null constraint
 DETAIL: Failing row contains (1, null, Initial value).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7797) upgrade hive schema from 0.9.0 to 0.13.1 failed

2014-12-16 Thread Nemon Lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated HIVE-7797:

Attachment: HIVE-7797.1.patch

Using a blank space instead of '', so Postgres won't convert the empty string 
into null.

 upgrade hive schema from 0.9.0 to 0.13.1 failed 
 

 Key: HIVE-7797
 URL: https://issues.apache.org/jira/browse/HIVE-7797
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.14.0, 0.13.1
Reporter: Nemon Lou
 Attachments: HIVE-7797.1.patch


 Using hive schema tool with the following command to upgrade hive schema 
 failed:
 schematool -dbType postgres -upgradeSchemaFrom 0.9.0
 ERROR: null value in column SCHEMA_VERSION violates not-null constraint
 Log shows that the upgrade sql file 014-HIVE-3764.postgres.sql failed.
 The sql in it is :
 INSERT INTO VERSION (VER_ID, SCHEMA_VERSION, VERSION_COMMENT) VALUES 
 (1, '', 'Initial value');
 And the result is:
 ERROR: null value in column SCHEMA_VERSION violates not-null constraint
 DETAIL: Failing row contains (1, null, Initial value).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9100) HiveServer2 fail to connect to MetaStore after MetaStore restarting

2014-12-15 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-9100:
---

 Summary: HiveServer2 fail to connect to MetaStore after MetaStore 
restarting 
 Key: HIVE-9100
 URL: https://issues.apache.org/jira/browse/HIVE-9100
 Project: Hive
  Issue Type: Bug
  Components: Authentication, HiveServer2, Security
Affects Versions: 0.13.1, 0.14.0
Reporter: Nemon Lou


Secure cluster with Kerberos, remote metastore.
How to reproduce:
1, use Beeline to connect to HiveServer2
2, restart the MetaStore process
3, type a command like 'show tables' in Beeline

Client side will report this error:
{quote}
Error: Error while processing statement: FAILED: Execution Error, return code 1 
from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Could not 
connect to meta store using any of the URIs provided. Most recent failure: 
org.apache.thrift.transport.TTransportException: Peer indicated failure: 
DIGEST-MD5: IO error acquiring password
at 
org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:190)
{quote}

HiveServer2's log and metastore's log are uploaded as attachments.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9100) HiveServer2 fail to connect to MetaStore after MetaStore restarting

2014-12-15 Thread Nemon Lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated HIVE-9100:

Attachment: metastore.log
hiveserver2.log

 HiveServer2 fail to connect to MetaStore after MetaStore restarting 
 

 Key: HIVE-9100
 URL: https://issues.apache.org/jira/browse/HIVE-9100
 Project: Hive
  Issue Type: Bug
  Components: Authentication, HiveServer2, Security
Affects Versions: 0.14.0, 0.13.1
Reporter: Nemon Lou
 Attachments: hiveserver2.log, metastore.log


 Secure cluster with kerberos,remote metastore.
 How to reproduce :
 1,use beeline to connect to HiveServer2
 2,restart the MetaStore process
 3,type command like 'show tables' in beeline
 Client side will report this error:
 {quote}
 Error: Error while processing statement: FAILED: Execution Error, return code 
 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Could 
 not connect to meta store using any of the URIs provided. Most recent 
 failure: org.apache.thrift.transport.TTransportException: Peer indicated 
 failure: DIGEST-MD5: IO error acquiring password
   at 
 org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:190)
 {quote}
 HiveServer2's log and metastore's log are uploaded as attachments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9095) permanent functions' ClassLoader should be global instead of per-session

2014-12-12 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-9095:
---

 Summary: permanent functions' ClassLoader should be global instead 
of per-session
 Key: HIVE-9095
 URL: https://issues.apache.org/jira/browse/HIVE-9095
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2, UDF
Affects Versions: 0.13.1, 0.14.0
Reporter: Nemon Lou


FunctionRegistry.mFunctions is static. That means that in the HS2 case, all users 
share the same UDF class object from mFunctions, which leads to sharing the 
same classloader that loaded this class. 

First, this makes the per-session classloader useless, because only the first 
classloader will be used to initialize the instances of the permanent UDF class.
Second, it will cause a class-not-found exception when the classloader created 
by the first session is closed before all the needed classes are loaded.
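
A miniature of the problem (names are illustrative): a static registry caches the Class object loaded by the first session's classloader, so every later session keeps depending on that loader:
{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class StaticFunctionRegistry {
    // Static map shared by all sessions, like FunctionRegistry.mFunctions.
    private static final Map<String, Class<?>> mFunctions = new ConcurrentHashMap<>();

    static Class<?> lookup(String name, ClassLoader sessionLoader) {
        // computeIfAbsent caches the Class loaded by the FIRST session's
        // loader; later sessions get that same Class, and instantiating it
        // can trigger further loads through the first (possibly closed) loader.
        return mFunctions.computeIfAbsent(name, n -> {
            try {
                // "com.example.MyUDF" is a hypothetical permanent UDF class.
                return Class.forName("com.example.MyUDF", true, sessionLoader);
            } catch (ClassNotFoundException e) {
                throw new RuntimeException(e);
            }
        });
    }
}
{code}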



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9095) permanent functions' ClassLoader should be global instead of per-session

2014-12-12 Thread Nemon Lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated HIVE-9095:

Description: 
FunctionRegistry.mFunctions is static. That means that in HS2 case, all users 
will share the same UDF class object from  mFunctions ,which lead to share the 
same classloader that load this class. 

First,this will make the per-session classloader useless.Because only the first 
classLoader will be used to initailize the instances of the permanent UDF class.
Second, it's will cause class not found exception,when the classLoader created 
by the first session has been closed before load all the classes that need.

  was:
FunctionRegistry.mFunctions is static. That means that in HS2 case, all users 
will share the same UDF class object from  mFunctions ,which lead to share the 
same classloader that load this class. 

First,this will make the per-session classloader useless.Because only the first 
classLoader will be used to initailize the instances of the permanent UDF class.
Second, it's will cause class not found exception,when the classLoader created 
by the first session be closed before load all the classes that need.


 permanent functions' ClassLoader should be global instead of per-session
 

 Key: HIVE-9095
 URL: https://issues.apache.org/jira/browse/HIVE-9095
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2, UDF
Affects Versions: 0.14.0, 0.13.1
Reporter: Nemon Lou

 FunctionRegistry.mFunctions is static. That means that in HS2 case, all users 
 will share the same UDF class object from  mFunctions ,which lead to share 
 the same classloader that load this class. 
 First,this will make the per-session classloader useless.Because only the 
 first classLoader will be used to initailize the instances of the permanent 
 UDF class.
 Second, it's will cause class not found exception,when the classLoader 
 created by the first session has been closed before load all the classes that 
 need.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7021) HiveServer2 memory leak on failed queries

2014-12-11 Thread Nemon Lou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243680#comment-14243680
 ] 

Nemon Lou commented on HIVE-7021:
-

Even without HIVE-4629, HiveServer2 can leak OperationHandle on failed queries.
When a JDBC client runs a query like select * from table_not_exists,
HiveServer2 fails the query during compilation but leaves an OperationHandle in 
memory (due to async mode) without passing it back to the client side.
Shall I file a new JIRA for this? Or would you fix it in this patch?

 HiveServer2 memory leak on failed queries
 -

 Key: HIVE-7021
 URL: https://issues.apache.org/jira/browse/HIVE-7021
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam
 Attachments: HIVE-4629+HIVE-7021.1.patch, HIVE-7021.1.patch


 The number of the following objects keeps increasing if a query causes an 
 exception:
 org.apache.hive.service.cli.HandleIdentifier
 org.apache.hive.service.cli.OperationHandle
 org.apache.hive.service.cli.log.LinkedStringBuffer
 org.apache.hive.service.cli.log.OperationLog
 The leak can be observed using a JDBC client that runs something like this
   connection = DriverManager.getConnection("jdbc:hive2://" + hostname + ":1/default", "", "");
   statement  = connection.createStatement();
   statement.execute("CREATE TEMPORARY FUNCTION dummy_function AS 'dummy.class.name'");
 The above SQL will fail if HS2 cannot load dummy.class.name class. Each 
 iteration of such query will result in +1 increase in instance count for the 
 classes mentioned above.
 This will eventually cause OOM in the HS2 service.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8418) Upgrade to Thrift 0.9.1

2014-10-09 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-8418:
---

 Summary: Upgrade to Thrift 0.9.1
 Key: HIVE-8418
 URL: https://issues.apache.org/jira/browse/HIVE-8418
 Project: Hive
  Issue Type: Task
  Components: Server Infrastructure
Affects Versions: 0.13.1
Reporter: Nemon Lou


THRIFT-1869 fixes a crash in HS2 when the thrift thread pool is consumed.
The patch has been included in Thrift 0.9.1 .



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-4224) Upgrade to Thrift 1.0 when available

2014-09-30 Thread Nemon Lou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14152849#comment-14152849
 ] 

Nemon Lou commented on HIVE-4224:
-

THRIFT-1869 has been fixed in Thrift 0.9.1, which was released on 21/Aug/13.
Any plan to upgrade Thrift to 0.9.1?

 Upgrade to Thrift 1.0 when available
 

 Key: HIVE-4224
 URL: https://issues.apache.org/jira/browse/HIVE-4224
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2, Metastore, Server Infrastructure
Affects Versions: 0.11.0
Reporter: Brock Noland
Priority: Minor





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-7797) upgrade sql 014-HIVE-3764.postgres.sql failed

2014-08-20 Thread Nemon Lou (JIRA)
Nemon Lou created HIVE-7797:
---

 Summary: upgrade sql 014-HIVE-3764.postgres.sql failed
 Key: HIVE-7797
 URL: https://issues.apache.org/jira/browse/HIVE-7797
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.13.1
Reporter: Nemon Lou


The sql is :
INSERT INTO VERSION (VER_ID, SCHEMA_VERSION, VERSION_COMMENT) VALUES 
(1, '', 'Initial value');

And the result is:
ERROR:  null value in column SCHEMA_VERSION violates not-null constraint
DETAIL:  Failing row contains (1, null, Initial value).



--
This message was sent by Atlassian JIRA
(v6.2#6252)