question about hive hbase handler

2017-11-10 Thread Lionel CL
Hi dev,
I need to do some development on the Hive HBase handler, but for now I can only
package it first and then run HQL in the Hive CLI or Beeline.
Is there any guidance on how to run HQL from IDEA and configure a connection to
a local HBase so that I can debug the hive-hbase handler step by step?

Thanks & Regards,
Lionel




adding a label that would trigger HiveQA?

2017-11-10 Thread Sergey Shelukhin
Resubmitting the same patch for HiveQA because patches keep getting
dropped is getting old.
I wonder if we should have a label that would trigger HiveQA and only be
removed at the end, when posting results to the JIRA?
We could either add it alongside the current filter or trigger
mechanism or, if JIRA allows it, make it the only selection criterion and
add it automatically when clicking Submit Patch (and remove it on Cancel
Patch and when HiveQA has finished).

Comments/suggestions?
I can try to modify the scripts some day.



[jira] [Created] (HIVE-18046) Metastore: default IS_REWRITE_ENABLED=false instead of NULL

2017-11-10 Thread Gopal V (JIRA)
Gopal V created HIVE-18046:
--

 Summary: Metastore: default IS_REWRITE_ENABLED=false instead of NULL
 Key: HIVE-18046
 URL: https://issues.apache.org/jira/browse/HIVE-18046
 Project: Hive
  Issue Type: Bug
  Components: Materialized views, Metastore
Affects Versions: 3.0.0
Reporter: Gopal V
Priority: Minor


The materialized view implementation breaks write access for old metastore
SQL, which fails with a complaint that new table creation does not set this
column.

{code}
  `IS_REWRITE_ENABLED` bit(1) NOT NULL,
{code}

{{NOT NULL DEFAULT 0}} would allow old metastore directsql compatibility.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HIVE-18045) can VectorizedOrcAcidRowBatchReader be used all the time

2017-11-10 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-18045:
-

 Summary: can VectorizedOrcAcidRowBatchReader be used all the time
 Key: HIVE-18045
 URL: https://issues.apache.org/jira/browse/HIVE-18045
 Project: Hive
  Issue Type: Improvement
  Components: Transactions
Reporter: Eugene Koifman
Assignee: Eugene Koifman


Can we use VectorizedOrcAcidRowBatchReader for non-vectorized queries?
It would just need a wrapper on top of it to turn VRBs into rows.
This would mean there is just 1 acid reader to maintain - not 2.
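
A minimal sketch of what such a wrapper could look like, written against a simplified stand-in rather than Hive's real classes. `ColumnBatch` and `BatchToRowIterator` are hypothetical names introduced only for illustration; a real wrapper would also have to honor details of the actual batch type (for example selected-row indexes, repeating values and null flags), which this sketch ignores.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for a vectorized row batch: one array per column. */
final class ColumnBatch {
    final Object[][] columns; // columns[c][r] is the value of column c in row r
    final int size;           // number of valid rows in this batch

    ColumnBatch(Object[][] columns, int size) {
        this.columns = columns;
        this.size = size;
    }
}

/** Adapts a stream of column batches to a row-at-a-time iterator. */
final class BatchToRowIterator implements Iterator<Object[]> {
    private final Iterator<ColumnBatch> batches;
    private ColumnBatch current;
    private int nextRow;

    BatchToRowIterator(Iterator<ColumnBatch> batches) {
        this.batches = batches;
    }

    @Override
    public boolean hasNext() {
        // Skip exhausted (or empty) batches until a row is available.
        while (current == null || nextRow >= current.size) {
            if (!batches.hasNext()) {
                return false;
            }
            current = batches.next();
            nextRow = 0;
        }
        return true;
    }

    @Override
    public Object[] next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        // Copy one value from each column to build the row.
        Object[] row = new Object[current.columns.length];
        for (int c = 0; c < row.length; c++) {
            row[c] = current.columns[c][nextRow];
        }
        nextRow++;
        return row;
    }
}
```

The row-oriented caller only sees an `Iterator<Object[]>`, so the batch reader underneath stays the single implementation to maintain.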





write access to the hive wiki

2017-11-10 Thread Slim Bouguerra


-- 
Code Climb Ski Repeat ….
B-Slim
___/\/\/\___/\/\/\___/\/\/\___/\/\/\___/\/\/\___



Re: Review Request 63442: HIVE-17934 Merging Statistics are promoted to COMPLETE (most of the time)

2017-11-10 Thread Ashutosh Chauhan


> On Nov. 9, 2017, 7:51 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/llap/auto_sortmerge_join_12.q.out
> > Line 160 (original), 160 (patched)
> > 
> >
> > bucket_small has no stats gathered. This should be NONE.
> 
> Zoltan Haindrich wrote:
> `hive.stats.autogather` is enabled by default from `HiveConf`

Those are load statements, not inserts. We don't gather stats with load
statements, only with inserts.


> On Nov. 9, 2017, 7:51 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/llap/auto_sortmerge_join_12.q.out
> > Line 238 (original), 238 (patched)
> > 
> >
> > bucket_medium has no stats gathered. This should be NONE.
> 
> Zoltan Haindrich wrote:
> `hive.stats.autogather` is enabled by default from `HiveConf`

Those are load statements, not inserts. We don't gather stats with load 
statements.


> On Nov. 9, 2017, 7:51 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/llap/auto_sortmerge_join_12.q.out
> > Line 316 (original), 316 (patched)
> > 
> >
> > bucket_big has no stats gathered. This should be NONE.
> 
> Zoltan Haindrich wrote:
> `hive.stats.autogather` is enabled by default from `HiveConf`

Those are load statements, not inserts. We don't gather stats with load 
statements.


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63442/#review190633
---


On Nov. 9, 2017, 5:39 p.m., Zoltan Haindrich wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63442/
> ---
> 
> (Updated Nov. 9, 2017, 5:39 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Bugs: HIVE-17934
> https://issues.apache.org/jira/browse/HIVE-17934
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> * remove the reactive stat state guessing method
> * make the guessing only work when a new object is created
> * change the way stat objects are merged
> 
> this patch will most probably break almost all qtest outputs
> 
> 
> Diffs
> -
> 
>   accumulo-handler/src/test/results/positive/accumulo_queries.q.out b3adf4e504
>   hbase-handler/src/test/results/positive/hbase_queries.q.out b2eda12e95
>   hbase-handler/src/test/results/positive/hbasestats.q.out 29eefd43a9
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java 7a3fae65e8
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java a4f60accce
>   ql/src/java/org/apache/hadoop/hive/ql/plan/Statistics.java 8ffb4ce44b 
>   ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java ce7c96c639 
>   ql/src/test/queries/clientpositive/lateral_view_onview2.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/stats_empty_partition2.q PRE-CREATION 
>   ql/src/test/results/clientpositive/acid_table_stats.q.out 351ff0da0a 
>   ql/src/test/results/clientpositive/alterColumnStatsPart.q.out 858e16fe22 
>   ql/src/test/results/clientpositive/annotate_stats_part.q.out 3a94a6a4e3 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out 7875e9693a 
>   ql/src/test/results/clientpositive/cbo_const.q.out e9f885b363 
>   ql/src/test/results/clientpositive/cbo_input26.q.out 77fc194829 
>   ql/src/test/results/clientpositive/columnstats_partlvl_dp.q.out 414b715b7a 
>   ql/src/test/results/clientpositive/columnstats_quoting.q.out 683c1e274f 
>   ql/src/test/results/clientpositive/columnstats_tbllvl.q.out a2c6ead293 
>   ql/src/test/results/clientpositive/constGby.q.out c633624935 
>   ql/src/test/results/clientpositive/constant_prop_3.q.out cba4744866 
>   ql/src/test/results/clientpositive/constprog3.q.out f54168d0ee 
>   ql/src/test/results/clientpositive/correlationoptimizer10.q.out a03acd38a7 
>   ql/src/test/results/clientpositive/correlationoptimizer11.q.out cf2250790a 
>   ql/src/test/results/clientpositive/correlationoptimizer13.q.out 6d4f931213 
>   ql/src/test/results/clientpositive/correlationoptimizer14.q.out 149f33fee8 
>   ql/src/test/results/clientpositive/correlationoptimizer15.q.out 2d813b239f 
>   ql/src/test/results/clientpositive/correlationoptimizer5.q.out 68d6a54862 
>   ql/src/test/results/clientpositive/correlationoptimizer7.q.out 82fecab594 
>   ql/src/test/results/clientpositive/correlationoptimizer8.q.out f3cb988a03 
>   ql/src/test/results/clientpositive/correlationoptimizer9.q.out 5372408d2a 
>   ql/src/test/results/clientpositive/cte_mat_5.q.out 3747cec891 
>   

[jira] [Created] (HIVE-18044) CompactorMR.CompactorOutputCommitter.abortTask() not implemented

2017-11-10 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-18044:
-

 Summary: CompactorMR.CompactorOutputCommitter.abortTask() not implemented
 Key: HIVE-18044
 URL: https://issues.apache.org/jira/browse/HIVE-18044
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Reporter: Eugene Koifman


Can the missing abortTask() implementation explain the following?
{noformat}
Exception running child : org.apache.hadoop.fs.FileAlreadyExistsException: /apps/hiv/workmanagement.db/serviceorder_longtext/_tmp_40a7286b-da40-4624-baf3-4de12ec421f4/base_22699743/bucket_6 for client 10.1.71.22 already exists
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2784)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2671)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2555)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:735)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:408)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
{noformat}
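
One way a missing abort hook could lead to a trace like the one above, sketched under assumptions: if a failed attempt leaves its partial bucket file behind, a retried attempt that creates the same path without overwriting collides with it. The sketch uses local `java.nio.file` calls instead of HDFS or the Hadoop committer API, and `BucketAttempt` and `AbortDemo` are hypothetical names. It illustrates the suspected mechanism only and is not a confirmed diagnosis.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical task attempt that writes one bucket file under a temp dir. */
final class BucketAttempt {
    private final Path bucketFile;

    BucketAttempt(Path tmpDir, String bucketName) {
        this.bucketFile = tmpDir.resolve(bucketName);
    }

    /** Creates the bucket file; fails if a previous attempt left one behind. */
    void run() throws IOException {
        Files.createFile(bucketFile); // throws FileAlreadyExistsException if present
    }

    /** What an abort hook could do: remove this attempt's partial output. */
    void abort() throws IOException {
        Files.deleteIfExists(bucketFile);
    }
}

final class AbortDemo {
    /** Returns true if a retry succeeds after a failed first attempt. */
    static boolean retrySucceeds(Path tmpDir, boolean cleanUpOnAbort) throws IOException {
        BucketAttempt first = new BucketAttempt(tmpDir, "bucket_6");
        first.run(); // the first attempt creates the file, then is treated as failed
        if (cleanUpOnAbort) {
            first.abort();
        }
        try {
            new BucketAttempt(tmpDir, "bucket_6").run(); // retried attempt, same path
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        }
    }
}
```

With `cleanUpOnAbort` set to false the retry hits the leftover file; with it set to true the retry can create the file again.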

and from yarn app log
{noformat}
2017-11-01 15:44:20,201 FATAL [IPC Server handler 3 on 42141] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1509391924057_1453_m_02_1 - exited : org.apache.hadoop.fs.FileAlreadyExistsException: /apps/hive/warehouse/workmanagement.db/serviceorder_longtext/_tmp_e95a96e2-e605-47d9-b878-bb662cd9ece2/base_22490990/bucket_7 for client 10.
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2784)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2671)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2555)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:735)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:408)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)

[GitHub] hive pull request #7: Update ql/src/java/org/apache/hadoop/hive/ql/exec/Util...

2017-11-10 Thread tecbot
Github user tecbot closed the pull request at:

https://github.com/apache/hive/pull/7


---


[jira] [Created] (HIVE-18043) Vectorization: Support List type in MapWork

2017-11-10 Thread Colin Ma (JIRA)
Colin Ma created HIVE-18043:
---

 Summary: Vectorization: Support List type in MapWork
 Key: HIVE-18043
 URL: https://issues.apache.org/jira/browse/HIVE-18043
 Project: Hive
  Issue Type: Improvement
Reporter: Colin Ma
Assignee: Colin Ma


Support for complex types in vectorization was completed in HIVE-16589, but the
List type is still not supported in MapWork. It should be supported to improve
performance when vectorization is enabled.





[jira] [Created] (HIVE-18042) Correlation Optimizer NPE when there is multi union all operation after join

2017-11-10 Thread Hengyu Dai (JIRA)
Hengyu Dai created HIVE-18042:
-

 Summary: Correlation Optimizer NPE when there is multi union all operation after join
 Key: HIVE-18042
 URL: https://issues.apache.org/jira/browse/HIVE-18042
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 2.1.1
 Environment: 


Reporter: Hengyu Dai


test sql:

{code:sql}
SELECT DISTINCT a.logday AS push_day, a.mtype, a.t,
       If(b.msgid IS NULL, 'no', 'yes') AS isnotdaoda,
       a.platform, a.uid, a.dt
FROM (
    SELECT DISTINCT
           If(tokentype = '7', msgid,
              If(tokentype = '6',
                 regexp_extract(sendpushresult, 'msgId":"([^"]+)', 1),
                 regexp_extract(sendpushresult, 'msgId=(.+?),', 1))) AS msgid,
           logday,
           If(vid LIKE '60%', 'adr', If(vid LIKE '8%', 'ios', 'other')) AS platform,
           mtype, t,
           If(vid LIKE '8%', uid, gid) AS uid,
           concat(substr(logday, 1, 4), '-', substr(logday, 5, 2), '-', substr(logday, 7, 2)) AS dt
    FROM wirelessdata.orig_push_client
) a
LEFT JOIN (
    SELECT DISTINCT msgid
    FROM (
        SELECT DISTINCT msgid
        FROM wirelessdata.orig_push_return
        UNION ALL
        SELECT DISTINCT msgid
        FROM wirelessdata.orig_push_return_xiaomi
        UNION ALL
        SELECT DISTINCT regexp_extract(action, '"id":"([^"]+)', 1) AS msgid
        FROM wirelessdata.ods_client_behavior_hour4spark
    ) bb
) b ON lower(a.msgid) = lower(b.msgid)
{code}

the error stack
{code:java}
2017-11-10T16:01:21,123 ERROR [9b7d82f5-dfc8-43ac-8d6f-a019d8677392 main] ql.Driver: FAILED: NullPointerException null
java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.optimizer.GenMapRedUtils.setUnionPlan(GenMapRedUtils.java:230)
    at org.apache.hadoop.hive.ql.optimizer.GenMapRedUtils.joinUnionPlan(GenMapRedUtils.java:287)
    at org.apache.hadoop.hive.ql.optimizer.GenMRRedSink3.process(GenMRRedSink3.java:100)
    at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
    at org.apache.hadoop.hive.ql.parse.GenMapRedWalker.walk(GenMapRedWalker.java:54)
    at org.apache.hadoop.hive.ql.parse.GenMapRedWalker.walk(GenMapRedWalker.java:65)
    at org.apache.hadoop.hive.ql.parse.GenMapRedWalker.walk(GenMapRedWalker.java:65)
    at org.apache.hadoop.hive.ql.parse.GenMapRedWalker.walk(GenMapRedWalker.java:65)
    at org.apache.hadoop.hive.ql.parse.GenMapRedWalker.walk(GenMapRedWalker.java:65)
    at org.apache.hadoop.hive.ql.parse.GenMapRedWalker.walk(GenMapRedWalker.java:65)
    at org.apache.hadoop.hive.ql.parse.GenMapRedWalker.walk(GenMapRedWalker.java:65)
    at org.apache.hadoop.hive.ql.parse.GenMapRedWalker.walk(GenMapRedWalker.java:65)
    at org.apache.hadoop.hive.ql.parse.GenMapRedWalker.walk(GenMapRedWalker.java:65)
    at org.apache.hadoop.hive.ql.parse.GenMapRedWalker.walk(GenMapRedWalker.java:65)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
    at org.apache.hadoop.hive.ql.parse.MapReduceCompiler.generateTaskTree(MapReduceCompiler.java:323)
    at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:267)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11008)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10547)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:250)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:483)
    at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1254)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1396)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1181)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1170)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:229)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:180)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:396)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:770)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:711)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:638)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
{code}


