[jira] [Created] (HIVE-16525) Hive ORG format taking long to load

2017-04-24 Thread Christian (JIRA)
Christian created HIVE-16525:


 Summary: Hive ORG format taking long to load
 Key: HIVE-16525
 URL: https://issues.apache.org/jira/browse/HIVE-16525
 Project: Hive
  Issue Type: Improvement
  Components: Configuration
Affects Versions: 2.1.0
Reporter: Christian


We are loading data into Hive tables, and loading is very slow for the ORC format. Which 
setting needs to change in order to improve the load time? Or what is the best 
way of loading data into Hive ORC tables using Ab Initio? Should we first load into HDFS and then 
into Hive ORC? (This is still slow.) I think we need to change a parameter 
in order to improve this load. Kindly advise on Hive settings for ORC loading.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: Review Request 58686: HIVE-16147: Rename a partitioned table should not drop its partition columns stats

2017-04-24 Thread pengcheng xiong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58686/#review172890
---


Ship it!

- pengcheng xiong


On April 24, 2017, 10:29 p.m., Chaoyu Tang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58686/
> ---
> 
> (Updated April 24, 2017, 10:29 p.m.)
> 
> 
> Review request for hive and pengcheng xiong.
> 
> 
> Bugs: HIVE-16147
> https://issues.apache.org/jira/browse/HIVE-16147
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The patch is to
> 1. preserve the column stats after a partitioned table is renamed
> 2. rename the alter_table_invalidate_column_stats.q to 
> alter_table_column_stats.q
> 
> 
> Diffs
> ---
> 
>   metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 77b3541 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 1b701e0 
>   ql/src/test/queries/clientpositive/alter_table_column_stats.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/alter_table_invalidate_column_stats.q a478451 
>   ql/src/test/results/clientpositive/alter_table_column_stats.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/alter_table_invalidate_column_stats.q.out 85d7dc4 
> 
> 
> Diff: https://reviews.apache.org/r/58686/diff/1/
> 
> 
> Testing
> ---
> 
> manual tests
> qtests
> 
> 
> Thanks,
> 
> Chaoyu Tang
> 
>



[jira] [Created] (HIVE-16524) Remove the redundant item type in hiveserver2.jsp and QueryProfileTmpl.jamon

2017-04-24 Thread ZhangBing Lin (JIRA)
ZhangBing Lin created HIVE-16524:


 Summary: Remove the redundant item type in hiveserver2.jsp and 
QueryProfileTmpl.jamon
 Key: HIVE-16524
 URL: https://issues.apache.org/jira/browse/HIVE-16524
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: ZhangBing Lin
Assignee: ZhangBing Lin
Priority: Minor








[jira] [Created] (HIVE-16523) VectorHashKeyWrapper hash code for strings is not so good

2017-04-24 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-16523:
---

 Summary: VectorHashKeyWrapper hash code for strings is not so good
 Key: HIVE-16523
 URL: https://issues.apache.org/jira/browse/HIVE-16523
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin


Perf issues in vectorized gby on some string keys





Re: ptest not running?

2017-04-24 Thread Eugene Koifman
Something is not working in the part of the flow that triggers the build 
automatically.
In the meantime, if you log in with Apache credentials and go to “Build with 
Parameters”, you can manually launch the test by entering the bug number: just the digits, 
without the “HIVE-” part.


On 4/24/17, 5:01 PM, "Vihang Karajgaonkar"  wrote:

I see that builds are getting picked up at
https://builds.apache.org/job/PreCommit-HIVE-Build/. The PTest server was restarted a couple of days
back. If you had a patch in the queue around that time, maybe it was lost
due to the restart. You may have to re-attach the patch to the upstream JIRA if
that is the case.


On Mon, Apr 24, 2017 at 2:44 PM, Eugene Koifman 
wrote:

> Hi,
> It is not picking up new patches and there are none queued up.  Could
> someone check please?
>
> Thanks,
> Eugene
>
>




Re: ptest not running?

2017-04-24 Thread Vihang Karajgaonkar
I see that builds are getting picked up at
https://builds.apache.org/job/PreCommit-HIVE-Build/. The PTest server was restarted a couple of days
back. If you had a patch in the queue around that time, maybe it was lost
due to the restart. You may have to re-attach the patch to the upstream JIRA if
that is the case.


On Mon, Apr 24, 2017 at 2:44 PM, Eugene Koifman 
wrote:

> Hi,
> It is not picking up new patches and there are none queued up.  Could
> someone check please?
>
> Thanks,
> Eugene
>
>


Review Request 58686: HIVE-16147: Rename a partitioned table should not drop its partition columns stats

2017-04-24 Thread Chaoyu Tang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58686/
---

Review request for hive and pengcheng xiong.


Bugs: HIVE-16147
https://issues.apache.org/jira/browse/HIVE-16147


Repository: hive-git


Description
---

The patch is to
1. preserve the column stats after a partitioned table is renamed
2. rename the alter_table_invalidate_column_stats.q to 
alter_table_column_stats.q


Diffs
---

  metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 77b3541 
  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 1b701e0 
  ql/src/test/queries/clientpositive/alter_table_column_stats.q PRE-CREATION 
  ql/src/test/queries/clientpositive/alter_table_invalidate_column_stats.q a478451 
  ql/src/test/results/clientpositive/alter_table_column_stats.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/alter_table_invalidate_column_stats.q.out 85d7dc4 


Diff: https://reviews.apache.org/r/58686/diff/1/


Testing
---

manual tests
qtests


Thanks,

Chaoyu Tang



ptest not running?

2017-04-24 Thread Eugene Koifman
Hi,
It is not picking up new patches and there are none queued up.  Could someone 
check please?

Thanks,
Eugene



[jira] [Created] (HIVE-16522) Hive is query timer is not keeping track of the fetch task execution

2017-04-24 Thread slim bouguerra (JIRA)
slim bouguerra created HIVE-16522:
-

 Summary: Hive is query timer is not keeping track of the fetch 
task execution
 Key: HIVE-16522
 URL: https://issues.apache.org/jira/browse/HIVE-16522
 Project: Hive
  Issue Type: Bug
Reporter: slim bouguerra
Assignee: slim bouguerra


Currently, the query execution time reported by the Hive CLI does not include the execution time of the fetch task.





[jira] [Created] (HIVE-16521) HoS user level explain plan possibly incorrect for UNION clause

2017-04-24 Thread Sahil Takiar (JIRA)
Sahil Takiar created HIVE-16521:
---

 Summary: HoS user level explain plan possibly incorrect for UNION 
clause
 Key: HIVE-16521
 URL: https://issues.apache.org/jira/browse/HIVE-16521
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: 3.0.0
Reporter: Sahil Takiar
Assignee: Sahil Takiar


The user-level explain plan for queries with a UNION operator looks very 
different for HoS vs. Hive-on-Tez. Furthermore, the HoS plan looks incomplete:

Query: {{EXPLAIN select count(*) from srcpart where srcpart.ds in (select 
max(srcpart.ds) from srcpart union all select min(srcpart.ds) from srcpart)}}

Hive-on-Tez:

{code}
Plan optimized by CBO.

Vertex dependency in root stage
Reducer 2 <- Map 1 (SIMPLE_EDGE), Reducer 7 (SIMPLE_EDGE)
Reducer 3 <- Reducer 2 (CUSTOM_SIMPLE_EDGE)
Reducer 5 <- Map 4 (CUSTOM_SIMPLE_EDGE), Union 6 (CONTAINS)
Reducer 7 <- Union 6 (SIMPLE_EDGE)
Reducer 9 <- Map 8 (CUSTOM_SIMPLE_EDGE), Union 6 (CONTAINS)

Stage-0
  Fetch Operator
limit:-1
Stage-1
  Reducer 3
  File Output Operator [FS_34]
Group By Operator [GBY_32] (rows=1 width=8)
  Output:["_col0"],aggregations:["count(VALUE._col0)"]
<-Reducer 2 [CUSTOM_SIMPLE_EDGE]
  PARTITION_ONLY_SHUFFLE [RS_31]
Group By Operator [GBY_30] (rows=1 width=8)
  Output:["_col0"],aggregations:["count()"]
  Merge Join Operator [MERGEJOIN_44] (rows=1000 width=8)
Conds:RS_26._col0=RS_27._col0(Inner)
  <-Map 1 [SIMPLE_EDGE]
SHUFFLE [RS_26]
  PartitionCols:_col0
  Select Operator [SEL_2] (rows=2000 width=184)
Output:["_col0"]
TableScan [TS_0] (rows=2000 width=194)
  default@srcpart,srcpart,Tbl:COMPLETE,Col:COMPLETE
  <-Reducer 7 [SIMPLE_EDGE]
SHUFFLE [RS_27]
  PartitionCols:_col0
  Group By Operator [GBY_24] (rows=1 width=184)
Output:["_col0"],keys:KEY._col0
  <-Union 6 [SIMPLE_EDGE]
<-Reducer 5 [CONTAINS]
  Reduce Output Operator [RS_23]
PartitionCols:_col0
Group By Operator [GBY_22] (rows=1 width=184)
  Output:["_col0"],keys:_col0
  Filter Operator [FIL_9] (rows=1 width=184)
predicate:_col0 is not null
Group By Operator [GBY_7] (rows=1 width=184)
  Output:["_col0"],aggregations:["max(VALUE._col0)"]
<-Map 4 [CUSTOM_SIMPLE_EDGE]
  PARTITION_ONLY_SHUFFLE [RS_6]
Group By Operator [GBY_5] (rows=1 width=184)
  Output:["_col0"],aggregations:["max(ds)"]
  Select Operator [SEL_4] (rows=2000 width=194)
Output:["ds"]
TableScan [TS_3] (rows=2000 width=194)
  
default@srcpart,srcpart,Tbl:COMPLETE,Col:COMPLETE
<-Reducer 9 [CONTAINS]
  Reduce Output Operator [RS_23]
PartitionCols:_col0
Group By Operator [GBY_22] (rows=1 width=184)
  Output:["_col0"],keys:_col0
  Filter Operator [FIL_17] (rows=1 width=184)
predicate:_col0 is not null
Group By Operator [GBY_15] (rows=1 width=184)
  Output:["_col0"],aggregations:["min(VALUE._col0)"]
<-Map 8 [CUSTOM_SIMPLE_EDGE]
  PARTITION_ONLY_SHUFFLE [RS_14]
Group By Operator [GBY_13] (rows=1 width=184)
  Output:["_col0"],aggregations:["min(ds)"]
  Select Operator [SEL_12] (rows=2000 width=194)
Output:["ds"]
TableScan [TS_11] (rows=2000 width=194)
  
default@srcpart,srcpart,Tbl:COMPLETE,Col:COMPLETE
Dynamic Partitioning Event Operator [EVENT_43] (rows=1 
width=184)
  Group By Operator [GBY_42] (rows=1 width=184)
Output:["_col0"],keys:_col0
Select Operator [SEL_41] (rows=1 width=184)
  Output:["_col0"]
   Please refer to the previous Group By Operator [GBY_24]
{code}

HoS:

{code}
Plan optimized by CBO.

Vertex dependency in root stage
Reducer 10 <- Map 9 (GROUP)
Reducer 11 <- Reducer 10 (GROUP), Reducer 13 (GROUP)
Reducer 13 <- Map 12 

[jira] [Created] (HIVE-16520) Cache hive metadata in metastore

2017-04-24 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-16520:
-

 Summary: Cache hive metadata in metastore
 Key: HIVE-16520
 URL: https://issues.apache.org/jira/browse/HIVE-16520
 Project: Hive
  Issue Type: New Feature
  Components: Metastore
Reporter: Daniel Dai
Assignee: Daniel Dai


During Hive 2 benchmarking, we found that Hive metastore operations take a lot of time 
and thus slow down Hive compilation. In some extreme cases, this takes much longer 
than the actual query run time. In particular, we found that the latency of a cloud database is 
very high, and 90% of total query runtime is spent waiting for metastore SQL database 
operations. Based on this observation, metastore operation performance would 
be greatly improved by an in-memory structure that caches database 
query results.
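
The idea of an in-memory structure that caches database query results can be sketched roughly as follows. This is only an illustration under stated assumptions, not the design proposed in this issue: the class CachedLookup, its methods, and the slowLoader function are hypothetical names, and the sketch ignores invalidation, eviction, and consistency across metastore instances, all of which a real cache would have to address.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical read-through cache: the first lookup for a key calls the
// slow backing loader (standing in for a metastore SQL query); later
// lookups for the same key are served from memory.
final class CachedLookup<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> slowLoader;
    private final AtomicInteger loads = new AtomicInteger();

    CachedLookup(Function<K, V> slowLoader) {
        this.slowLoader = slowLoader;
    }

    V get(K key) {
        // computeIfAbsent invokes the loader only when the key is absent.
        return cache.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return slowLoader.apply(k);
        });
    }

    // Number of times the backing loader was actually invoked.
    int loadCount() {
        return loads.get();
    }
}
```

Wrapping a slow lookup in new CachedLookup<>(loader) and calling get twice with the same key would hit the loader once; loadCount makes that visible when measuring how many database round trips are avoided.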





[jira] [Created] (HIVE-16519) Fix exception thrown by checkOutputSpecs

2017-04-24 Thread slim bouguerra (JIRA)
slim bouguerra created HIVE-16519:
-

 Summary: Fix exception thrown by checkOutputSpecs
 Key: HIVE-16519
 URL: https://issues.apache.org/jira/browse/HIVE-16519
 Project: Hive
  Issue Type: Bug
  Components: Druid integration
Reporter: slim bouguerra
Assignee: slim bouguerra


Do not throw an exception from checkOutputSpecs.





[jira] [Created] (HIVE-16518) Insert override for druid does not replace all existing segments

2017-04-24 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-16518:
---

 Summary: Insert override for druid does not replace all existing 
segments
 Key: HIVE-16518
 URL: https://issues.apache.org/jira/browse/HIVE-16518
 Project: Hive
  Issue Type: Bug
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


INSERT OVERWRITE for Druid does not replace segments for all intervals. 
It replaces segments only for the intervals that are newly ingested. 
An INSERT OVERWRITE TABLE statement on DruidStorageHandler should overwrite all 
existing segments for the table. 





[jira] [Created] (HIVE-16517) HiveStatement thread safety issues

2017-04-24 Thread Peter Vary (JIRA)
Peter Vary created HIVE-16517:
-

 Summary: HiveStatement thread safety issues
 Key: HIVE-16517
 URL: https://issues.apache.org/jira/browse/HIVE-16517
 Project: Hive
  Issue Type: Bug
Reporter: Peter Vary


The {{BeeLine}} and {{Commands}} classes share one HiveStatement between multiple 
threads for querying the logs and running the queries.

We cannot make HiveStatement thread-safe, but we should at least make sure 
that calling {{getQueryLog}} does not cause problems when it is called in parallel 
with any of the following: {{cancel}}, {{close}}, {{execute}}, 
{{executeAsync}}, {{executeQuery}}, {{executeUpdate}}, {{getUpdateCount}} and, 
more interestingly, {{HiveQueryResultSet.next}}.
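
As a rough illustration of the kind of guard described above, the sketch below lets one thread poll for log lines while another thread closes the shared object. It is an assumption-laden stand-in, not HiveStatement: the class SharedStatementSketch and its appendLog method are hypothetical, and a single lock is only one of several ways to get this behavior.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a statement shared by a query thread and a
// log-polling thread; it is not HiveStatement.
final class SharedStatementSketch {
    private final Object lock = new Object();
    private final List<String> pendingLog = new ArrayList<>();
    private boolean closed;

    // Called by the thread running the query.
    void appendLog(String line) {
        synchronized (lock) {
            if (!closed) {
                pendingLog.add(line);
            }
        }
    }

    // Called by the log-polling thread. Safe to call concurrently with
    // close(): after close it returns an empty list instead of failing.
    List<String> getQueryLog() {
        synchronized (lock) {
            if (closed) {
                return Collections.emptyList();
            }
            List<String> batch = new ArrayList<>(pendingLog);
            pendingLog.clear();
            return batch;
        }
    }

    void close() {
        synchronized (lock) {
            closed = true;
            pendingLog.clear();
        }
    }
}
```

The point of the sketch is only that the log reader never observes half-closed state; it does not attempt to make every statement operation thread-safe.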





[jira] [Created] (HIVE-16516) Set storage-api.version to 3.0.0-SNAPSHOT

2017-04-24 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-16516:
---

 Summary: Set storage-api.version to 3.0.0-SNAPSHOT
 Key: HIVE-16516
 URL: https://issues.apache.org/jira/browse/HIVE-16516
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich
 Attachments: HIVE-16516.1.patch

I think the update of this property was missed during preparation for 3.0.0;
I ran into this after cleaning the local .m2 repo caches.





[jira] [Created] (HIVE-16515) Hive MapRedTask should has its own OPTS environment variable

2017-04-24 Thread chuanjie.duan (JIRA)
chuanjie.duan created HIVE-16515:


 Summary: Hive MapRedTask should has its own OPTS environment 
variable 
 Key: HIVE-16515
 URL: https://issues.apache.org/jira/browse/HIVE-16515
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 1.2.1
Reporter: chuanjie.duan
Priority: Minor


HiveServer2 submits a MapReduce job using HiveServer2's own HADOOP_OPTS and 
HADOOP_CLIENT_OPTS variables, which is odd. For example, if HiveServer2 is started with 
HADOOP_CLIENT_OPTS="-Xms12000m -Xmx12000m", the Hive MR subprocess would also need a 12 GB 
memory allocation. I think beeline should perhaps have another environment 
variable, like HIVE_HEAPSIZE, to override HADOOP_CLIENT_OPTS when a Hive MR job is 
submitted.

MapRedTask.java
public int execute(DriverContext driverContext) {
 ……
if (ShimLoader.getHadoopShims().isLocalMode(conf)) {
..
  } else {
// nothing to do - we are not running in local mode - only submitting
// the job via a child process. in this case it's appropriate that the
// child jvm use the same memory as the parent jvm
  }
 ……
String jarCmd = hiveJar + " " + ExecDriver.class.getName() + libJarsOption;

  String cmdLine = hadoopExec + " jar " + jarCmd + " -plan "
  + planPath.toString() + " " + isSilent + " " + hiveConfArgs;
  ……
 }
  // Run ExecDriver in another JVM
  executor = Runtime.getRuntime().exec(cmdLine, env, new File(workDir));

${HADOOP_HOME}/bin/hadoop
..
# Always respect HADOOP_OPTS and HADOOP_CLIENT_OPTS
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
..
export CLASSPATH=$CLASSPATH
exec "$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$@"
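
One way to read the proposal above is that the environment handed to the child JVM could be adjusted before Runtime.getRuntime().exec(cmdLine, env, ...) is called. The sketch below is only an illustration under stated assumptions, not Hive code: ChildEnvSketch, childEnv, toEnvp, and the variable name HIVE_MR_CLIENT_OPTS are hypothetical, chosen to show a dedicated variable replacing HADOOP_CLIENT_OPTS for the child process only.

```java
import java.util.HashMap;
import java.util.Map;

final class ChildEnvSketch {
    // Hypothetical variable name; it is not defined by Hive or Hadoop.
    static final String OVERRIDE_VAR = "HIVE_MR_CLIENT_OPTS";

    // Returns a copy of the parent environment in which HADOOP_CLIENT_OPTS
    // is replaced by the override variable's value, if one is set.
    static Map<String, String> childEnv(Map<String, String> parentEnv) {
        Map<String, String> env = new HashMap<>(parentEnv);
        String override = env.get(OVERRIDE_VAR);
        if (override != null && !override.isEmpty()) {
            env.put("HADOOP_CLIENT_OPTS", override);
        }
        return env;
    }

    // Runtime.exec expects the environment as an array of NAME=value strings.
    static String[] toEnvp(Map<String, String> env) {
        return env.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .toArray(String[]::new);
    }
}
```

Because the hadoop script shown above appends HADOOP_CLIENT_OPTS to HADOOP_OPTS, any heap settings placed directly in HADOOP_OPTS would still reach the child and would need similar handling.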






[jira] [Created] (HIVE-16514) Decimal datatype is truncating 1st digit of the number while storing into Parquet file

2017-04-24 Thread Surendranatha Reddy (JIRA)
Surendranatha Reddy created HIVE-16514:
--

 Summary: Decimal datatype is truncating 1st digit of the number 
while storing into Parquet file
 Key: HIVE-16514
 URL: https://issues.apache.org/jira/browse/HIVE-16514
 Project: Hive
  Issue Type: Bug
  Components: File Formats, Hive
Affects Versions: 1.1.0
 Environment: CDH 5.7
Reporter: Surendranatha Reddy
Priority: Blocker


We declared one column of a Hive Parquet table as Decimal(38,20).
We inserted data from an intermediate table using the sum function 
with an "insert into select" statement. 
The sum value is 9389587.19800467, 
whereas it is stored as 389587.19800467.
The first digit of the number is truncated when it is stored in the table. 
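
For reference, a Decimal(38,20) column leaves 38 - 20 = 18 digits for the integer part, so the 7 integer digits of the reported sum should fit without truncation. The check below is only a sketch of that arithmetic using java.math.BigDecimal; the class and method names are hypothetical, and it does not reproduce how Hive or Parquet actually encode decimals.

```java
import java.math.BigDecimal;

final class DecimalFitSketch {
    // True if the value's integer part fits in (precision - scale) digits
    // and its fractional part fits in scale digits.
    static boolean fits(BigDecimal value, int precision, int scale) {
        BigDecimal v = value.stripTrailingZeros();
        int integerDigits = v.precision() - v.scale();
        int fractionDigits = Math.max(v.scale(), 0);
        return integerDigits <= precision - scale && fractionDigits <= scale;
    }
}
```

Calling DecimalFitSketch.fits(new BigDecimal("9389587.19800467"), 38, 20) confirms that the declared type has room for the value, which suggests the lost digit comes from the write path rather than from the column definition.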





Re: Review Request 58654: HIVE-16503: LLAP: Oversubscribe memory for noconditional task size

2017-04-24 Thread Lefty Leverenz
To unsubscribe, please send a message to dev-unsubscr...@hive.apache.org as
described here: Mailing Lists.
Thanks.

-- Lefty


On Sun, Apr 23, 2017 at 6:21 PM,  wrote:

> Please remove me from this list serve
>
> Sent from my iPhone
>
> > On Apr 23, 2017, at 8:23 PM, j.prasant...@gmail.com wrote:
> >
> >
> > ---
> > This is an automatically generated e-mail. To reply, visit:
> > https://reviews.apache.org/r/58654/
> > ---
> >
> > Review request for hive, Gunther Hagleitner and Siddharth Seth.
> >
> >
> > Bugs: HIVE-16503
> >https://issues.apache.org/jira/browse/HIVE-16503
> >
> >
> > Repository: hive-git
> >
> >
> > Description
> > ---
> >
> > HIVE-16503: LLAP: Oversubscribe memory for noconditional task size
> >
> >
> > Diffs
> > -
> >
> >  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 8e5a9aa53d27a06c3fff58e0d47068efd7ad898f
> >  ql/pom.xml e5d063f94ca10410530146dbac56733f823756b9
> >  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java 637bc54fbd4803e31647d33dd9fdca2b32b3ed63
> >  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/LlapClusterStateForCompile.java b2e8614cea6d25b57a400ad29237b298e4ab7164
> >  ql/src/test/org/apache/hadoop/hive/ql/exec/TestOperators.java 57e573ab7a2b438f3884ec1eb3b285349578a785
> >  ql/src/test/queries/clientpositive/explainuser_4.q f58afa88e3ac50e0a97f1f83e7ffc9af2afff092
> >  ql/src/test/queries/clientpositive/tez_smb_main.q ee24691dd6ab60693fd5e8b22558587e8728976f
> >  ql/src/test/queries/clientpositive/tez_vector_dynpart_hashjoin_1.q 7dd30039f192818ced6d2fefd439fc0ee0fcdfc1
> >  ql/src/test/results/clientpositive/llap/tez_smb_main.q.out b583bfffcc7070a4ed868b86c3b06e4bbc876640
> >
> >
> > Diff: https://reviews.apache.org/r/58654/diff/1/
> >
> >
> > Testing
> > ---
> >
> >
> > Thanks,
> >
> > Prasanth_J
> >
>


Re: Review Request 58654: HIVE-16503: LLAP: Oversubscribe memory for noconditional task size

2017-04-24 Thread j . the . surface
Please remove me from this list serve 

Sent from my iPhone

> On Apr 23, 2017, at 8:23 PM, j.prasant...@gmail.com wrote:
> 
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58654/
> ---
> 
> Review request for hive, Gunther Hagleitner and Siddharth Seth.
> 
> 
> Bugs: HIVE-16503
>https://issues.apache.org/jira/browse/HIVE-16503
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-16503: LLAP: Oversubscribe memory for noconditional task size
> 
> 
> Diffs
> -
> 
>  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 8e5a9aa53d27a06c3fff58e0d47068efd7ad898f 
>  ql/pom.xml e5d063f94ca10410530146dbac56733f823756b9 
>  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java 637bc54fbd4803e31647d33dd9fdca2b32b3ed63 
>  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/LlapClusterStateForCompile.java b2e8614cea6d25b57a400ad29237b298e4ab7164 
>  ql/src/test/org/apache/hadoop/hive/ql/exec/TestOperators.java 57e573ab7a2b438f3884ec1eb3b285349578a785 
>  ql/src/test/queries/clientpositive/explainuser_4.q f58afa88e3ac50e0a97f1f83e7ffc9af2afff092 
>  ql/src/test/queries/clientpositive/tez_smb_main.q ee24691dd6ab60693fd5e8b22558587e8728976f 
>  ql/src/test/queries/clientpositive/tez_vector_dynpart_hashjoin_1.q 7dd30039f192818ced6d2fefd439fc0ee0fcdfc1 
>  ql/src/test/results/clientpositive/llap/tez_smb_main.q.out b583bfffcc7070a4ed868b86c3b06e4bbc876640 
> 
> 
> Diff: https://reviews.apache.org/r/58654/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Prasanth_J
>