[jira] [Created] (HIVE-25373) Modify buildColumnStatsDesc to send configured number of stats for updation

2021-07-22 Thread mahesh kumar behera (Jira)
mahesh kumar behera created HIVE-25373:
--

 Summary: Modify buildColumnStatsDesc to send configured number of 
stats for updation
 Key: HIVE-25373
 URL: https://issues.apache.org/jira/browse/HIVE-25373
 Project: Hive
  Issue Type: Sub-task
Reporter: mahesh kumar behera
Assignee: mahesh kumar behera


The number of stats sent in a single update call should be capped at a 
configured value, so that the request does not exceed the Thrift size limit 
and fail.
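
A minimal sketch of the batching idea, assuming the stats are held in a list
and the cap is read from configuration. StatsBatcher, partition and
maxStatsPerCall are hypothetical names used for illustration only; this is not
the actual buildColumnStatsDesc code, and no real Hive configuration key is
implied.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a list of stats objects into batches so that
// no single update call carries more than maxStatsPerCall entries.
final class StatsBatcher {
    static <T> List<List<T>> partition(List<T> stats, int maxStatsPerCall) {
        if (maxStatsPerCall <= 0) {
            throw new IllegalArgumentException("maxStatsPerCall must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < stats.size(); start += maxStatsPerCall) {
            int end = Math.min(start + maxStatsPerCall, stats.size());
            // Copy the slice so each batch is independent of the source list.
            batches.add(new ArrayList<>(stats.subList(start, end)));
        }
        return batches;
    }
}
```

Each batch would then be sent as its own update request, so a table with many
columns or partitions results in several small calls instead of one oversized
one.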



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Should we release Hive Storage API 2.8.0-rc0 ?

2021-07-22 Thread Chao Sun
Thanks Owen! I just verified the checksum and gpg signature and they both
look good, so +1 too.

Panos: please fix the "Fix version" of the JIRA when you get a chance.
Thanks.

Best,
Chao




Re: [VOTE] Should we release Hive Storage API 2.8.0-rc0 ?

2021-07-22 Thread Owen O'Malley
Chao,
   Panos' key doesn't seem to have propagated to the Apache servers. It is
referenced here:

https://people.apache.org/keys/committer/ as "pgaref
7DFAB216AB7D96B3B2072184DC11DE4D00F8FA1D"

The key itself can be found here:

https://keyserver.ubuntu.com/pks/lookup?search=pgaref=on=index


On Wed, Jul 21, 2021 at 8:47 PM Chao Sun wrote:

> I built the source from the branch and ran the tests, which all passed.
> However I was not able to find the public GPG key. Panos: could you point
> me to the location?
>
> Also seems we should create a new version 2.8.0 in the JIRA page:
>
> https://issues.apache.org/jira/projects/HIVE?selectedItem=com.atlassian.jira.jira-projects-plugin:release-page
> and update "Fix version" of
> https://issues.apache.org/jira/browse/HIVE-24458
> .
>
> Chao
>
> On Wed, Jul 21, 2021 at 9:10 AM Szehon Ho wrote:
>
> > +1 (binding)
> >
> > * Built module
> > * Ran tests
> > * Checked artifact checksum and signature
> >
> > Thanks
> > Szehon
> >
> > On Tue, Jul 20, 2021 at 2:11 PM Owen O'Malley wrote:
> >
> > > I think we should go ahead and release storage-api 2.8.0 and catch it on
> > > the next cycle. HIVE-25190 is a long standing bug that rarely affects
> > > users. (We have had a user at LinkedIn hit it, which is why I fixed it.)
> > > I'll sign up to make the 2.8.1 (and 2.7.3) bug fix releases afterwards.
> > >
> > > .. Owen
> > >
> > > On Tue, Jul 20, 2021 at 8:53 PM Chao Sun wrote:
> > >
> > > > Going to check the release and vote here too. Since HIVE-25190 is already
> > > > merged, instead of waiting for another release, should we start another RC1
> > > > with that included?
> > > >
> > > > Chao
> > > >
> > > > On Tue, Jul 20, 2021 at 1:30 PM Dongjoon Hyun wrote:
> > > >
> > > > > +1
> > > > >
> > > > > * Build and tested locally.
> > > > >
> > > > > Thanks,
> > > > > Dongjoon.
> > > > >
> > > > > On 2021/07/19 23:15:46, "Owen O'Malley" wrote:
> > > > > > +1 (binding):
> > > > > > * Built and tested
> > > > > > * Built hive main branch using it
> > > > > > * Verified signatures and checksums
> > > > > >
> > > > > > It is too bad that we didn't get HIVE-25190 into it, but that can wait for
> > > > > > 2.8.1.
> > > > > >
> > > > > > .. Owen
> > > > > >
> > > > > > On Mon, Jun 28, 2021 at 9:44 PM Pavan Lanka wrote:
> > > > > >
> > > > > > > +1 (non-binding)
> > > > > > >
> > > > > > > I have done the following:
> > > > > > > * Built and Tested storage-release-2.8.0-rc0 using OpenJDK8
> > > > > > > * Built and Tested ORC with updated storage api version
> > > > > > >   - Had to fix a test class that implements PredicateLeaf which has a new
> > > > > > >     method. This is a breaking change but I think this should be ok
> > > > > > > * Verified the performance gains of HIVE-24458
> > > > > > >
> > > > > > > Regards,
> > > > > > > Pavan
> > > > > > >
> > > > > > >
> > > > > > > > On Jun 21, 2021, at 8:07 AM, Panos Garefalakis <panga...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > Hello all,
> > > > > > > >
> > > > > > > > Following on previous discussions, I would like to propose a new
> > > > > > > > storage-api release including HIVE-24458.
> > > > > > > >
> > > > > > > > Shall we release the following artifacts as Hive Storage API 2.8.0?
> > > > > > > >
> > > > > > > > tar: http://home.apache.org/~pgaref/hive-storage-2.8.0/
> > > > > > > > tag: https://github.com/apache/hive/releases/tag/storage-release-2.8.0-rc0
> > > > > > > > jiras: https://issues.apache.org/jira/projects/HIVE/versions/12350287
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > > Panagiotis


[jira] [Created] (HIVE-25372) [Hive] Advance write ID for remaining DDLs

2021-07-22 Thread Kishen Das (Jira)
Kishen Das created HIVE-25372:
-

 Summary: [Hive] Advance write ID for remaining DDLs
 Key: HIVE-25372
 URL: https://issues.apache.org/jira/browse/HIVE-25372
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2
Reporter: Kishen Das
Assignee: Kishen Das


We guarantee data consistency for table metadata when serving data from the 
HMS cache. The HMS cache relies on valid write IDs to decide whether to serve 
from the cache or to refresh from the backing DB first, so every alter table 
flow must advance the write ID. The write ID still has to be advanced for the 
DDLs handled by the analyzers below.

AlterTableSetOwnerAnalyzer.java
AlterTableSkewedByAnalyzer.java
AlterTableSetSerdeAnalyzer.java
AlterTableSetSerdePropsAnalyzer.java
AlterTableUnsetSerdePropsAnalyzer.java
AlterTableSetPartitionSpecAnalyzer.java
AlterTableClusterSortAnalyzer.java
AlterTableIntoBucketsAnalyzer.java
AlterTableConcatenateAnalyzer.java
AlterTableCompactAnalyzer.java
AlterTableSetFileFormatAnalyzer.java
AlterTableSetSkewedLocationAnalyzer.java
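
To illustrate why a DDL that skips the write ID bump is a problem, here is a
deliberately simplified model. It is not the HMS cache implementation:
WriteIdCacheSketch and all of its members are hypothetical, and the real cache
works with valid write ID lists rather than a single number. The sketch only
shows the decision rule described above, where the cache is trusted as long as
the write ID it recorded is still the table's latest.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model, not the HMS implementation: a cache entry is reused only
// while the write ID recorded with it is still the table's latest write ID.
final class WriteIdCacheSketch {
    static final class Entry {
        final String metadata;
        final long writeId;
        Entry(String metadata, long writeId) {
            this.metadata = metadata;
            this.writeId = writeId;
        }
    }

    // Stand-in for the backing DB: current metadata and latest write ID per table.
    private final Map<String, Entry> backingDb = new HashMap<>();
    private final Map<String, Entry> cache = new HashMap<>();

    // A DDL that changes metadata. advanceWriteId models whether the
    // corresponding analyzer advances the write ID.
    void alterTable(String table, String newMetadata, boolean advanceWriteId) {
        Entry current = backingDb.get(table);
        long writeId = (current == null) ? 1 : current.writeId + (advanceWriteId ? 1 : 0);
        backingDb.put(table, new Entry(newMetadata, writeId));
    }

    String getTable(String table) {
        Entry latest = backingDb.get(table);
        if (latest == null) {
            return null; // unknown table
        }
        Entry cached = cache.get(table);
        if (cached != null && cached.writeId == latest.writeId) {
            return cached.metadata; // write ID unchanged, so the cache is trusted
        }
        cache.put(table, latest);   // write ID moved, refresh from the backing DB
        return latest.metadata;
    }
}
```

In this model, calling alterTable with advanceWriteId set to false after the
entry has been cached makes getTable keep returning the old metadata, which is
the inconsistency the listed analyzers need to avoid.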





[jira] [Created] (HIVE-25371) Add myself to thrift file reviewers

2021-07-22 Thread Karen Coppage (Jira)
Karen Coppage created HIVE-25371:


 Summary: Add myself to thrift file reviewers
 Key: HIVE-25371
 URL: https://issues.apache.org/jira/browse/HIVE-25371
 Project: Hive
  Issue Type: Task
Reporter: Karen Coppage
Assignee: Karen Coppage








[jira] [Created] (HIVE-25370) Improve SharedWorkOptimizer performance

2021-07-22 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25370:
---

 Summary: Improve SharedWorkOptimizer performance
 Key: HIVE-25370
 URL: https://issues.apache.org/jira/browse/HIVE-25370
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


For queries that union ~800 constant rows, the SharedWorkOptimizer (SWO) 
performs around n*n/2 operations while trying to find two TableScan (TS) 
operators that could be merged.

{code}
select constants
UNION ALL
...
UNION ALL
select constants
{code}
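
For a sense of scale, the sketch below counts the unordered pairs that a naive
"compare every TS with every other TS" search has to visit. PairwiseCost and
its methods are hypothetical names for illustration; this is not the
SharedWorkOptimizer code, only the arithmetic behind the n*n/2 estimate.

```java
// Counts the unordered pairs visited by a naive all-pairs search over n items.
final class PairwiseCost {
    // Closed form: n * (n - 1) / 2, which is roughly n*n/2 for large n.
    static long pairs(long n) {
        return n * (n - 1) / 2;
    }

    // Same count obtained by actually walking the nested loops.
    static long countByLoop(int n) {
        long visited = 0;
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                visited++;
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        System.out.println(pairs(800)); // prints 319600
    }
}
```

With 800 branches that is about 320 thousand candidate pairs per pass, so the
cost grows quadratically with the number of UNION ALL branches.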






[jira] [Created] (HIVE-25369) Handle Sum0 when rebuilding materialized view incrementally

2021-07-22 Thread Krisztian Kasa (Jira)
Krisztian Kasa created HIVE-25369:
-

 Summary: Handle Sum0 when rebuilding materialized view 
incrementally
 Key: HIVE-25369
 URL: https://issues.apache.org/jira/browse/HIVE-25369
 Project: Hive
  Issue Type: Improvement
  Components: CBO, Materialized views
Reporter: Krisztian Kasa
Assignee: Krisztian Kasa


When an MV insert overwrite plan is rewritten to an incremental rebuild plan, 
a Sum0 aggregate function is used to combine the count subresults coming from 
the existing MV data with those aggregated from the records inserted or 
deleted since the last rebuild.
{code}
create materialized view mat1 stored as orc
TBLPROPERTIES ('transactional'='true') as
select t1.a, count(*) from t1
group by t1.a
{code}
Insert overwrite plan:
{code}
HiveAggregate(group=[{0}], agg#0=[$SUM0($1)])
  HiveUnion(all=[true])
    HiveAggregate(group=[{0}], agg#0=[count()])
      HiveProject($f0=[$0])
        HiveFilter(condition=[<(2, $5.writeid)])
          HiveTableScan(table=[[default, t1]], table:alias=[t1])
    HiveTableScan(table=[[default, mat1]], table:alias=[default.mat1])
{code}
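
As background, $SUM0 differs from a plain SUM in that it returns 0 rather than
NULL when there is nothing to add up, which suits merging partial counts. The
sketch below models both behaviours with nullable Long values standing in for
SQL NULL. Sum0Sketch and its methods are hypothetical names; this is not code
from Hive or Calcite.

```java
import java.util.List;

final class Sum0Sketch {
    // SQL SUM semantics: NULL inputs are skipped, and the result is NULL
    // (modelled here as Java null) when there is no non-NULL input.
    static Long sum(List<Long> values) {
        Long total = null;
        for (Long v : values) {
            if (v != null) {
                total = (total == null ? 0L : total) + v;
            }
        }
        return total;
    }

    // $SUM0 semantics: same as SUM, but an empty or all-NULL input gives 0.
    static long sum0(List<Long> values) {
        Long total = sum(values);
        return total == null ? 0L : total;
    }
}
```

For example, merging a stored count of 3 with a delta count of 2 gives 5 under
both functions, while an empty input gives null under sum and 0 under sum0.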





[jira] [Created] (HIVE-25368) Code does not build in IDE and a small fix

2021-07-22 Thread Peter Vary (Jira)
Peter Vary created HIVE-25368:
-

 Summary: Code does not build in IDE and a small fix
 Key: HIVE-25368
 URL: https://issues.apache.org/jira/browse/HIVE-25368
 Project: Hive
  Issue Type: Task
Reporter: Peter Vary
Assignee: Peter Vary


The code does not build in IntelliJ because of the way generics are used.

There is also a small test case issue in {{WarehouseInstance.java}}.





[jira] [Created] (HIVE-25367) Fix TestReplicationScenariosAcidTables#testMultiDBTxn

2021-07-22 Thread Peter Vary (Jira)
Peter Vary created HIVE-25367:
-

 Summary: Fix TestReplicationScenariosAcidTables#testMultiDBTxn
 Key: HIVE-25367
 URL: https://issues.apache.org/jira/browse/HIVE-25367
 Project: Hive
  Issue Type: Test
  Components: repl
Reporter: Peter Vary


[http://ci.hive.apache.org/job/hive-flaky-check/332]

[http://ci.hive.apache.org/job/hive-flaky-check/333]

CC: [~aasha]





[jira] [Created] (HIVE-25366) Reduce number of Table calls in updatePartitonColStatsInternal

2021-07-22 Thread Rajesh Balamohan (Jira)
Rajesh Balamohan created HIVE-25366:
---

 Summary: Reduce number of Table calls in 
updatePartitonColStatsInternal
 Key: HIVE-25366
 URL: https://issues.apache.org/jira/browse/HIVE-25366
 Project: Hive
  Issue Type: Improvement
Reporter: Rajesh Balamohan


The table details are reloaded for every partition, which is wasteful. It 
would be better to fetch the Table details once and pass them along for all 
partitions.

[https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/RawStore.java#L1091]

[https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L9342]
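
The change amounts to hoisting a loop-invariant lookup out of the per-partition
loop. A minimal sketch follows, with a counter added so the difference is
visible. TableLookupSketch, loadTable and both update methods are hypothetical
stand-ins, not the RawStore or ObjectStore API.

```java
import java.util.ArrayList;
import java.util.List;

final class TableLookupSketch {
    // Counts lookups so the two variants can be compared.
    static int lookups = 0;

    // Stand-in for an expensive table fetch from the metastore.
    static String loadTable(String tableName) {
        lookups++;
        return "details-of-" + tableName;
    }

    // Before: the table is looked up again for each partition.
    static List<String> updatePerPartition(String tableName, List<String> partitions) {
        List<String> updated = new ArrayList<>();
        for (String partition : partitions) {
            String table = loadTable(tableName);
            updated.add(table + "/" + partition);
        }
        return updated;
    }

    // After: the table is looked up once and reused for all partitions.
    static List<String> updateWithSharedTable(String tableName, List<String> partitions) {
        String table = loadTable(tableName);
        List<String> updated = new ArrayList<>();
        for (String partition : partitions) {
            updated.add(table + "/" + partition);
        }
        return updated;
    }
}
```

Both variants produce the same result; the second performs one lookup
regardless of how many partitions are updated.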

 


