[jira] [Assigned] (IMPALA-12190) Renaming table will cause losing privileges for non-admin users

2024-05-22 Thread Sai Hemanth Gantasala (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sai Hemanth Gantasala reassigned IMPALA-12190:
--

Assignee: (was: Sai Hemanth Gantasala)

> Renaming table will cause losing privileges for non-admin users
> ---
>
> Key: IMPALA-12190
> URL: https://issues.apache.org/jira/browse/IMPALA-12190
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Reporter: Gabor Kaszab
>Priority: Critical
>  Labels: alter-table, authorization, ranger
>
> Let's say user 'a' gets some privileges on table 't'. When this table gets 
> renamed (even by user 'a') then user 'a' loses its privileges on that table.
>  
> Repro steps:
>  # Start impala with Ranger
>  # start impala-shell as admin (-u admin)
>  # create table tmp (i int, s string) stored as parquet;
>  # grant all on table tmp to user ;
>  # grant all on table tmp to user ;
> {code:java}
> Query: show grant user  on table tmp
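> +----------------+----------------+----------+-------+--------+-----+--------------+-------------+-----+-----------+--------------+-------------+
> | principal_type | principal_name | database | table | column | uri | storage_type | storage_uri | udf | privilege | grant_option | create_time |
> +----------------+----------------+----------+-------+--------+-----+--------------+-------------+-----+-----------+--------------+-------------+
> | USER           |                | default  | tmp   | *      |     |              |             |     | all       | false        | NULL        |
> +----------------+----------------+----------+-------+--------+-----+--------------+-------------+-----+-----------+--------------+-------------+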
> Fetched 1 row(s) in 0.01s {code}
>  #  alter table tmp rename to tmp_1234;
>  # show grant user  on table tmp_1234;
> {code:java}
> Query: show grant user  on table tmp_1234
> Fetched 0 row(s) in 0.17s{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-13075) Setting very high BATCH_SIZE can blow up memory usage of fragments

2024-05-22 Thread Riza Suminto (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-13075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848720#comment-17848720
 ] 

Riza Suminto commented on IMPALA-13075:
---

I think I managed to reproduce the issue myself with TPC-DS Q97

 
{code:java}
set RUNTIME_FILTER_MODE=OFF;
set BATCH_SIZE=65536;
set MEM_LIMIT=149mb;

use tpcds_partitioned_parquet_snap;

with ssci as (
select ss_customer_sk customer_sk
  ,ss_item_sk item_sk
from store_sales,date_dim
where ss_sold_date_sk = d_date_sk
  and d_month_seq between 1199 and 1199 + 11
group by ss_customer_sk
,ss_item_sk),
csci as(
 select cs_bill_customer_sk customer_sk
  ,cs_item_sk item_sk
from catalog_sales,date_dim
where cs_sold_date_sk = d_date_sk
  and d_month_seq between 1199 and 1199 + 11
group by cs_bill_customer_sk
,cs_item_sk)
 select  sum(case when ssci.customer_sk is not null and csci.customer_sk is 
null then 1 else 0 end) store_only
  ,sum(case when ssci.customer_sk is null and csci.customer_sk is not null 
then 1 else 0 end) catalog_only
  ,sum(case when ssci.customer_sk is not null and csci.customer_sk is not 
null then 1 else 0 end) store_and_catalog
from ssci full outer join csci on (ssci.customer_sk=csci.customer_sk
   and ssci.item_sk = csci.item_sk)
limit 100;{code}
The query failed with the following status in the profile:
{code:java}
Query State: EXCEPTION
Impala Query State: ERROR
Query Status: Memory limit exceeded: 
ParquetColumnChunkReader::ReadDataPage() failed to allocate 258 bytes for 
decompressed data.
HDFS_SCAN_NODE (id=4) could not allocate 258.00 B without exceeding limit.
Error occurred on backend rsuminto-22746:27000 by fragment 
084e26373d5447b1:c38102460006
Memory left in process limit: 11.62 GB
Memory left in query limit: -92.75 KB
Query(084e26373d5447b1:c3810246): memory limit exceeded. Limit=149.00 
MB Reservation=116.00 MB ReservationLimit=117.00 MB OtherMemory=33.09 MB 
Total=149.09 MB Peak=149.88 MB
  Unclaimed reservations: Reservation=51.00 MB OtherMemory=0 Total=51.00 MB 
Peak=117.00 MB
  Fragment 084e26373d5447b1:c38102460005: Reservation=0 OtherMemory=0 
Total=0 Peak=3.56 MB
HDFS_SCAN_NODE (id=5): Reservation=0 OtherMemory=0 Total=0 Peak=3.02 MB
KrpcDataStreamSender (dst_id=13): Total=0 Peak=49.12 KB
  RowBatchSerialization: Total=0 Peak=6.46 KB
  CodeGen: Total=1.86 KB Peak=1.86 KB
  Fragment 084e26373d5447b1:c38102460006: Reservation=11.06 MB 
OtherMemory=12.29 MB Total=23.36 MB Peak=28.42 MB
AGGREGATION_NODE (id=7): Reservation=9.00 MB OtherMemory=8.79 MB 
Total=17.79 MB Peak=18.61 MB
  GroupingAggregator 0: Reservation=9.00 MB OtherMemory=452.00 KB 
Total=9.44 MB Peak=9.44 MB
Exprs: Total=452.00 KB Peak=452.00 KB
HASH_JOIN_NODE (id=6): Reservation=1.94 MB OtherMemory=1.64 MB Total=3.58 
MB Peak=4.58 MB
  Exprs: Total=584.00 KB Peak=584.00 KB
  Hash Join Builder (join_node_id=6): Total=584.00 KB Peak=1.07 MB
Hash Join Builder (join_node_id=6) Exprs: Total=584.00 KB Peak=584.00 KB
HDFS_SCAN_NODE (id=4): Reservation=128.00 KB OtherMemory=1.32 MB Total=1.44 
MB Peak=6.20 MB
EXCHANGE_NODE (id=13): Reservation=0 OtherMemory=0 Total=0 Peak=16.00 KB
  KrpcDeferredRpcs: Total=0 Peak=0
KrpcDataStreamSender (dst_id=14): Total=42.66 KB Peak=42.66 KB
  RowBatchSerialization: Total=0 Peak=0
  Fragment 084e26373d5447b1:c38102460001: Reservation=0 OtherMemory=0 
Total=0 Peak=3.37 MB
HDFS_SCAN_NODE (id=1): Reservation=0 OtherMemory=0 Total=0 Peak=2.83 MB
KrpcDataStreamSender (dst_id=10): Total=0 Peak=49.12 KB
  RowBatchSerialization: Total=0 Peak=6.46 KB
  CodeGen: Total=1.86 KB Peak=1.86 KB
  Fragment 084e26373d5447b1:c38102460002: Reservation=19.94 MB 
OtherMemory=16.35 MB Total=36.28 MB Peak=38.69 MB
AGGREGATION_NODE (id=3): Reservation=17.00 MB OtherMemory=9.61 MB 
Total=26.61 MB Peak=26.61 MB
  GroupingAggregator 0: Reservation=17.00 MB OtherMemory=452.00 KB 
Total=17.44 MB Peak=17.44 MB
Exprs: Total=452.00 KB Peak=452.00 KB
HASH_JOIN_NODE (id=2): Reservation=1.94 MB OtherMemory=1.64 MB Total=3.58 
MB Peak=4.58 MB
  Exprs: Total=584.00 KB Peak=584.00 KB
  Hash Join Builder (join_node_id=2): Total=584.00 KB Peak=1.07 MB
Hash Join Builder (join_node_id=2) Exprs: Total=584.00 KB Peak=584.00 KB
HDFS_SCAN_NODE (id=0): Reservation=1.00 MB OtherMemory=4.32 MB Total=5.32 
MB Peak=7.96 MB
  Queued Batches: Total=4.32 MB Peak=5.64 MB
EXCHANGE_NODE (id=10): Reservation=0 OtherMemory=0 Total=0 Peak=16.00 KB
  KrpcDeferredRpcs: Total=0 Peak=0
KrpcDataStreamSender (dst_id=11): Total=254.71 KB Peak=294.71 KB
  RowBatchSerialization: Total=128.05 KB Peak=144.05 KB
  Fragment 084e26373d5447b1:c38102460009: Reservation=34.00 MB 
OtherMemory=3.84 MB Total=37.84 MB Peak=37.84 MB
AGGREGATION_NODE (id=9): Total=4.00 KB Peak=4.00 KB
  

[jira] [Resolved] (IMPALA-12657) Improve ProcessingCost of ScanNode and NonGroupingAggregator

2024-05-22 Thread Riza Suminto (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Riza Suminto resolved IMPALA-12657.
---
Resolution: Fixed

> Improve ProcessingCost of ScanNode and NonGroupingAggregator
> 
>
> Key: IMPALA-12657
> URL: https://issues.apache.org/jira/browse/IMPALA-12657
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 4.3.0
>Reporter: Riza Suminto
>Assignee: David Rorke
>Priority: Major
> Fix For: Impala 4.4.0
>
> Attachments: profile_1f4d7a679a3e12d5_42231157.txt
>
>
> Several benchmark runs measuring Impala scan performance indicate some
> costing improvement opportunities around ScanNode and NonGroupingAggregator.
> [^profile_1f4d7a679a3e12d5_42231157.txt] shows an example of a simple
> count query.
> Key takeaways:
>  # There is a strong correlation between total materialized bytes (row size *
> cardinality) and total materialized tuple time per fragment. Row
> materialization cost should be adjusted to be based on this row size instead
> of an equal cost per scan range.
>  # NonGroupingAggregator should have a much lower cost than GroupingAggregator.
> In the example above, the cost of NonGroupingAggregator dominates the scan
> fragment even though it only does simple counting instead of hash table
> operations.
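
As a rough illustration of the first takeaway, the sketch below contrasts an equal-cost-per-scan-range estimate with one proportional to materialized bytes (row size * cardinality). The function names, the per-byte coefficient, and the sample numbers are all hypothetical; this is not Impala's planner code and does not use its actual cost constants.

```python
def cost_per_scan_range(num_scan_ranges: int, cost_per_range: float = 1.0) -> float:
    """Equal cost for every scan range, regardless of how wide the rows are."""
    return num_scan_ranges * cost_per_range


def cost_by_materialized_bytes(row_size_bytes: int, cardinality: int,
                               cost_per_byte: float = 1.0) -> float:
    """Cost proportional to total materialized bytes (row size * cardinality)."""
    return row_size_bytes * cardinality * cost_per_byte


# Two hypothetical scans over the same number of scan ranges and rows,
# one materializing a narrow row and one a wide row.
narrow = cost_by_materialized_bytes(row_size_bytes=8, cardinality=1_000_000)
wide = cost_by_materialized_bytes(row_size_bytes=256, cardinality=1_000_000)

# Under the per-scan-range model both scans would be costed identically.
same_cost = cost_per_scan_range(100)
# Under the byte-based model they differ by the row-size ratio.
ratio = wide / narrow
print(same_cost, ratio)
```

The point of the byte-based form is only that two scans with the same scan-range count stop looking identical to the planner when their row widths differ.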






[jira] [Work started] (IMPALA-13106) Support larger imported query profile sizes through compression

2024-05-22 Thread Surya Hebbar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-13106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-13106 started by Surya Hebbar.
-
> Support larger imported query profile sizes through compression
> ---
>
> Key: IMPALA-13106
> URL: https://issues.apache.org/jira/browse/IMPALA-13106
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Surya Hebbar
>Assignee: Surya Hebbar
>Priority: Major
>
> Imported query profiles are currently being stored in IndexedDB.
> Although IndexedDB does not have storage limitations like other browser 
> storage APIs, there is a limit on the data that can be stored in one 
> attribute/field.
> This imposes a limitation on the size of query profiles. After some testing,
> I have found this limit to be around 220 MB.
> So, it would be helpful to use compression on JSON query profiles, allowing 
> for much larger query profiles.






[jira] [Created] (IMPALA-13106) Support larger imported query profile sizes through compression

2024-05-22 Thread Surya Hebbar (Jira)
Surya Hebbar created IMPALA-13106:
-

 Summary: Support larger imported query profile sizes through 
compression
 Key: IMPALA-13106
 URL: https://issues.apache.org/jira/browse/IMPALA-13106
 Project: IMPALA
  Issue Type: Improvement
Reporter: Surya Hebbar
Assignee: Surya Hebbar


Imported query profiles are currently being stored in IndexedDB.

Although IndexedDB does not have storage limitations like other browser storage 
APIs, there is a limit on the data that can be stored in one attribute/field.

This imposes a limitation on the size of query profiles. After some testing, I
have found this limit to be around 220 MB.

So, it would be helpful to use compression on JSON query profiles, allowing for 
much larger query profiles.






[jira] [Comment Edited] (IMPALA-13105) Multiple imported query profiles fail to import/clear at once

2024-05-22 Thread Surya Hebbar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848606#comment-17848606
 ] 

Surya Hebbar edited comment on IMPALA-13105 at 5/22/24 1:24 PM:


Clearing the imported query profiles fails, because the page reloads before 
clearing the object store.

This can be fixed by calling page reload after clearing the object store succeeds.
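
The ordering fix can be sketched as follows. This is a language-neutral model in Python using `asyncio`, not the web UI's code: `FakeObjectStore`, `clear_then_reload`, and the `reload_page` callback are hypothetical names. In the browser, the equivalent would be to trigger the reload from the success handler of the clear request rather than immediately after issuing it.

```python
import asyncio


class FakeObjectStore:
    """Hypothetical stand-in for the object store holding imported profiles."""

    def __init__(self, profiles):
        self.profiles = list(profiles)

    async def clear(self):
        await asyncio.sleep(0.01)  # the clear completes asynchronously
        self.profiles.clear()


async def clear_then_reload(store, reload_page):
    # Reload only after the clear has completed, not right after requesting it.
    await store.clear()
    reload_page(store)


events = []
store = FakeObjectStore(["profile_a", "profile_b"])
# The fake "reload" just records how many profiles are still present.
asyncio.run(clear_then_reload(store, lambda s: events.append(len(s.profiles))))
print(events)  # [0]
```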


was (Author: JIRAUSER299620):
Clearing the imported query profiles fails, because the page reloads before 
verifying the success of clearing the object store.

This can be fixed by calling page reload after verifying success of clearing the
object store.

> Multiple imported query profiles fail to import/clear at once
> -
>
> Key: IMPALA-13105
> URL: https://issues.apache.org/jira/browse/IMPALA-13105
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Surya Hebbar
>Assignee: Surya Hebbar
>Priority: Major
>
> When multiple query profiles are chosen at once, the last query profile in 
> the insertion queue fails as the page reloads without providing a delay for 
> inserting it.
>  
> The same behavior is seen when clearing all the query profiles.
>  
> This is mostly seen in Chromium based browsers.






[jira] [Updated] (IMPALA-13105) Multiple imported query profiles fail to import/clear at once

2024-05-22 Thread Surya Hebbar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surya Hebbar updated IMPALA-13105:
--
Description: 
When multiple query profiles are chosen at once, the last query profile in the 
insertion queue fails as the page reloads without providing a delay for 
inserting it.

 

The same behavior is seen when clearing all the query profiles.

 

This is mostly seen in Chromium based browsers.

  was:
When multiple query profiles are chosen at once, they fail to be imported into 
IndexedDB.

 

The same behavior is seen when clearing all the query profiles.

 

This is mostly seen in Chromium based browsers.


> Multiple imported query profiles fail to import/clear at once
> -
>
> Key: IMPALA-13105
> URL: https://issues.apache.org/jira/browse/IMPALA-13105
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Surya Hebbar
>Assignee: Surya Hebbar
>Priority: Major
>
> When multiple query profiles are chosen at once, the last query profile in 
> the insertion queue fails as the page reloads without providing a delay for 
> inserting it.
>  
> The same behavior is seen when clearing all the query profiles.
>  
> This is mostly seen in Chromium based browsers.






[jira] [Updated] (IMPALA-13105) Multiple imported query profiles fail to import/clear at once

2024-05-22 Thread Surya Hebbar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surya Hebbar updated IMPALA-13105:
--
Summary: Multiple imported query profiles fail to import/clear at once  
(was: Multiple query profiles fail to import/clear at once)

> Multiple imported query profiles fail to import/clear at once
> -
>
> Key: IMPALA-13105
> URL: https://issues.apache.org/jira/browse/IMPALA-13105
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Surya Hebbar
>Assignee: Surya Hebbar
>Priority: Major
>
> When multiple query profiles are chosen at once, they fail to be imported 
> into IndexedDB.
>  
> The same behavior is seen when clearing all the query profiles.
>  
> This is mostly seen in Chromium based browsers.






[jira] [Commented] (IMPALA-13105) Multiple query profiles fail to import/clear at once

2024-05-22 Thread Surya Hebbar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848607#comment-17848607
 ] 

Surya Hebbar commented on IMPALA-13105:
---

Importing multiple query profiles fails as there is no delay between the last 
query insertion and reloading the page.

This can be fixed by calling page reload with a small amount of delay after query
insertion.

> Multiple query profiles fail to import/clear at once
> 
>
> Key: IMPALA-13105
> URL: https://issues.apache.org/jira/browse/IMPALA-13105
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Surya Hebbar
>Assignee: Surya Hebbar
>Priority: Major
>
> When multiple query profiles are chosen at once, they fail to be imported 
> into IndexedDB.
>  
> The same behavior is seen when clearing all the query profiles.
>  
> This is mostly seen in Chromium based browsers.






[jira] [Commented] (IMPALA-13105) Multiple query profiles fail to import/clear at once

2024-05-22 Thread Surya Hebbar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848606#comment-17848606
 ] 

Surya Hebbar commented on IMPALA-13105:
---

Clearing the imported query profiles fails, because the page reloads before 
verifying the success of clearing the object store.

This can be fixed by calling page reload after verifying success of clearing the
object store.

> Multiple query profiles fail to import/clear at once
> 
>
> Key: IMPALA-13105
> URL: https://issues.apache.org/jira/browse/IMPALA-13105
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Surya Hebbar
>Assignee: Surya Hebbar
>Priority: Major
>
> When multiple query profiles are chosen at once, they fail to be imported 
> into IndexedDB.
>  
> The same behavior is seen when clearing all the query profiles.
>  
> This is mostly seen in Chromium based browsers.






[jira] [Updated] (IMPALA-13105) Multiple query profiles fail to import/clear at once

2024-05-22 Thread Surya Hebbar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surya Hebbar updated IMPALA-13105:
--
Description: 
When multiple query profiles are chosen at once, they fail to be imported into 
IndexedDB.

 

The same behavior is seen when clearing all the query profiles.

 

This is mostly seen in Chromium based browsers.

  was:
When multiple query profiles are chosen at once, they fail to be imported into 
IndexedDB.

 

The same behavior is seen when clearing all the query profiles. Page reload is 
triggered before verifying the success of clearing object store.

 

This is mostly seen in Chromium based browsers.


> Multiple query profiles fail to import/clear at once
> 
>
> Key: IMPALA-13105
> URL: https://issues.apache.org/jira/browse/IMPALA-13105
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Surya Hebbar
>Assignee: Surya Hebbar
>Priority: Major
>
> When multiple query profiles are chosen at once, they fail to be imported 
> into IndexedDB.
>  
> The same behavior is seen when clearing all the query profiles.
>  
> This is mostly seen in Chromium based browsers.






[jira] [Updated] (IMPALA-13105) Multiple query profiles fail to import/clear at once

2024-05-22 Thread Surya Hebbar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surya Hebbar updated IMPALA-13105:
--
Description: 
When multiple query profiles are chosen at once, they fail to be imported into 
IndexedDB.

 

The same behavior is seen when clearing all the query profiles. Page reload is 
triggered before verifying the success of clearing object store.

 

This is mostly seen in Chromium based browsers.

  was:
When multiple query profiles are chosen at once, they fail to be imported into 
IndexedDB.

 

The same behavior is seen when clearing all the query profiles. Page reload is 
triggered before verifying the success of clearing object store.

 

This behavior is mostly seen in Chromium based browsers.


> Multiple query profiles fail to import/clear at once
> 
>
> Key: IMPALA-13105
> URL: https://issues.apache.org/jira/browse/IMPALA-13105
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Surya Hebbar
>Assignee: Surya Hebbar
>Priority: Major
>
> When multiple query profiles are chosen at once, they fail to be imported 
> into IndexedDB.
>  
> The same behavior is seen when clearing all the query profiles. Page reload 
> is triggered before verifying the success of clearing object store.
>  
> This is mostly seen in Chromium based browsers.






[jira] [Updated] (IMPALA-13105) Multiple query profiles fail to import/clear at once

2024-05-22 Thread Surya Hebbar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surya Hebbar updated IMPALA-13105:
--
Description: 
When multiple query profiles are chosen at once, they fail to be imported into 
IndexedDB.

 

The same behavior is seen when clearing all the query profiles. Page reload is 
triggered before verifying the success of clearing object store.

 

This behavior is mostly seen in Chromium based browsers.

  was:
When multiple query profiles are chosen at once, they fail to be imported into 
IndexedDB.

 

This behavior is mostly seen in Chromium based browsers.


> Multiple query profiles fail to import/clear at once
> 
>
> Key: IMPALA-13105
> URL: https://issues.apache.org/jira/browse/IMPALA-13105
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Surya Hebbar
>Assignee: Surya Hebbar
>Priority: Major
>
> When multiple query profiles are chosen at once, they fail to be imported 
> into IndexedDB.
>  
> The same behavior is seen when clearing all the query profiles. Page reload 
> is triggered before verifying the success of clearing object store.
>  
> This behavior is mostly seen in Chromium based browsers.






[jira] [Updated] (IMPALA-13105) Multiple query profiles fail to import/clear at once

2024-05-22 Thread Surya Hebbar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surya Hebbar updated IMPALA-13105:
--
Summary: Multiple query profiles fail to import/clear at once  (was: 
Multiple query profiles fail to import at once)

> Multiple query profiles fail to import/clear at once
> 
>
> Key: IMPALA-13105
> URL: https://issues.apache.org/jira/browse/IMPALA-13105
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Surya Hebbar
>Assignee: Surya Hebbar
>Priority: Major
>
> When multiple query profiles are chosen at once, they fail to be imported 
> into IndexedDB.
>  
> This behavior is mostly seen in Chromium based browsers.






[jira] [Work started] (IMPALA-13105) Multiple query profiles fail to import at once

2024-05-22 Thread Surya Hebbar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-13105 started by Surya Hebbar.
-
> Multiple query profiles fail to import at once
> --
>
> Key: IMPALA-13105
> URL: https://issues.apache.org/jira/browse/IMPALA-13105
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Surya Hebbar
>Assignee: Surya Hebbar
>Priority: Major
>
> When multiple query profiles are chosen at once, they fail to be imported 
> into IndexedDB.
>  
> This behavior is mostly seen in Chromium based browsers.






[jira] [Created] (IMPALA-13105) Multiple query profiles fail to import at once

2024-05-22 Thread Surya Hebbar (Jira)
Surya Hebbar created IMPALA-13105:
-

 Summary: Multiple query profiles fail to import at once
 Key: IMPALA-13105
 URL: https://issues.apache.org/jira/browse/IMPALA-13105
 Project: IMPALA
  Issue Type: Bug
Reporter: Surya Hebbar
Assignee: Surya Hebbar


When multiple query profiles are chosen at once, they fail to be imported into 
IndexedDB.

 

This behavior is mostly seen in Chromium based browsers.






[jira] [Created] (IMPALA-13104) Impala allows creation of managed non-acid table

2024-05-22 Thread Zoltán Borók-Nagy (Jira)
Zoltán Borók-Nagy created IMPALA-13104:
--

 Summary: Impala allows creation of managed non-acid table
 Key: IMPALA-13104
 URL: https://issues.apache.org/jira/browse/IMPALA-13104
 Project: IMPALA
  Issue Type: Bug
Reporter: Zoltán Borók-Nagy


One is able to create a managed non-acid table using the following statement:

{noformat}
CREATE TABLE managed_non_acid
(event string)
PARTITIONED BY (century int)
TBLPROPERTIES ('transactional' = 'false', 
'transactional_properties'='insert_only');
{noformat}

The table properties contradict each other so Impala should raise an error.
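
The kind of check being asked for could look roughly like this. It is a hypothetical Python sketch, not Impala's analyzer code; the function name and the exact rule (reject `transactional_properties` whenever `transactional` is `false`) are assumptions about what "contradict" should mean here.

```python
def check_transactional_properties(tblproperties: dict) -> None:
    """Raise ValueError if the properties describe a non-transactional table
    that nevertheless sets 'transactional_properties'."""
    transactional = tblproperties.get("transactional", "").lower()
    if transactional == "false" and "transactional_properties" in tblproperties:
        raise ValueError(
            "'transactional_properties' is set but 'transactional' is 'false'"
        )


# The properties from the statement above are rejected.
try:
    check_transactional_properties(
        {"transactional": "false", "transactional_properties": "insert_only"}
    )
    rejected = False
except ValueError as err:
    rejected = True
    print(f"rejected: {err}")
```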






[jira] [Commented] (IMPALA-12190) Renaming table will cause losing privileges for non-admin users

2024-05-22 Thread Gabor Kaszab (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848515#comment-17848515
 ] 

Gabor Kaszab commented on IMPALA-12190:
---

I don't think this can be trivially implemented from the Impala side. I recall we
also opened a Ranger ticket after the analysis of this issue and agreed that
Ranger should first provide an API that clients can use when
resources are renamed.

> Renaming table will cause losing privileges for non-admin users
> ---
>
> Key: IMPALA-12190
> URL: https://issues.apache.org/jira/browse/IMPALA-12190
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Reporter: Gabor Kaszab
>Assignee: Sai Hemanth Gantasala
>Priority: Critical
>  Labels: alter-table, authorization, ranger
>
> Let's say user 'a' gets some privileges on table 't'. When this table gets 
> renamed (even by user 'a') then user 'a' loses its privileges on that table.
>  
> Repro steps:
>  # Start impala with Ranger
>  # start impala-shell as admin (-u admin)
>  # create table tmp (i int, s string) stored as parquet;
>  # grant all on table tmp to user ;
>  # grant all on table tmp to user ;
> {code:java}
> Query: show grant user  on table tmp
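> +----------------+----------------+----------+-------+--------+-----+--------------+-------------+-----+-----------+--------------+-------------+
> | principal_type | principal_name | database | table | column | uri | storage_type | storage_uri | udf | privilege | grant_option | create_time |
> +----------------+----------------+----------+-------+--------+-----+--------------+-------------+-----+-----------+--------------+-------------+
> | USER           |                | default  | tmp   | *      |     |              |             |     | all       | false        | NULL        |
> +----------------+----------------+----------+-------+--------+-----+--------------+-------------+-----+-----------+--------------+-------------+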
> Fetched 1 row(s) in 0.01s {code}
>  #  alter table tmp rename to tmp_1234;
>  # show grant user  on table tmp_1234;
> {code:java}
> Query: show grant user  on table tmp_1234
> Fetched 0 row(s) in 0.17s{code}






[jira] [Commented] (IMPALA-12190) Renaming table will cause losing privileges for non-admin users

2024-05-22 Thread Quanlong Huang (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848508#comment-17848508
 ] 

Quanlong Huang commented on IMPALA-12190:
-

Column masking and row filtering policies will also be messed up by RENAME. I
think tag-based policies will also be messed up if data lineages are not updated
accordingly.

+1 for a new Ranger API that returns all policies matching a given table (and 
optionally for a given user). We also need this to improve IMPALA-11501 to 
avoid loading the table schema from HMS. Currently, to check whether a user has 
a corresponding column masking policy on a table, we have to load the table to 
get all the column names and check whether there are policies on each column, 
which is inefficient.

> Renaming table will cause losing privileges for non-admin users
> ---
>
> Key: IMPALA-12190
> URL: https://issues.apache.org/jira/browse/IMPALA-12190
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Reporter: Gabor Kaszab
>Assignee: Sai Hemanth Gantasala
>Priority: Critical
>  Labels: alter-table, authorization, ranger
>
> Let's say user 'a' gets some privileges on table 't'. When this table gets 
> renamed (even by user 'a') then user 'a' loses its privileges on that table.
>  
> Repro steps:
>  # Start impala with Ranger
>  # start impala-shell as admin (-u admin)
>  # create table tmp (i int, s string) stored as parquet;
>  # grant all on table tmp to user ;
>  # grant all on table tmp to user ;
> {code:java}
> Query: show grant user  on table tmp
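> +----------------+----------------+----------+-------+--------+-----+--------------+-------------+-----+-----------+--------------+-------------+
> | principal_type | principal_name | database | table | column | uri | storage_type | storage_uri | udf | privilege | grant_option | create_time |
> +----------------+----------------+----------+-------+--------+-----+--------------+-------------+-----+-----------+--------------+-------------+
> | USER           |                | default  | tmp   | *      |     |              |             |     | all       | false        | NULL        |
> +----------------+----------------+----------+-------+--------+-----+--------------+-------------+-----+-----------+--------------+-------------+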
> Fetched 1 row(s) in 0.01s {code}
>  #  alter table tmp rename to tmp_1234;
>  # show grant user  on table tmp_1234;
> {code:java}
> Query: show grant user  on table tmp_1234
> Fetched 0 row(s) in 0.17s{code}






[jira] [Comment Edited] (IMPALA-12190) Renaming table will cause losing privileges for non-admin users

2024-05-22 Thread Fang-Yu Rao (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848439#comment-17848439
 ] 

Fang-Yu Rao edited comment on IMPALA-12190 at 5/22/24 6:39 AM:
---

This JIRA does not seem to be straightforward to resolve on the Impala side 
alone because the error handling could be tricky. I think we may need Apache 
Ranger to provide an API that could take care of this for us (Apache Impala). 
Specifically, it would be great if there is a Ranger API that is able to modify 
the policies accordingly when the catalog server alters the name of a table. 
For instance, when the catalog server is executing ALTER TABLE RENAME, the 
catalog server also sends to the Ranger server via Impala's Ranger plug-in a 
request to change the name of the table in Ranger's policy repository if there 
is a policy matching this table. Ranger stores its policies in its backend 
database, so it would be much easier for Ranger to manage this operation,
especially when an error/exception occurs during the execution
of the operation.

 

If we'd like to resolve this from Apache Impala alone, then we have to be able 
to do the following properly.
 # Retrieve the policy matching the name of the table whose name is going to be 
altered.
 # For each grantee principal (which could be a user, a group, or a role) in the 
policy retrieved above, invoke the REVOKE API to revoke this grantee's 
privileges on the old table (the table before the renaming) and then invoke the 
GRANT API to grant those previously revoked privileges to this grantee on the 
new table (the table with the new name). A grantee could have multiple 
privileges on the table, so multiple REVOKE/GRANT API calls could be required.

It seems a bit tricky to handle the errors that occur during the 2nd step 
described above. For instance, assume that a grantee has only one privilege 
granted on the old table. What should the catalog server do when the 
GRANT API call fails after its corresponding REVOKE API call? Should we roll 
back the REVOKE API call? Or should we retry the GRANT API call?

The policy for a table could also involve multiple principals. What should we 
do when the operation corresponding to a grantee principal fails?
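
The REVOKE/GRANT sequence and its failure handling can be sketched as follows. 
This is only an illustration under stated assumptions: PrivilegeApi, TableGrant, 
RenameMigration, and InMemoryPrivilegeApi are hypothetical names, not Ranger or 
Impala APIs. The choice made here (retry the GRANT a few times, then re-grant on 
the old table and undo the pairs already moved) is just one possible answer to 
the questions above.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical abstraction over GRANT/REVOKE calls; not a Ranger or Impala API. */
interface PrivilegeApi {
  void grant(String grantee, String privilege, String table) throws Exception;
  void revoke(String grantee, String privilege, String table) throws Exception;
}

/** One (grantee, privilege) pair taken from the policy on the old table. */
final class TableGrant {
  final String grantee;
  final String privilege;
  TableGrant(String grantee, String privilege) {
    this.grantee = grantee;
    this.privilege = privilege;
  }
}

final class RenameMigration {
  static final int MAX_GRANT_ATTEMPTS = 3;  // arbitrary value for illustration

  /** Moves every grant from oldTable to newTable; returns false if it gave up. */
  static boolean migrate(PrivilegeApi api, List<TableGrant> grants,
      String oldTable, String newTable) {
    List<TableGrant> moved = new ArrayList<>();
    for (TableGrant g : grants) {
      try {
        api.revoke(g.grantee, g.privilege, oldTable);
      } catch (Exception e) {
        undo(api, moved, oldTable, newTable);
        return false;
      }
      if (!grantWithRetry(api, g, newTable)) {
        // Compensate for the REVOKE that just succeeded, then undo earlier pairs.
        grantWithRetry(api, g, oldTable);
        undo(api, moved, oldTable, newTable);
        return false;
      }
      moved.add(g);
    }
    return true;
  }

  private static boolean grantWithRetry(PrivilegeApi api, TableGrant g, String table) {
    for (int attempt = 1; attempt <= MAX_GRANT_ATTEMPTS; attempt++) {
      try {
        api.grant(g.grantee, g.privilege, table);
        return true;
      } catch (Exception e) {
        // Fall through and try again until the attempts are used up.
      }
    }
    return false;
  }

  private static void undo(PrivilegeApi api, List<TableGrant> moved,
      String oldTable, String newTable) {
    for (TableGrant g : moved) {
      try {
        api.revoke(g.grantee, g.privilege, newTable);
        api.grant(g.grantee, g.privilege, oldTable);
      } catch (Exception e) {
        // A failed compensation leaves the policies inconsistent. The sketch
        // ignores it; real code would at least have to report it.
      }
    }
  }
}

/** In-memory stand-in used only to exercise the sketch. */
final class InMemoryPrivilegeApi implements PrivilegeApi {
  final Set<String> granted = new HashSet<>();
  String failGrantsOnTable = null;  // when set, grant() on this table throws

  static String key(String grantee, String privilege, String table) {
    return grantee + "|" + privilege + "|" + table;
  }

  public void grant(String grantee, String privilege, String table) throws Exception {
    if (table.equals(failGrantsOnTable)) throw new Exception("injected GRANT failure");
    granted.add(key(grantee, privilege, table));
  }

  public void revoke(String grantee, String privilege, String table) {
    granted.remove(key(grantee, privilege, table));
  }
}
```

Even in this small sketch the compensation steps can themselves fail, and 
nothing makes the whole sequence atomic, which is why handling the rename 
inside Ranger looks more attractive.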

 

On the other hand, there does not seem to be a Ranger API that allows us to 
retrieve the exact policy matching a given table name.
There is a Ranger API that could return an access control list (ACL) given the 
name of a resource, e.g., the table "functional.alltypes". A place where we 
call this is within RangerImpaladAuthorizationManager#getPrivileges() 
([plugin_.get().getResourceACLs(request)|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/authorization/ranger/RangerImpaladAuthorizationManager.java#L367]),
 which could be triggered by a statement like "SHOW GRANT USER non_owner ON 
TABLE functional.alltypes".

For instance, given the table name "functional.alltypes", we could get a 
HashMap called "userACLs", and the contents of this map could look like the 
following. Note that in the following, only the first map corresponds to the 
policy in which the resource is exactly the table "functional.alltypes". This 
policy was created by an administrative user via "GRANT SELECT ON TABLE 
functional.alltypes to USER non_owner". The rest of the maps were inferred from 
other policies. Take the 2nd map as an example: the user "hdfs" has privileges on 
the table "functional.alltypes" through the policy that grants "hdfs" the ALL 
privilege on all the databases, tables, and columns.
 # "non_owner" -> \{"select" -> "ALLOWED"}
 # "hdfs" -> \{"all" -> "ALLOWED", "drop" -> "ALLOWED", ...}
 # "admin" -> \{"drop" -> "ALLOWED", "all" -> "ALLOWED", ...}
 # "\{OWNER}" -> \{"all" -> "ALLOWED", "drop" -> "ALLOWED", ...}
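
To make the limitation concrete, here is a minimal sketch of what would happen 
if the ACL map above were used directly as the source of grants for the renamed 
table. The class AclMigrationPlan, its methods, and the new table name are 
hypothetical. The access results are simplified to plain strings, which is an 
assumption and not necessarily the value type returned by the Ranger plug-in. 
Only the entries shown above are included; the elided ones are left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class AclMigrationPlan {
  /** Builds one user's access map with every listed access type ALLOWED. */
  static Map<String, String> allowed(String... accessTypes) {
    Map<String, String> m = new LinkedHashMap<>();
    for (String t : accessTypes) m.put(t, "ALLOWED");
    return m;
  }

  /** The example "userACLs" contents from the comment (elided entries omitted). */
  static Map<String, Map<String, String>> exampleAcls() {
    Map<String, Map<String, String>> acls = new LinkedHashMap<>();
    acls.put("non_owner", allowed("select"));
    acls.put("hdfs", allowed("all", "drop"));
    acls.put("admin", allowed("drop", "all"));
    acls.put("{OWNER}", allowed("all", "drop"));
    return acls;
  }

  /** Naive plan: one GRANT statement per ALLOWED entry, whatever policy it came from. */
  static List<String> naiveGrants(Map<String, Map<String, String>> userAcls,
      String newTable) {
    List<String> stmts = new ArrayList<>();
    for (Map.Entry<String, Map<String, String>> user : userAcls.entrySet()) {
      for (Map.Entry<String, String> access : user.getValue().entrySet()) {
        if ("ALLOWED".equals(access.getValue())) {
          stmts.add("GRANT " + access.getKey() + " ON TABLE " + newTable
              + " TO USER " + user.getKey());
        }
      }
    }
    return stmts;
  }
}
```

For the example map, naiveGrants() produces seven statements, but only the one 
for "non_owner" corresponds to the policy defined on the table itself. The other 
six would duplicate privileges that come from broader policies, and the 
"{OWNER}" key does not look like a real user name at all.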

Tagged [~stigahuang] and [~csringhofer] here since they are also experts in 
this area on the Impala side.

Tagged [~rmani] and [~abhayk] here too since they are the experts on the Ranger 
side.



[jira] [Commented] (IMPALA-13040) SIGSEGV in QueryState::UpdateFilterFromRemote

2024-05-22 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-13040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848461#comment-17848461
 ] 

ASF subversion and git services commented on IMPALA-13040:
--

Commit aa01079478773aed28c9a4d8b07c062202de698d in impala's branch 
refs/heads/master from Riza Suminto
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=aa0107947 ]

IMPALA-13040: (addendum) Inject larger delay for sanitized build

TestLateQueryStateInit has been flaky in sanitized build because the
largest delay injection time is fixed at 3 seconds. This patch fixes
the issue by setting largest delay injection time equal to
RUNTIME_FILTER_WAIT_TIME_MS, which is 3 seconds for regular build and 10
seconds for sanitized build.

Testing:
- Loop and pass test_runtime_filter_aggregation.py 10 times in ASAN
  build and 50 times in UBSAN build.

Change-Id: I09e5ae4646f53632e9a9f519d370a33a5534df19
Reviewed-on: http://gerrit.cloudera.org:8080/21439
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> SIGSEGV in QueryState::UpdateFilterFromRemote
> --
>
> Key: IMPALA-13040
> URL: https://issues.apache.org/jira/browse/IMPALA-13040
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Csaba Ringhofer
>Assignee: Riza Suminto
>Priority: Critical
> Fix For: Impala 4.5.0
>
>
> {code}
> Crash reason:  SIGSEGV /SEGV_MAPERR
> Crash address: 0x48
> Process uptime: not available
> Thread 114 (crashed)
>  0  libpthread.so.0 + 0x9d00
> rax = 0x00019e57ad00   rdx = 0x2a656720
> rcx = 0x059a9860   rbx = 0x
> rsi = 0x00019e57ad00   rdi = 0x0038
> rbp = 0x7f6233d544e0   rsp = 0x7f6233d544a8
>  r8 = 0x06a53540r9 = 0x0039
> r10 = 0x   r11 = 0x000a
> r12 = 0x00019e57ad00   r13 = 0x7f62a2f997d0
> r14 = 0x7f6233d544f8   r15 = 0x1632c0f0
> rip = 0x7f62a2f96d00
> Found by: given as instruction pointer in context
>  1  
> impalad!impala::QueryState::UpdateFilterFromRemote(impala::UpdateFilterParamsPB
>  const&, kudu::rpc::RpcContext*) [query-state.cc : 1033 + 0x5]
> rbp = 0x7f6233d54520   rsp = 0x7f6233d544f0
> rip = 0x015c0837
> Found by: previous frame's frame pointer
>  2  
> impalad!impala::DataStreamService::UpdateFilterFromRemote(impala::UpdateFilterParamsPB
>  const*, impala::UpdateFilterResultPB*, kudu::rpc::RpcContext*) 
> [data-stream-service.cc : 134 + 0xb]
> rbp = 0x7f6233d54640   rsp = 0x7f6233d54530
> rip = 0x017c05de
> Found by: previous frame's frame pointer
> {code}
> The line that crashes is 
> https://github.com/apache/impala/blob/b39cd79ae84c415e0aebec2c2b4d7690d2a0cc7a/be/src/runtime/query-state.cc#L1033
> My guess is that the actual segfault is inside WaitForPrepare(), which was 
> inlined. I am not sure whether a remote filter can arrive before 
> QueryState::Init has finished. That would explain the issue, as 
> instances_prepared_barrier_ is not yet created at that point.






[jira] [Commented] (IMPALA-12800) Queries with many nested inline views see performance issues with ExprSubstitutionMap

2024-05-22 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848462#comment-17848462
 ] 

ASF subversion and git services commented on IMPALA-12800:
--

Commit ae6846b1cd039b2cd6f8753ce3ff810c5b2d3ce3 in impala's branch 
refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=ae6846b1c ]

IMPALA-12800: Skip O(n^2) ExprSubstitutionMap::verify() for release builds

ExprSubstitutionMap::compose() and combine() call verify() to
check the new ExprSubstitutionMap for duplicates. This algorithm
is O(n^2) and can add significant overhead to SQLs with a large
number of expressions or inline views. This changes verify() to
skip the check for release builds (keeping it for debug builds).

In a query with 20+ layers of inline views and thousands of
expressions, turning off the verify() call cuts the execution
time from 51 minutes to 18 minutes.

This doesn't fully solve slowness in ExprSubstitutionMap.
Further improvement would require Expr to support hash-based
algorithms, which is a much larger change.

Testing:
 - Manual performance comparison with/without the verify() call

Change-Id: Ieeacfec6a5b487076ce5b19747319630616411f0
Reviewed-on: http://gerrit.cloudera.org:8080/21444
Reviewed-by: Joe McDonnell 
Tested-by: Impala Public Jenkins 
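
The trade-off described in the commit message above can be illustrated with a 
small sketch. DuplicateCheck and its methods are hypothetical and are not the 
Impala code. Gating the check with a Java assert (active only under "java -ea") 
is an assumption chosen for brevity; the actual patch may tell debug and release 
builds apart differently.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class DuplicateCheck {
  /** Pairwise check that relies only on equals(): O(n^2) comparisons. */
  static <T> boolean hasDuplicatePairwise(List<T> items) {
    for (int i = 0; i < items.size(); i++) {
      for (int j = i + 1; j < items.size(); j++) {
        if (items.get(i).equals(items.get(j))) return true;
      }
    }
    return false;
  }

  /**
   * Hash-based check: expected O(n), but only correct when hashCode() is
   * consistent with equals() for the element type.
   */
  static <T> boolean hasDuplicateHashed(List<T> items) {
    Set<T> seen = new HashSet<>();
    for (T item : items) {
      if (!seen.add(item)) return true;
    }
    return false;
  }

  /** Runs the quadratic check only when assertions are enabled. */
  static <T> void verifyNoDuplicates(List<T> items) {
    assert !hasDuplicatePairwise(items) : "duplicate entry";
  }
}
```

The hashed variant is the kind of algorithm the commit message says would need 
hash support in Expr, which is why the patch only skips the quadratic check 
instead of replacing it.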


> Queries with many nested inline views see performance issues with 
> ExprSubstitutionMap
> -
>
> Key: IMPALA-12800
> URL: https://issues.apache.org/jira/browse/IMPALA-12800
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 4.3.0
>Reporter: Joe McDonnell
>Priority: Critical
> Attachments: impala12800repro.sql, impala12800schema.sql, 
> long_query_jstacks.tar.gz
>
>
> A user running a query with many layers of inline views saw a large amount of 
> time spent in analysis. 
>  
> {noformat}
> - Authorization finished (ranger): 7s518ms (13.134ms)
> - Value transfer graph computed: 7s760ms (241.953ms)
> - Single node plan created: 2m47s (2m39s)
> - Distributed plan created: 2m47s (7.430ms)
> - Lineage info computed: 2m47s (39.017ms)
> - Planning finished: 2m47s (672.518ms){noformat}
> In reproducing it locally, we found that most of the stacks end up in 
> ExprSubstitutionMap.
>  
> Here are the main stacks seen while running jstack every 3 seconds during a 
> 75 second execution:
> Location 1: (ExprSubstitutionMap::compose -> contains -> indexOf -> Expr 
> equals) (4 samples)
> {noformat}
>    java.lang.Thread.State: RUNNABLE
>     at org.apache.impala.analysis.Expr.equals(Expr.java:1008)
>     at java.util.ArrayList.indexOf(ArrayList.java:323)
>     at java.util.ArrayList.contains(ArrayList.java:306)
>     at 
> org.apache.impala.analysis.ExprSubstitutionMap.compose(ExprSubstitutionMap.java:120){noformat}
> Location 2:  (ExprSubstitutionMap::compose -> verify -> Expr equals) (9 
> samples)
> {noformat}
>    java.lang.Thread.State: RUNNABLE
>     at org.apache.impala.analysis.Expr.equals(Expr.java:1008)
>     at 
> org.apache.impala.analysis.ExprSubstitutionMap.verify(ExprSubstitutionMap.java:173)
>     at 
> org.apache.impala.analysis.ExprSubstitutionMap.compose(ExprSubstitutionMap.java:126){noformat}
> Location 3: (ExprSubstitutionMap::combine -> verify -> Expr equals) (5 
> samples)
> {noformat}
>    java.lang.Thread.State: RUNNABLE
>     at org.apache.impala.analysis.Expr.equals(Expr.java:1008)
>     at 
> org.apache.impala.analysis.ExprSubstitutionMap.verify(ExprSubstitutionMap.java:173)
>     at 
> org.apache.impala.analysis.ExprSubstitutionMap.combine(ExprSubstitutionMap.java:143){noformat}
> Location 4:  (TupleIsNullPredicate.wrapExprs ->  Analyzer.isTrueWithNullSlots 
> -> FeSupport.EvalPredicate -> Thrift serialization) (4 samples)
> {noformat}
>    java.lang.Thread.State: RUNNABLE
>     at java.lang.StringCoding.encode(StringCoding.java:364)
>     at java.lang.String.getBytes(String.java:941)
>     at 
> org.apache.thrift.protocol.TBinaryProtocol.writeString(TBinaryProtocol.java:227)
>     at 
> org.apache.impala.thrift.TClientRequest$TClientRequestStandardScheme.write(TClientRequest.java:532)
>     at 
> org.apache.impala.thrift.TClientRequest$TClientRequestStandardScheme.write(TClientRequest.java:467)
>     at org.apache.impala.thrift.TClientRequest.write(TClientRequest.java:394)
>     at 
> org.apache.impala.thrift.TQueryCtx$TQueryCtxStandardScheme.write(TQueryCtx.java:3034)
>     at 
> org.apache.impala.thrift.TQueryCtx$TQueryCtxStandardScheme.write(TQueryCtx.java:2709)
>     at org.apache.impala.thrift.TQueryCtx.write(TQueryCtx.java:2400)
>     at org.apache.thrift.TSerializer.serialize(TSerializer.java:84)
>     at 
>