[jira] [Commented] (IMPALA-7918) Remove support for authorization policy file

2019-05-02 Thread Fredy Wijaya (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16832098#comment-16832098
 ] 

Fredy Wijaya commented on IMPALA-7918:
--

[~arodoni_cloudera] yeah it should be fine.

> Remove support for authorization policy file
> 
>
> Key: IMPALA-7918
> URL: https://issues.apache.org/jira/browse/IMPALA-7918
> Project: IMPALA
> Issue Type: Sub-task
> Components: Catalog, Frontend
> Affects Versions: Impala 3.2.0
> Reporter: Fredy Wijaya
> Assignee: Austin Nobis
> Priority: Critical
> Fix For: Impala 3.3.0
>
>
> Support for the authorization policy file has been deprecated in Impala, and it 
> does not work with object ownership. Furthermore, the authorization policy file 
> is very specific to Sentry. Continuing to support it will make it difficult to 
> create a generic authorization framework in Impala. Hence, this task involves 
> removing support for the authorization policy file.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-7977) Impala Doc: Doc the support for fine-grained updates at partition level

2019-05-02 Thread Alex Rodoni (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni updated IMPALA-7977:

Labels: future_release_doc in_33  (was: future_release_doc in_32)

> Impala Doc: Doc the support for fine-grained updates at partition level
> ---
>
> Key: IMPALA-7977
> URL: https://issues.apache.org/jira/browse/IMPALA-7977
> Project: IMPALA
> Issue Type: Sub-task
> Components: Docs
> Reporter: Alex Rodoni
> Assignee: Alex Rodoni
> Priority: Major
> Labels: future_release_doc, in_33
>







[jira] [Updated] (IMPALA-7375) Impala Doc: Doc DATE functions

2019-05-02 Thread Alex Rodoni (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni updated IMPALA-7375:

Labels: future_release_doc in_33  (was: future_release_doc)

> Impala Doc: Doc DATE functions
> --
>
> Key: IMPALA-7375
> URL: https://issues.apache.org/jira/browse/IMPALA-7375
> Project: IMPALA
> Issue Type: Sub-task
> Components: Docs
> Reporter: Alex Rodoni
> Assignee: Alex Rodoni
> Priority: Major
> Labels: future_release_doc, in_33
>







[jira] [Updated] (IMPALA-7375) Impala Doc: Doc DATE functions

2019-05-02 Thread Alex Rodoni (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni updated IMPALA-7375:

Target Version: Impala 3.3.0

> Impala Doc: Doc DATE functions
> --
>
> Key: IMPALA-7375
> URL: https://issues.apache.org/jira/browse/IMPALA-7375
> Project: IMPALA
> Issue Type: Sub-task
> Components: Docs
> Reporter: Alex Rodoni
> Assignee: Alex Rodoni
> Priority: Major
> Labels: future_release_doc, in_33
>







[jira] [Commented] (IMPALA-8459) Cannot delete impala/kudu table if backing kudu table dropped with local catalog

2019-05-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16832087#comment-16832087
 ] 

ASF subversion and git services commented on IMPALA-8459:
-

Commit 79c5f87565467074697a7d98e01c9742f7228991 in impala's branch 
refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=79c5f87 ]

IMPALA-8121: part 1: some test fixes for catalog v2

This fixes some test issues encountered when running the
tests against a cluster with catalog V2 enabled, meaning
the local catalog with HMS notifications enabled. More
fixes are to come but I preferred to do them in smaller
batches as they're ready.

Test fixes:
* Detect whether catalog v2 features are enabled from web UI.
* test_describe_db waits for metadata event processor to pick up new
  database and doesn't need to change database owner
* TestWebPage.test_catalog handles an expected exception from
  the /catalog_objects page on the impalad.
* test_pull_stats_profile: feature disabled with local catalog
* test_hms_service_dies: invalidate the test table instead of
  the whole catalog.
* test_compute_stats: Avro schema resolution behaviour changed
  with local catalog - IMPALA-7308

Some remaining issues:
* IMPALA-8458
* IMPALA-8459
* IMPALA-7131 (data sources)
* getTables() doesn't return comment

Change-Id: I060f2076da74fbbe92ae26dbad51f09a3bd20169
Reviewed-on: http://gerrit.cloudera.org:8080/13122
Reviewed-by: Todd Lipcon 
Tested-by: Impala Public Jenkins 


> Cannot delete impala/kudu table if backing kudu table dropped with local 
> catalog
> 
>
> Key: IMPALA-8459
> URL: https://issues.apache.org/jira/browse/IMPALA-8459
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Affects Versions: Impala 3.3.0
> Reporter: Tim Armstrong
> Assignee: Todd Lipcon
> Priority: Critical
> Labels: kudu
>
> test_delete_external_kudu_table and test_delete_managed_kudu_table fail with 
> local catalog, e.g. with:
> {noformat}
> E   HiveServer2Error: LocalCatalogException: Error opening Kudu table 
> 'testimpalakuduintegration_1715_p3r46w.ogslbjblgv', Kudu error: the table 
> does not exist: table_name: "testimpalakuduintegration_1715_p3r46w.ogslbjblgv"
> {noformat}






[jira] [Commented] (IMPALA-7131) Support external data sources without catalogd

2019-05-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16832088#comment-16832088
 ] 

ASF subversion and git services commented on IMPALA-7131:
-

Commit 79c5f87565467074697a7d98e01c9742f7228991 in impala's branch 
refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=79c5f87 ]

IMPALA-8121: part 1: some test fixes for catalog v2

This fixes some test issues encountered when running the
tests against a cluster with catalog V2 enabled, meaning
the local catalog with HMS notifications enabled. More
fixes are to come but I preferred to do them in smaller
batches as they're ready.

Test fixes:
* Detect whether catalog v2 features are enabled from web UI.
* test_describe_db waits for metadata event processor to pick up new
  database and doesn't need to change database owner
* TestWebPage.test_catalog handles an expected exception from
  the /catalog_objects page on the impalad.
* test_pull_stats_profile: feature disabled with local catalog
* test_hms_service_dies: invalidate the test table instead of
  the whole catalog.
* test_compute_stats: Avro schema resolution behaviour changed
  with local catalog - IMPALA-7308

Some remaining issues:
* IMPALA-8458
* IMPALA-8459
* IMPALA-7131 (data sources)
* getTables() doesn't return comment

Change-Id: I060f2076da74fbbe92ae26dbad51f09a3bd20169
Reviewed-on: http://gerrit.cloudera.org:8080/13122
Reviewed-by: Todd Lipcon 
Tested-by: Impala Public Jenkins 


> Support external data sources without catalogd
> --
>
> Key: IMPALA-7131
> URL: https://issues.apache.org/jira/browse/IMPALA-7131
> Project: IMPALA
> Issue Type: Sub-task
> Components: Catalog, Frontend
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Priority: Minor
>
> Currently it seems that external data sources are not persisted except in 
> memory on the catalogd. This means that it will be somewhat more difficult to 
> support this feature in the design of impalad without a catalogd.
> This JIRA is to eventually figure out a way to support this feature -- either 
> by supporting in-memory on a per-impalad basis, or perhaps by figuring out a 
> way to register them persistently in a file system directory, etc.






[jira] [Commented] (IMPALA-7308) Support Avro tables in LocalCatalog

2019-05-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16832085#comment-16832085
 ] 

ASF subversion and git services commented on IMPALA-7308:
-

Commit 79c5f87565467074697a7d98e01c9742f7228991 in impala's branch 
refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=79c5f87 ]

IMPALA-8121: part 1: some test fixes for catalog v2

This fixes some test issues encountered when running the
tests against a cluster with catalog V2 enabled, meaning
the local catalog with HMS notifications enabled. More
fixes are to come but I preferred to do them in smaller
batches as they're ready.

Test fixes:
* Detect whether catalog v2 features are enabled from web UI.
* test_describe_db waits for metadata event processor to pick up new
  database and doesn't need to change database owner
* TestWebPage.test_catalog handles an expected exception from
  the /catalog_objects page on the impalad.
* test_pull_stats_profile: feature disabled with local catalog
* test_hms_service_dies: invalidate the test table instead of
  the whole catalog.
* test_compute_stats: Avro schema resolution behaviour changed
  with local catalog - IMPALA-7308

Some remaining issues:
* IMPALA-8458
* IMPALA-8459
* IMPALA-7131 (data sources)
* getTables() doesn't return comment

Change-Id: I060f2076da74fbbe92ae26dbad51f09a3bd20169
Reviewed-on: http://gerrit.cloudera.org:8080/13122
Reviewed-by: Todd Lipcon 
Tested-by: Impala Public Jenkins 


> Support Avro tables in LocalCatalog
> ---
>
> Key: IMPALA-7308
> URL: https://issues.apache.org/jira/browse/IMPALA-7308
> Project: IMPALA
> Issue Type: Sub-task
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Priority: Major
> Fix For: Impala 3.1.0
>
>







[jira] [Updated] (IMPALA-7374) Impala Doc: Doc DATE type

2019-05-02 Thread Alex Rodoni (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni updated IMPALA-7374:

Target Version: Impala 3.3.0

> Impala Doc: Doc DATE type
> -
>
> Key: IMPALA-7374
> URL: https://issues.apache.org/jira/browse/IMPALA-7374
> Project: IMPALA
> Issue Type: Sub-task
> Components: Docs
> Reporter: Alex Rodoni
> Assignee: Alex Rodoni
> Priority: Major
> Labels: future_release_doc, in_33
>







[jira] [Commented] (IMPALA-8458) Can't set numNull/maxSize/avgSize column stats with local catalog without also setting NDV

2019-05-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16832086#comment-16832086
 ] 

ASF subversion and git services commented on IMPALA-8458:
-

Commit 79c5f87565467074697a7d98e01c9742f7228991 in impala's branch 
refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=79c5f87 ]

IMPALA-8121: part 1: some test fixes for catalog v2

This fixes some test issues encountered when running the
tests against a cluster with catalog V2 enabled, meaning
the local catalog with HMS notifications enabled. More
fixes are to come but I preferred to do them in smaller
batches as they're ready.

Test fixes:
* Detect whether catalog v2 features are enabled from web UI.
* test_describe_db waits for metadata event processor to pick up new
  database and doesn't need to change database owner
* TestWebPage.test_catalog handles an expected exception from
  the /catalog_objects page on the impalad.
* test_pull_stats_profile: feature disabled with local catalog
* test_hms_service_dies: invalidate the test table instead of
  the whole catalog.
* test_compute_stats: Avro schema resolution behaviour changed
  with local catalog - IMPALA-7308

Some remaining issues:
* IMPALA-8458
* IMPALA-8459
* IMPALA-7131 (data sources)
* getTables() doesn't return comment

Change-Id: I060f2076da74fbbe92ae26dbad51f09a3bd20169
Reviewed-on: http://gerrit.cloudera.org:8080/13122
Reviewed-by: Todd Lipcon 
Tested-by: Impala Public Jenkins 


> Can't set numNull/maxSize/avgSize column stats with local catalog without 
> also setting NDV
> --
>
> Key: IMPALA-8458
> URL: https://issues.apache.org/jira/browse/IMPALA-8458
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Affects Versions: Impala 3.3.0
> Reporter: Tim Armstrong
> Assignee: Todd Lipcon
> Priority: Critical
>
> Repro:
> {noformat}
> [tarmstrong-box2.ca.cloudera.com:21000] default> create table test_stats2(s string);
> +-------------------------+
> | summary                 |
> +-------------------------+
> | Table has been created. |
> +-------------------------+
> Fetched 1 row(s) in 0.36s
> [tarmstrong-box2.ca.cloudera.com:21000] default> show column stats test_stats2;
> +--------+--------+------------------+--------+----------+----------+
> | Column | Type   | #Distinct Values | #Nulls | Max Size | Avg Size |
> +--------+--------+------------------+--------+----------+----------+
> | s      | STRING | -1               | -1     | -1       | -1       |
> +--------+--------+------------------+--------+----------+----------+
> Fetched 1 row(s) in 0.02s
> [tarmstrong-box2.ca.cloudera.com:21000] default> alter table test_stats2 set column stats s('avgSize'='1234');
> +-----------------------------------------+
> | summary                                 |
> +-----------------------------------------+
> | Updated 0 partition(s) and 1 column(s). |
> +-----------------------------------------+
> Fetched 1 row(s) in 0.14s
> [tarmstrong-box2.ca.cloudera.com:21000] default> show column stats test_stats2;
> +--------+--------+------------------+--------+----------+----------+
> | Column | Type   | #Distinct Values | #Nulls | Max Size | Avg Size |
> +--------+--------+------------------+--------+----------+----------+
> | s      | STRING | -1               | -1     | -1       | -1       |
> +--------+--------+------------------+--------+----------+----------+
> Fetched 1 row(s) in 0.02s
> [tarmstrong-box2.ca.cloudera.com:21000] default> alter table test_stats2 set column stats s('maxSize'='1234');
> +-----------------------------------------+
> | summary                                 |
> +-----------------------------------------+
> | Updated 0 partition(s) and 1 column(s). |
> +-----------------------------------------+
> Fetched 1 row(s) in 0.10s
> [tarmstrong-box2.ca.cloudera.com:21000] default> show column stats test_stats2;
> +--------+--------+------------------+--------+----------+----------+
> | Column | Type   | #Distinct Values | #Nulls | Max Size | Avg Size |
> +--------+--------+------------------+--------+----------+----------+
> | s      | STRING | -1               | -1     | -1       | -1       |
> +--------+--------+------------------+--------+----------+----------+
> Fetched 1 row(s) in 0.02s
> [tarmstrong-box2.ca.cloudera.com:21000] default> invalidate metadata test_stats2;
> Fetched 0 row(s) in 0.03s
> [tarmstrong-box2.ca.cloudera.com:21000] default> show column stats test_stats2;
> Query: show column stats test_stats2
> +--------+--------+------------------+--------+----------+----------+
> | Column | Type   | #Distinct Values | #Nulls | Max Size | Avg Size |
> 

[jira] [Commented] (IMPALA-8269) Clean up authorization test package structure

2019-05-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16832083#comment-16832083
 ] 

ASF subversion and git services commented on IMPALA-8269:
-

Commit d29d68c90d557353f7e312ba99747a6adca23820 in impala's branch 
refs/heads/master from Fredy Wijaya
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=d29d68c ]

IMPALA-8269: Refactor authorization test package structure

This patch refactors authorization test package structure by moving it
to the authorization test package. This patch also renames
CustomClusterResourceAuthorizationProvider to
TestSentryResourceAuthorizationProvider since it is a class that is
specific to Sentry and is only used for testing.

Testing:
- Ran all FE tests
- Ran all E2E authorization tests

Change-Id: I525ff71f63d7c306d82b4c111f98ff327e4a07b3
Reviewed-on: http://gerrit.cloudera.org:8080/13208
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Clean up authorization test package structure
> -
>
> Key: IMPALA-8269
> URL: https://issues.apache.org/jira/browse/IMPALA-8269
> Project: IMPALA
> Issue Type: Improvement
> Components: Infrastructure
> Reporter: Fredy Wijaya
> Assignee: Fredy Wijaya
> Priority: Minor
> Labels: ramp-up
> Fix For: Impala 3.3.0
>
>
> The task is to clean up the authorization test package structure.
> 1. Move AuthorizationTest.java and AuthorizationStmtTest.java to the 
> authorization test package.
> 2. Rename CustomClusterGroupMapper and 
> CustomClusterResourceAuthorizationProvider to TestSentryGroupMapper and 
> TestSentryResourceAuthorizationProvider since those two classes aren't 
> specific to custom cluster anymore.
> 3. Move those two files into `testutil` instead since they're not actually 
> test classes.






[jira] [Commented] (IMPALA-8121) Pick better default flags in containers

2019-05-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16832084#comment-16832084
 ] 

ASF subversion and git services commented on IMPALA-8121:
-

Commit 79c5f87565467074697a7d98e01c9742f7228991 in impala's branch 
refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=79c5f87 ]

IMPALA-8121: part 1: some test fixes for catalog v2

This fixes some test issues encountered when running the
tests against a cluster with catalog V2 enabled, meaning
the local catalog with HMS notifications enabled. More
fixes are to come but I preferred to do them in smaller
batches as they're ready.

Test fixes:
* Detect whether catalog v2 features are enabled from web UI.
* test_describe_db waits for metadata event processor to pick up new
  database and doesn't need to change database owner
* TestWebPage.test_catalog handles an expected exception from
  the /catalog_objects page on the impalad.
* test_pull_stats_profile: feature disabled with local catalog
* test_hms_service_dies: invalidate the test table instead of
  the whole catalog.
* test_compute_stats: Avro schema resolution behaviour changed
  with local catalog - IMPALA-7308

Some remaining issues:
* IMPALA-8458
* IMPALA-8459
* IMPALA-7131 (data sources)
* getTables() doesn't return comment

Change-Id: I060f2076da74fbbe92ae26dbad51f09a3bd20169
Reviewed-on: http://gerrit.cloudera.org:8080/13122
Reviewed-by: Todd Lipcon 
Tested-by: Impala Public Jenkins 


> Pick better default flags in containers
> ---
>
> Key: IMPALA-8121
> URL: https://issues.apache.org/jira/browse/IMPALA-8121
> Project: IMPALA
> Issue Type: Sub-task
> Components: Infrastructure
> Reporter: Tim Armstrong
> Assignee: Tim Armstrong
> Priority: Major
> Labels: docker
>
> Some new Impala features are complete but disabled by default because they 
> are not strictly better than the previous behaviour, e.g. the various 
> metadata improvements. Containerised Impala deployments are likely to be new, 
> so it is easier to make potentially disruptive changes to defaults now.
> h2. Metadata V2 Flags
> Catalogd:
> --catalog_topic_mode=minimal
> Impalad:
> --use_local_catalog=true
> I think we *may* also want to configure automatic invalidations of tables 
> from the catalogd so that changes to the underlying storage cluster are 
> eventually reflected in the compute cluster. There's a better solution in the 
> pipeline that uses HMS notifications 
> (https://issues.apache.org/jira/browse/IMPALA-7970), but in the meantime 
> invalidation is time-based.
> Catalogd:
> --invalidate_tables_timeout_s=3600
> Once IMPALA-7970 goes in, we probably also want automatic invalidation by 
> default (TBD - how to handle older HMS that doesn't support those APIs).
> Catalogd:
> --hms_event_polling_interval_s=???
> We probably want to enable HDFS preads for remote reads: -use_hdfs_pread
> We may want to have an I/O cache enabled






[jira] [Updated] (IMPALA-7374) Impala Doc: Doc DATE type

2019-05-02 Thread Alex Rodoni (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni updated IMPALA-7374:

Labels: future_release_doc in_33  (was: future_release_doc)

> Impala Doc: Doc DATE type
> -
>
> Key: IMPALA-7374
> URL: https://issues.apache.org/jira/browse/IMPALA-7374
> Project: IMPALA
> Issue Type: Sub-task
> Components: Docs
> Reporter: Alex Rodoni
> Assignee: Alex Rodoni
> Priority: Major
> Labels: future_release_doc, in_33
>







[jira] [Commented] (IMPALA-7918) Remove support for authorization policy file

2019-05-02 Thread Alex Rodoni (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16832082#comment-16832082
 ] 

Alex Rodoni commented on IMPALA-7918:
-

[~fredyw] [~anobis] Is it safe to remove the content about policy files from 
the Impala docs?

> Remove support for authorization policy file
> 
>
> Key: IMPALA-7918
> URL: https://issues.apache.org/jira/browse/IMPALA-7918
> Project: IMPALA
> Issue Type: Sub-task
> Components: Catalog, Frontend
> Affects Versions: Impala 3.2.0
> Reporter: Fredy Wijaya
> Assignee: Austin Nobis
> Priority: Critical
> Fix For: Impala 3.3.0
>
>
> Support for the authorization policy file has been deprecated in Impala, and it 
> does not work with object ownership. Furthermore, the authorization policy file 
> is very specific to Sentry. Continuing to support it will make it difficult to 
> create a generic authorization framework in Impala. Hence, this task involves 
> removing support for the authorization policy file.






[jira] [Updated] (IMPALA-8364) Impala Doc: Remove support for authorization policy file

2019-05-02 Thread Alex Rodoni (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni updated IMPALA-8364:

Labels: future_release_doc in_33  (was: future_release_doc)

> Impala Doc: Remove support for authorization policy file
> 
>
> Key: IMPALA-8364
> URL: https://issues.apache.org/jira/browse/IMPALA-8364
> Project: IMPALA
> Issue Type: Sub-task
> Components: Docs
> Reporter: Alex Rodoni
> Assignee: Alex Rodoni
> Priority: Critical
> Labels: future_release_doc, in_33
>







[jira] [Updated] (IMPALA-8477) Impala Doc: Doc SHOW GRANT GROUP

2019-05-02 Thread Alex Rodoni (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni updated IMPALA-8477:

Description: https://gerrit.cloudera.org/#/c/13220/

> Impala Doc: Doc SHOW GRANT GROUP
> 
>
> Key: IMPALA-8477
> URL: https://issues.apache.org/jira/browse/IMPALA-8477
> Project: IMPALA
> Issue Type: Sub-task
> Components: Docs
> Reporter: Alex Rodoni
> Assignee: Alex Rodoni
> Priority: Major
> Labels: future_release_doc, in_33
>
> https://gerrit.cloudera.org/#/c/13220/






[jira] [Work started] (IMPALA-8479) REFRESH may fail if table metadata mutates during load

2019-05-02 Thread Vihang Karajgaonkar (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-8479 started by Vihang Karajgaonkar.
---
> REFRESH may fail if table metadata mutates during load
> --
>
> Key: IMPALA-8479
> URL: https://issues.apache.org/jira/browse/IMPALA-8479
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Affects Versions: Impala 3.3.0
> Reporter: Vincent Tran
> Assignee: Vihang Karajgaonkar
> Priority: Major
>
> Reproduction:
> 1) Create a partitioned table
> {noformat}
> create table t1 (c1 int) partitioned by (part int);
> {noformat}
> 2) Generate a decent number of partitions
> {noformat}
> for i in {1..5000}; do impala-shell.sh -q "insert into t1 partition(part=$i) values ($i)"; done
> {noformat}
> 3) Start an INVALIDATE METADATA; REFRESH loop
> {noformat}
> while :; do impala-shell.sh -q "invalidate metadata; refresh t1;"; done
> {noformat}
> 4) Start dropping partitions in Hive.
> {noformat}
> for i in {1..5000}; do hive -e "alter table t1 drop partition (part=$i)"; done
> {noformat}
> Eventually, when the "REFRESH" and "ALTER TABLE ... DROP ..." coincide in 
> HMS, catalogd will encounter this TableLoadingException (as seen on trunk):
> {noformat}
> I0501 14:06:14.552676 38522 TableLoadingMgr.java:70] Loading metadata for 
> table: vt.t1
> I0501 14:06:14.552776 38927 TableLoader.java:61] Loading metadata for: vt.t1
> I0501 14:06:14.552850 38522 TableLoadingMgr.java:72] Remaining items in 
> queue: 0. Loads in progress: 1
> I0501 14:06:14.566756 38927 HdfsTable.java:941] Fetching partition metadata 
> from the Metastore: vt.t1
> I0501 14:06:16.305446 38927 HdfsTable.java:945] Fetched partition metadata 
> from the Metastore: vt.t1
> I0501 14:06:16.367607 38927 TableLoader.java:101] Loaded metadata for: vt.t1 
> (1814ms)
> I0501 14:06:16.368847 38522 jni-util.cc:256] 
> org.apache.impala.catalog.TableLoadingException: Failed to load metadata for 
> table: vt.t1
> at org.apache.impala.catalog.HdfsTable.load(HdfsTable.java:956)
> at org.apache.impala.catalog.HdfsTable.load(HdfsTable.java:877)
> at org.apache.impala.catalog.TableLoader.load(TableLoader.java:84)
> at 
> org.apache.impala.catalog.TableLoadingMgr$2.call(TableLoadingMgr.java:241)
> at 
> org.apache.impala.catalog.TableLoadingMgr$2.call(TableLoadingMgr.java:238)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalArgumentException: Cannot parse partition values 
> '[]' for table vt.t1: expected %d values but got %d [1, 0]
> at 
> com.google.common.base.Preconditions.checkArgument(Preconditions.java:119)
> at 
> org.apache.impala.catalog.FeCatalogUtils.parsePartitionKeyValues(FeCatalogUtils.java:224)
> at 
> org.apache.impala.catalog.HdfsTable.createPartition(HdfsTable.java:698)
> at 
> org.apache.impala.catalog.HdfsTable.loadAllPartitions(HdfsTable.java:532)
> at org.apache.impala.catalog.HdfsTable.load(HdfsTable.java:946)
> ... 8 more
> I0501 14:06:16.440848 38522 status.cc:124] TableLoadingException: Failed to 
> load metadata for table: vt.t1
> CAUSED BY: IllegalArgumentException: Cannot parse partition values '[]' for 
> table vt.t1: expected %d values but got %d [1, 0]
> @  0x1a91ff0  impala::Status::Status()
> @  0x221518c  impala::JniUtil::GetJniExceptionMsg()
> @  0x1a7a369  impala::JniCall::Call<>()
> @  0x1a78475  impala::JniUtil::CallJniMethod<>()
> @  0x1a7663c  impala::Catalog::ResetMetadata()
> @  0x1a4e5df  CatalogServiceThriftIf::ResetMetadata()
> @  0x1aea49d  
> impala::CatalogServiceProcessor::process_ResetMetadata()
> @  0x1ae8b9b  impala::CatalogServiceProcessor::dispatchCall()
> @  0x1a36feb  apache::thrift::TDispatchProcessor::process()
> @  0x1e8e8a0  
> apache::thrift::server::TAcceptQueueServer::Task::run()
> @  0x1e8509e  impala::ThriftThread::RunRunnable()
> @  0x1e867c4  boost::_mfi::mf2<>::operator()()
> @  0x1e8665a  boost::_bi::list3<>::operator()<>()
> @  0x1e863a6  boost::_bi::bind_t<>::operator()()
> @  0x1e862b9  
> boost::detail::function::void_function_obj_invoker0<>::invoke()
> @  0x1daa4eb  boost::function0<>::operator()()
> @  0x2286cd0  impala::Thread::SuperviseThread()
> @  0x228f054  boost::_bi::list5<>::operator()<>()
> @  0x228ef78  

[jira] [Created] (IMPALA-8485) References to deprecated feature authorization policy file need to be removed

2019-05-02 Thread Austin Nobis (JIRA)
Austin Nobis created IMPALA-8485:


Summary: References to deprecated feature authorization policy file need to be removed
Key: IMPALA-8485
URL: https://issues.apache.org/jira/browse/IMPALA-8485
Project: IMPALA
Issue Type: Bug
Components: Infrastructure
Reporter: Austin Nobis


Running the command *git grep authz-policy* produces the following output:


bin/create-test-configuration.sh:generate_config authz-policy.ini.template authz-policy.ini
fe/.gitignore:src/test/resources/authz-policy.ini
tests/authorization/test_authorization.py:AUTH_POLICY_FILE = "%s/authz-policy.ini" % WAREHOUSE

These references to *authz-policy.ini* should be cleaned up, as the 
authorization policy file feature is deprecated as of *IMPALA-7918*.
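
The same kind of check could be scripted so that a cleanup patch can confirm nothing is left behind. The sketch below is a hypothetical helper, not part of the Impala repository: `find_references` takes an in-memory mapping of path to file text (an assumption made so that it runs without a checkout) and reports every line that mentions the search string, roughly the way `git grep` does.

```python
def find_references(files, needle="authz-policy"):
    """Return (path, line_number, line) for every line containing needle.

    files is a mapping of path -> file contents; reading the files from a
    real checkout is left to the caller.
    """
    hits = []
    for path in sorted(files):
        for number, line in enumerate(files[path].splitlines(), start=1):
            if needle in line:
                hits.append((path, number, line.strip()))
    return hits


if __name__ == "__main__":
    sample = {
        "fe/.gitignore": "target\nsrc/test/resources/authz-policy.ini\n",
        "README.md": "No policy files mentioned here.\n",
    }
    for path, number, line in find_references(sample):
        print("{0}:{1}:{2}".format(path, number, line))
```

An empty result after the cleanup would indicate that no references remain in the files that were scanned.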






[jira] [Commented] (IMPALA-8280) Implement SHOW GRANT USER

2019-05-02 Thread Alex Rodoni (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16832047#comment-16832047
 ] 

Alex Rodoni commented on IMPALA-8280:
-

[~fredyw] [~anobis] SHOW GRANT USER/ROLE/GROUP ON COLUMN looks new for 3.3. Is 
it supported for Ranger and Sentry?

> Implement SHOW GRANT USER 
> 
>
> Key: IMPALA-8280
> URL: https://issues.apache.org/jira/browse/IMPALA-8280
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Catalog, Frontend
>Reporter: Fredy Wijaya
>Assignee: Austin Nobis
>Priority: Critical
> Fix For: Impala 3.3.0
>
>
> Syntax:
> {noformat}
> SHOW GRANT USER  [ON ]
> {noformat}
> The command shows the list of privileges for a given user, with an optional 
> ON clause.






[jira] [Commented] (IMPALA-8408) Failure to load metadata for .deflate compressed text files

2019-05-02 Thread Tim Armstrong (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16832033#comment-16832033
 ] 

Tim Armstrong commented on IMPALA-8408:
---

For future reference, I think the crux of the fix was deleting this code here: 
https://gerrit.cloudera.org/#/c/10165/13/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java

{code}
// TODO: instead of raising an exception, we should consider marking this partition
// invalid and moving on, so that table loading won't fail and user can query other
// partitions.
for (FileDescriptor fileDescriptor: fileDescriptors_) {
  StringBuilder errorMsg = new StringBuilder();
  if (!getInputFormatDescriptor().getFileFormat().isFileCompressionTypeSupported(
      fileDescriptor.getFileName(), errorMsg)) {
    throw new RuntimeException(errorMsg.toString());
  }
}
{code}
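
For illustration only, the alternative hinted at in the TODO above (mark the partition invalid and move on, rather than failing the whole table load) could look roughly like the Python sketch below. None of this is Impala code: Partition, load_partitions and no_deflate are hypothetical names, and the predicate simply mimics the old check by rejecting the suffix from this bug report.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    name: str
    file_names: list
    error: str = ""  # empty means the partition loaded cleanly


def load_partitions(partitions, is_supported):
    """Split partitions into (valid, invalid) instead of failing on the first bad file."""
    valid, invalid = [], []
    for part in partitions:
        bad = [f for f in part.file_names if not is_supported(f)]
        if bad:
            # Record the problem on the partition and keep going.
            part.error = "unsupported file(s): " + ", ".join(bad)
            invalid.append(part)
        else:
            valid.append(part)
    return valid, invalid


def no_deflate(file_name):
    # Illustrative predicate only: reject the suffix from this bug report.
    return not file_name.endswith(".deflate")
```

With a structure like this, a single unreadable partition would surface as an entry in the invalid list while the other partitions stay queryable.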


> Failure to load metadata for .deflate compressed text files
> ---
>
> Key: IMPALA-8408
> URL: https://issues.apache.org/jira/browse/IMPALA-8408
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.9.0, Impala 2.10.0, Impala 2.11.0, Impala 2.12.0
>Reporter: Balazs Jeszenszky
>Priority: Major
> Fix For: Impala 2.13.0
>
>
> While the metadata is loaded successfully on the catalogd side, it can't be 
> applied on the impalads:
> {code}
> I1005 14:07:25.045325 27076 Frontend.java:962] Analyzing query: describe test1
> I1005 14:07:25.045603 27076 FeSupport.java:274] Requesting prioritized load of table(s): default.test1
> ...
> E1005 14:07:30.871942 19685 ImpaladCatalog.java:201] Error adding catalog object: Expected compressed text file with {.lzo,.gzip,.snappy,.bz2} suffix: 00_0.deflate
> Java exception follows:
> java.lang.RuntimeException: Expected compressed text file with {.lzo,.gzip,.snappy,.bz2} suffix: 00_0.deflate
>     at org.apache.impala.catalog.HdfsPartition.<init>(HdfsPartition.java:772)
>     at org.apache.impala.catalog.HdfsPartition.fromThrift(HdfsPartition.java:884)
>     at org.apache.impala.catalog.HdfsTable.loadFromThrift(HdfsTable.java:1678)
>     at org.apache.impala.catalog.Table.fromThrift(Table.java:311)
>     at org.apache.impala.catalog.ImpaladCatalog.addTable(ImpaladCatalog.java:403)
>     at org.apache.impala.catalog.ImpaladCatalog.addCatalogObject(ImpaladCatalog.java:292)
>     at org.apache.impala.catalog.ImpaladCatalog.updateCatalog(ImpaladCatalog.java:199)
>     at org.apache.impala.service.Frontend.updateCatalogCache(Frontend.java:223)
>     at org.apache.impala.service.JniFrontend.updateCatalogCache(JniFrontend.java:175)
> {code}
> This results in the affected queries hanging indefinitely in planning phase.






[jira] [Commented] (IMPALA-8408) Failure to load metadata for .deflate compressed text files

2019-05-02 Thread Tim Armstrong (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16832037#comment-16832037
 ] 

Tim Armstrong commented on IMPALA-8408:
---

I think we also need the test added here: 
https://gerrit.cloudera.org/#/c/10165/13/tests/metadata/test_partition_metadata.py

> Failure to load metadata for .deflate compressed text files
> ---
>
> Key: IMPALA-8408
> URL: https://issues.apache.org/jira/browse/IMPALA-8408
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.9.0, Impala 2.10.0, Impala 2.11.0, Impala 2.12.0
>Reporter: Balazs Jeszenszky
>Priority: Major
> Fix For: Impala 2.13.0
>
>
> While the metadata is loaded successfully on the catalogd side, it can't be 
> applied on the impalads:
> {code}
> I1005 14:07:25.045325 27076 Frontend.java:962] Analyzing query: describe test1
> I1005 14:07:25.045603 27076 FeSupport.java:274] Requesting prioritized load of table(s): default.test1
> ...
> E1005 14:07:30.871942 19685 ImpaladCatalog.java:201] Error adding catalog object: Expected compressed text file with {.lzo,.gzip,.snappy,.bz2} suffix: 00_0.deflate
> Java exception follows:
> java.lang.RuntimeException: Expected compressed text file with {.lzo,.gzip,.snappy,.bz2} suffix: 00_0.deflate
>     at org.apache.impala.catalog.HdfsPartition.<init>(HdfsPartition.java:772)
>     at org.apache.impala.catalog.HdfsPartition.fromThrift(HdfsPartition.java:884)
>     at org.apache.impala.catalog.HdfsTable.loadFromThrift(HdfsTable.java:1678)
>     at org.apache.impala.catalog.Table.fromThrift(Table.java:311)
>     at org.apache.impala.catalog.ImpaladCatalog.addTable(ImpaladCatalog.java:403)
>     at org.apache.impala.catalog.ImpaladCatalog.addCatalogObject(ImpaladCatalog.java:292)
>     at org.apache.impala.catalog.ImpaladCatalog.updateCatalog(ImpaladCatalog.java:199)
>     at org.apache.impala.service.Frontend.updateCatalogCache(Frontend.java:223)
>     at org.apache.impala.service.JniFrontend.updateCatalogCache(JniFrontend.java:175)
> {code}
> This results in the affected queries hanging indefinitely in planning phase.






[jira] [Created] (IMPALA-8484) Add support to run queries on disjoint executor groups

2019-05-02 Thread Lars Volker (JIRA)
Lars Volker created IMPALA-8484:
---

 Summary: Add support to run queries on disjoint executor groups
 Key: IMPALA-8484
 URL: https://issues.apache.org/jira/browse/IMPALA-8484
 Project: IMPALA
  Issue Type: New Feature
Affects Versions: Impala 3.3.0
Reporter: Lars Volker







[jira] [Reopened] (IMPALA-7977) Impala Doc: Doc the support for fine-grained updates at partition level

2019-05-02 Thread Alex Rodoni (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni reopened IMPALA-7977:
-

> Impala Doc: Doc the support for fine-grained updates at partition level
> ---
>
> Key: IMPALA-7977
> URL: https://issues.apache.org/jira/browse/IMPALA-7977
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Docs
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
>







[jira] [Updated] (IMPALA-7977) Impala Doc: Doc the support for fine-grained updates at partition level

2019-05-02 Thread Alex Rodoni (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni updated IMPALA-7977:

Target Version: Impala 3.3.0  (was: Impala 3.2.0)

> Impala Doc: Doc the support for fine-grained updates at partition level
> ---
>
> Key: IMPALA-7977
> URL: https://issues.apache.org/jira/browse/IMPALA-7977
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Docs
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
>







[jira] [Updated] (IMPALA-7977) Impala Doc: Doc the support for fine-grained updates at partition level

2019-05-02 Thread Alex Rodoni (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni updated IMPALA-7977:

Labels: future_release_doc in_32  (was: )

> Impala Doc: Doc the support for fine-grained updates at partition level
> ---
>
> Key: IMPALA-7977
> URL: https://issues.apache.org/jira/browse/IMPALA-7977
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Docs
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
>  Labels: future_release_doc, in_32
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-4736) Help string for 'minidump_path' flag does not explain SIGUSR1 behavior

2019-05-02 Thread Csaba Ringhofer (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-4736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Csaba Ringhofer updated IMPALA-4736:

Fix Version/s: Impala 2.11.0

> Help string for 'minidump_path' flag does not explain SIGUSR1 behavior
> --
>
> Key: IMPALA-4736
> URL: https://issues.apache.org/jira/browse/IMPALA-4736
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend, Docs
>Affects Versions: Impala 2.7.0
>Reporter: Lars Volker
>Assignee: Csaba Ringhofer
>Priority: Major
>  Labels: breakpad
> Fix For: Impala 2.11.0
>
>
> The help string for 'minidump_path', which is defined in 
> {{be/src/common/global-flags.cc}} should explain that minidumps are also 
> written when sending a {{SIGUSR1}} signal to the daemons.






[jira] [Updated] (IMPALA-7971) Add support to detect insert events from Impala

2019-05-02 Thread Anurag Mantripragada (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada updated IMPALA-7971:
-
Fix Version/s: Impala 3.3.0

Added fix version.

> Add support to detect insert events from Impala
> ---
>
> Key: IMPALA-7971
> URL: https://issues.apache.org/jira/browse/IMPALA-7971
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Assignee: Anurag Mantripragada
>Priority: Major
> Fix For: Impala 3.3.0
>
>
> When data is inserted into existing tables and partitions, Catalog does not 
> issue any metastore API calls. Metastore provides an API called 
> {{fire_listener_event}} which can be used to add an {{INSERT_EVENT}} to the 
> metastore notification log. This event can be used by other Impala instances 
> to invalidate or update the file metadata information when data is inserted or 
> overwritten on a given table or partition.
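
As a rough mental model of the flow described above, the following Python sketch uses an in-memory list in place of the metastore notification log. It does not call any real metastore API; NotificationLog, CatalogFollower and their methods are hypothetical names chosen for this example, and only the INSERT_EVENT label comes from the description.

```python
import itertools


class NotificationLog:
    """Toy stand-in for the metastore notification log (append-only, ordered by id)."""

    def __init__(self):
        self._events = []
        self._ids = itertools.count(1)

    def fire_insert_event(self, db, table, partition=None):
        # Append one INSERT_EVENT record and return its id.
        event = {"id": next(self._ids), "type": "INSERT_EVENT",
                 "db": db, "table": table, "partition": partition}
        self._events.append(event)
        return event["id"]

    def events_after(self, last_seen_id):
        return [e for e in self._events if e["id"] > last_seen_id]


class CatalogFollower:
    """Toy stand-in for another Impala instance that polls the log."""

    def __init__(self, log):
        self.log = log
        self.last_seen_id = 0
        self.stale = set()  # (db, table, partition) entries needing a refresh

    def poll(self):
        for e in self.log.events_after(self.last_seen_id):
            if e["type"] == "INSERT_EVENT":
                self.stale.add((e["db"], e["table"], e["partition"]))
            self.last_seen_id = e["id"]
```

The writer side fires an event after each insert; each follower remembers the last event id it processed, so repeated polls only see new events.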






[jira] [Updated] (IMPALA-7973) Add support for fine-grained updates at partition level

2019-05-02 Thread Anurag Mantripragada (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada updated IMPALA-7973:
-
Fix Version/s: Impala 3.3.0

Added fix version.

> Add support for fine-grained updates at partition level
> ---
>
> Key: IMPALA-7973
> URL: https://issues.apache.org/jira/browse/IMPALA-7973
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Assignee: Anurag Mantripragada
>Priority: Major
> Fix For: Impala 3.3.0
>
>
> When data is inserted into a partition or a new partition is created in a 
> large table, we should not be invalidating the whole table. Instead it should 
> be possible to refresh/add/drop certain partitions on the table directly 
> based on the event information. This would help with the performance of 
> subsequent access to the table by avoiding reloading the large table.
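
The idea above can be sketched with a toy cache that applies each event to a single partition instead of reloading the whole table. This is a minimal Python illustration under assumed names: TableCache, on_event and the event type strings are hypothetical and do not come from the Impala code base.

```python
class TableCache:
    """Toy per-table metadata cache keyed by partition name."""

    def __init__(self, loader):
        self._loader = loader      # callable: partition name -> metadata
        self._partitions = {}
        self.loads = 0             # counts loader calls, for illustration

    def _load(self, name):
        self.loads += 1
        self._partitions[name] = self._loader(name)

    def load_all(self, names):
        for name in names:
            self._load(name)

    def on_event(self, event_type, partition):
        """Apply one event to a single partition instead of reloading the table."""
        if event_type in ("ADD_PARTITION", "INSERT"):
            self._load(partition)
        elif event_type == "DROP_PARTITION":
            self._partitions.pop(partition, None)
        else:
            raise ValueError("unhandled event type: " + event_type)

    def partitions(self):
        return sorted(self._partitions)
```

For a table with many partitions, an insert into one partition then costs one loader call rather than one per partition.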






[jira] [Updated] (IMPALA-6324) Support reading RLE-encoded boolean values in Parquet scanner

2019-05-02 Thread Csaba Ringhofer (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Csaba Ringhofer updated IMPALA-6324:

Fix Version/s: Impala 3.0
   Impala 2.12.0

> Support reading RLE-encoded boolean values in Parquet scanner
> -
>
> Key: IMPALA-6324
> URL: https://issues.apache.org/jira/browse/IMPALA-6324
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Csaba Ringhofer
>Priority: Major
>  Labels: parquet
> Fix For: Impala 3.0, Impala 2.12.0
>
>
> Per this discussion on the Parquet mailing list, RLE will become a valid 
> encoding for the boolean type in parquet. We should add support for reading 
> this.
> https://mail-archives.apache.org/mod_mbox/parquet-dev/201712.mbox/%3CCAJPUwMDbGgkS1WmN8OvuuA%3DQ%2BXd%2BOwLn2XZAu7CNGF1sMVZMJg%40mail.gmail.com%3E
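
For readers unfamiliar with the encoding, here is a minimal Python sketch of decoding booleans from Parquet's RLE/bit-packing hybrid layout at bit width 1. It is not the Impala scanner code; the function names are made up, and the sketch assumes the input is the bare run data, ignoring any length prefix or page framing.

```python
def read_uleb128(buf, pos):
    """Read an unsigned LEB128 varint; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7


def decode_rle_booleans(buf, num_values):
    """Decode num_values booleans from hybrid RLE / bit-packed runs (bit width 1)."""
    values = []
    pos = 0
    while len(values) < num_values:
        header, pos = read_uleb128(buf, pos)
        if header & 1:
            # Bit-packed run: (header >> 1) groups of 8 values, one byte per
            # group at bit width 1, least significant bit first.
            for _ in range(header >> 1):
                byte = buf[pos]
                pos += 1
                for bit in range(8):
                    values.append(bool((byte >> bit) & 1))
        else:
            # RLE run: the value is stored once and repeated (header >> 1) times.
            run_length = header >> 1
            value = bool(buf[pos] & 1)
            pos += 1
            values.extend([value] * run_length)
    # A trailing bit-packed group may carry padding bits; drop them.
    return values[:num_values]
```

A run header with the low bit clear means "repeat one value", while a set low bit means "read packed groups", which is why long runs of identical booleans compress well.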






[jira] [Updated] (IMPALA-6131) Track time of last statistics update in metadata

2019-05-02 Thread Csaba Ringhofer (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Csaba Ringhofer updated IMPALA-6131:

Fix Version/s: Impala 3.1.0

> Track time of last statistics update in metadata
> 
>
> Key: IMPALA-6131
> URL: https://issues.apache.org/jira/browse/IMPALA-6131
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend, Frontend
>Reporter: Lars Volker
>Assignee: Csaba Ringhofer
>Priority: Major
>  Labels: ramp-up
> Fix For: Impala 3.1.0
>
>
> Currently we (ab-)use {{transient_lastDdlTime}} to track the last update time 
> of statistics. Instead we should introduce a separate counter to track the 
> last update. With that we should also remove all occurrences of 
> {{catalog_.updateLastDdlTime()}} from {{CatalogOpExecutor}} and fall back to 
> Hive's default behavior.






[jira] [Commented] (IMPALA-7971) Add support to detect insert events from Impala

2019-05-02 Thread Tim Armstrong (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16832016#comment-16832016
 ] 

Tim Armstrong commented on IMPALA-7971:
---

[~anuragmantri] can you set the fix version please?

> Add support to detect insert events from Impala
> ---
>
> Key: IMPALA-7971
> URL: https://issues.apache.org/jira/browse/IMPALA-7971
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Assignee: Anurag Mantripragada
>Priority: Major
>
> When data is inserted into existing tables and partitions, Catalog does not 
> issue any metastore API calls. Metastore provides an API called 
> {{fire_listener_event}} which can be used to add an {{INSERT_EVENT}} to the 
> metastore notification log. This event can be used by other Impala instances 
> to invalidate or update the file metadata information when data is inserted or 
> overwritten on a given table or partition.






[jira] [Commented] (IMPALA-7973) Add support for fine-grained updates at partition level

2019-05-02 Thread Tim Armstrong (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16832014#comment-16832014
 ] 

Tim Armstrong commented on IMPALA-7973:
---

[~anuragmantri] can you set the fix version please?

> Add support for fine-grained updates at partition level
> ---
>
> Key: IMPALA-7973
> URL: https://issues.apache.org/jira/browse/IMPALA-7973
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Assignee: Anurag Mantripragada
>Priority: Major
>
> When data is inserted into a partition or a new partition is created in a 
> large table, we should not be invalidating the whole table. Instead it should 
> be possible to refresh/add/drop certain partitions on the table directly 
> based on the event information. This would help with the performance of 
> subsequent access to the table by avoiding reloading the large table.






[jira] [Updated] (IMPALA-7521) CLONE - Speed up sub-second unix time->TimestampValue conversions

2019-05-02 Thread Csaba Ringhofer (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Csaba Ringhofer updated IMPALA-7521:

Fix Version/s: Impala 3.1.0

> CLONE - Speed up sub-second unix time->TimestampValue conversions
> -
>
> Key: IMPALA-7521
> URL: https://issues.apache.org/jira/browse/IMPALA-7521
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Csaba Ringhofer
>Assignee: Csaba Ringhofer
>Priority: Major
>  Labels: performance, timestamp
> Fix For: Impala 3.1.0
>
>
> Currently Impala converts from sub-second unix time to TimestampValue (which 
> is split into date_ and time_, similarly to boost::posix_time::ptime) by first 
> splitting the input into seconds and sub-seconds parts, converting the seconds 
> part with boost::posix_time::from_time_t(), and then adding the sub-seconds 
> part to this timestamp. This can be done much faster by splitting the 
> sub-second input into date_ and time_ directly.
> Avoiding boost::posix_time::from_time_t() would also be nice because it can 
> only deal with timestamps from 1677 to 2262, which adds extra complexity to 
> the related code.
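
To illustrate the difference between the two approaches, the Python sketch below splits a Unix time directly into a day count and a time-of-day part, and compares it with the two-step route through whole seconds. It is only a model of the arithmetic: the function names are hypothetical, the input is assumed to be in nanoseconds, and the (days, nanos of day) pair merely stands in for date_ and time_.

```python
NANOS_PER_SECOND = 1_000_000_000
SECONDS_PER_DAY = 24 * 60 * 60
NANOS_PER_DAY = SECONDS_PER_DAY * NANOS_PER_SECOND


def split_unix_nanos(unix_nanos):
    """Return (days_since_epoch, nanos_of_day) for a Unix time in nanoseconds.

    divmod() floors, so pre-1970 inputs still get a non-negative nanos_of_day.
    """
    return divmod(unix_nanos, NANOS_PER_DAY)


def split_via_seconds(unix_nanos):
    """Two-step variant: split off whole seconds first, then add the sub-second part."""
    seconds, sub_nanos = divmod(unix_nanos, NANOS_PER_SECOND)
    days, second_of_day = divmod(seconds, SECONDS_PER_DAY)
    return days, second_of_day * NANOS_PER_SECOND + sub_nanos
```

Both functions give the same result; the direct version just needs one division instead of two and never goes through an intermediate seconds-only timestamp.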






[jira] [Updated] (IMPALA-5051) Add support to write INT64 timestamps to the parquet writer

2019-05-02 Thread Csaba Ringhofer (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Csaba Ringhofer updated IMPALA-5051:

Fix Version/s: Impala 3.3.0

> Add support to write INT64 timestamps to the parquet writer
> ---
>
> Key: IMPALA-5051
> URL: https://issues.apache.org/jira/browse/IMPALA-5051
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend
>Affects Versions: Impala 2.9.0
>Reporter: Lars Volker
>Assignee: Csaba Ringhofer
>Priority: Major
> Fix For: Impala 3.3.0
>
>
> This requires updating parquet.thrift to a version that includes the 
> TIMESTAMP_MICROS logical type.
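
For context, TIMESTAMP_MICROS annotates an INT64 that holds microseconds since the Unix epoch. The Python sketch below shows that conversion for a timezone-aware datetime; to_timestamp_micros and fits_int64 are hypothetical helper names, and the sketch says nothing about how the Impala writer itself performs the conversion.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_timestamp_micros(dt):
    """Convert a timezone-aware datetime to microseconds since the Unix epoch."""
    delta = dt - EPOCH
    # timedelta normalizes to days, seconds in [0, 86400) and microseconds in [0, 1e6).
    return (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds


def fits_int64(value):
    """True if value can be stored in a signed 64-bit integer."""
    return -(2 ** 63) <= value < 2 ** 63
```

Integer arithmetic is used throughout so that no precision is lost to floating point.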






[jira] [Closed] (IMPALA-8448) Impala Doc: Doc the feature for alter_database events

2019-05-02 Thread Alex Rodoni (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni closed IMPALA-8448.
---
   Resolution: Fixed
Fix Version/s: Impala 3.3.0

> Impala Doc: Doc the feature for alter_database events
> -
>
> Key: IMPALA-8448
> URL: https://issues.apache.org/jira/browse/IMPALA-8448
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Docs
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
>  Labels: future_release_doc, in_33
> Fix For: Impala 3.3.0
>
>
> https://gerrit.cloudera.org/#/c/13199/






[jira] [Resolved] (IMPALA-7973) Add support for fine-grained updates at partition level

2019-05-02 Thread Anurag Mantripragada (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada resolved IMPALA-7973.
--
Resolution: Fixed

The patch is merged.

> Add support for fine-grained updates at partition level
> ---
>
> Key: IMPALA-7973
> URL: https://issues.apache.org/jira/browse/IMPALA-7973
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Assignee: Anurag Mantripragada
>Priority: Major
>
> When data is inserted into a partition or a new partition is created in a 
> large table, we should not be invalidating the whole table. Instead it should 
> be possible to refresh/add/drop certain partitions on the table directly 
> based on the event information. This would help with the performance of 
> subsequent access to the table by avoiding reloading the large table.



[jira] [Resolved] (IMPALA-7971) Add support to detect insert events from Impala

2019-05-02 Thread Anurag Mantripragada (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada resolved IMPALA-7971.
--
Resolution: Fixed

The patch is merged.

> Add support to detect insert events from Impala
> ---
>
> Key: IMPALA-7971
> URL: https://issues.apache.org/jira/browse/IMPALA-7971
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Assignee: Anurag Mantripragada
>Priority: Major
>
> When data is inserted into existing tables and partitions, Catalog does not 
> issue any metastore API calls. Metastore provides an API called 
> {{fire_listener_event}} which can be used to add an {{INSERT_EVENT}} to the 
> metastore notification log. This event can be used by other Impala instances 
> to invalidate or update the file metadata information when data is inserted or 
> overwritten on a given table or partition.






[jira] [Resolved] (IMPALA-8269) Clean up authorization test package structure

2019-05-02 Thread Fredy Wijaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fredy Wijaya resolved IMPALA-8269.
--
   Resolution: Fixed
Fix Version/s: Impala 3.3.0

> Clean up authorization test package structure
> -
>
> Key: IMPALA-8269
> URL: https://issues.apache.org/jira/browse/IMPALA-8269
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Reporter: Fredy Wijaya
>Assignee: Fredy Wijaya
>Priority: Minor
>  Labels: ramp-up
> Fix For: Impala 3.3.0
>
>
> The task is to clean up the authorization test package structure.
> 1. Move AuthorizationTest.java and AuthorizationStmtTest.java to the 
> authorization test package.
> 2. Rename CustomClusterGroupMapper and 
> CustomClusterResourceAuthorizationProvider to TestSentryGroupMapper and 
> TestSentryResourceAuthorizationProvider, since those two classes aren't specific 
> to custom cluster anymore.
> 3. Move those two files into `testutil` instead, since they're not actually 
> test classes.




[jira] [Commented] (IMPALA-8448) Impala Doc: Doc the feature for alter_database events

2019-05-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16831933#comment-16831933
 ] 

ASF subversion and git services commented on IMPALA-8448:
-

Commit fd622a47c8ce5fc35b967f027c86ac0695ba489a in impala's branch 
refs/heads/master from Alex Rodoni
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=fd622a4 ]

IMPALA-8448: [DOCS] ALTER DATABASE event supported in metadata sync

- The alter_database event is supported in the notification-based
  metadata sync.
- Updated the version to 3.3

Change-Id: I1016c27d3f12cef71a09a895ab42fd15a54aeee1
Reviewed-on: http://gerrit.cloudera.org:8080/13199
Tested-by: Impala Public Jenkins 
Reviewed-by: Vihang Karajgaonkar 
Reviewed-by: Alex Rodoni 


> Impala Doc: Doc the feature for alter_database events
> -
>
> Key: IMPALA-8448
> URL: https://issues.apache.org/jira/browse/IMPALA-8448
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Docs
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
>  Labels: future_release_doc, in_33
>
> https://gerrit.cloudera.org/#/c/13199/






[jira] [Commented] (IMPALA-8428) Add support for caching file handles on s3

2019-05-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16831935#comment-16831935
 ] 

ASF subversion and git services commented on IMPALA-8428:
-

Commit ab416d42232caf1d3eea5bf3d715c4d229eae8bb in impala's branch 
refs/heads/master from Sahil Takiar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=ab416d4 ]

IMPALA-8428: Bump CDH_BUILD_NUMBER to 1055188.

Brings in HADOOP-14747 (S3AInputStream to implement CanUnbuffer) which
is necessary for IMPALA-8428. The file handle cache calls `unbuffer()`
on a file handle before returning it to the cache. The call to
`unbuffer()` releases any resources that the file handle is holding onto
(in the S3A case the underlying S3ObjectInputStream is closed). Without
this fix an impalad would quickly crash as every cached file handle
would be holding onto an HTTP connection.

Testing:
* Ran an exhaustive build

Change-Id: I44b2169afff9af41d6f31312feb74ee362ecacf5
Reviewed-on: http://gerrit.cloudera.org:8080/13212
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
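
The behaviour described in the commit message can be pictured with the toy Python model below, where returning a handle to the cache first calls unbuffer() so that idle handles hold no open connection. This is a sketch under assumed names, not the Impala or Hadoop implementation: FakeS3Handle, FileHandleCache and the connection_open flag are all hypothetical.

```python
class FakeS3Handle:
    """Toy file handle; connection_open stands in for a held HTTP connection."""

    def __init__(self, path):
        self.path = path
        self.connection_open = False

    def read(self):
        self.connection_open = True   # reading opens the underlying stream
        return b"data from " + self.path.encode()

    def unbuffer(self):
        self.connection_open = False  # drop the stream, keep the handle reusable


class FileHandleCache:
    """Toy cache of idle handles, keyed by path."""

    def __init__(self):
        self._idle = {}

    def get(self, path):
        handles = self._idle.get(path)
        return handles.pop() if handles else FakeS3Handle(path)

    def release(self, handle):
        handle.unbuffer()  # release resources before the handle sits idle
        self._idle.setdefault(handle.path, []).append(handle)

    def idle_open_connections(self):
        return sum(h.connection_open for hs in self._idle.values() for h in hs)
```

If release() skipped the unbuffer() call, every idle handle in this model would keep its connection open, which mirrors the resource problem the commit message describes.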


> Add support for caching file handles on s3
> --
>
> Key: IMPALA-8428
> URL: https://issues.apache.org/jira/browse/IMPALA-8428
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 3.3.0
>Reporter: Joe McDonnell
>Assignee: Sahil Takiar
>Priority: Critical
>
> The file handle cache is currently disabled for S3, as the S3 connector 
> needed to implement proper unbuffer support. Now that 
> https://issues.apache.org/jira/browse/HADOOP-14747 is fixed, Impala should 
> provide an option to cache S3 file handles.
> This is particularly important for data caching, as accessing the data cache 
> happens after obtaining a file handle. If getting a file handle is slow, the 
> caching will be less effective.







[jira] [Commented] (IMPALA-7486) Admit less memory on dedicated coordinator for admission control purposes

2019-05-02 Thread Tim Armstrong (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16831834#comment-16831834
 ] 

Tim Armstrong commented on IMPALA-7486:
---

I discussed this offline with [~bikramjeet.vig] and we think that this is a 
somewhat complex issue, because query execution *can* be resource-intensive on 
the coordinator. It would be good if we can do this work incrementally rather 
than deferring it until we've done all the work to make the coordinator work 
lightweight. I think we can do it like this:

* Identify a subset of queries that we can determine are not resource-intensive 
on the coordinator, and reserve less memory for them for admission control 
purposes. Currently this might just be queries without runtime filters and with 
lightweight coordinator fragments.
* Expand that subset of queries by doing things like IMPALA-3825 and 
IMPALA-8483.
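
The first step above could look roughly like the following Python sketch. It is only an illustration under stated assumptions: the function name, the `SortNode` example, and the memory figures are hypothetical, and the set of lightweight operators is taken from the examples in the issue description (ExchangeNode, PlanRootSink, UnionNode).

```python
# Operator names come from the issue description; everything else
# (function name, memory figures) is hypothetical.
LIGHTWEIGHT_OPERATORS = {"ExchangeNode", "PlanRootSink", "UnionNode"}


def coordinator_mem_to_admit(coord_operators, has_runtime_filters,
                             per_backend_mem, lightweight_mem):
    """Return the memory (in bytes) to reserve on the coordinator."""
    is_lightweight = (not has_runtime_filters
                      and set(coord_operators) <= LIGHTWEIGHT_OPERATORS)
    if is_lightweight:
        # Never reserve more than the uniform per-backend amount.
        return min(lightweight_mem, per_backend_mem)
    return per_backend_mem


GB = 1024 ** 3
MB = 1024 ** 2
light = coordinator_mem_to_admit(["ExchangeNode", "PlanRootSink"], False,
                                 2 * GB, 100 * MB)
heavy = coordinator_mem_to_admit(["SortNode", "PlanRootSink"], False,
                                 2 * GB, 100 * MB)
print(light, heavy)
```

Queries that fall outside the identified subset keep the uniform reservation, so the change can be rolled out incrementally as the subset grows.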

> Admit less memory on dedicated coordinator for admission control purposes
> -
>
> Key: IMPALA-7486
> URL: https://issues.apache.org/jira/browse/IMPALA-7486
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Bikramjeet Vig
>Priority: Major
>
> Following on from IMPALA-7349, we should consider handling dedicated 
> coordinators specially rather than admitting a uniform amount of memory on 
> all backends.
> The specific scenario I'm interested in targeting is the case where we have a 
> coordinator that is executing many "lightweight" coordinator fragments, e.g. 
> just an ExchangeNode and PlanRootSink, plus maybe other lightweight operators 
> like UnionNode that don't use much memory or CPU. With the current behaviour 
> it's possible for a coordinator to reach capacity from the point-of-view of 
> admission control when at runtime it is actually very lightly loaded.
> This is particularly true if coordinators and executors have different 
> process mem limits. This will be somewhat common since they're often deployed 
> on different hardware or the coordinator will have more memory dedicated to 
> its embedded JVM for the catalog cache.
> More generally we could admit different amounts per backend depending on how 
> many fragments are running, but I think this incremental step would address 
> the most important cases and be a little easier to understand.
> We may want to defer this work until we've implemented distributed runtime 
> filter aggregation, which will significantly reduce coordinator memory 
> pressure, and until we've improved distributed overadmission (since the 
> coordinator behaviour may help throttle overadmission).






[jira] [Created] (IMPALA-8483) Make coordinator fragment lighter-weight

2019-05-02 Thread Tim Armstrong (JIRA)
Tim Armstrong created IMPALA-8483:
-

 Summary: Make coordinator fragment lighter-weight
 Key: IMPALA-8483
 URL: https://issues.apache.org/jira/browse/IMPALA-8483
 Project: IMPALA
  Issue Type: Improvement
  Components: Frontend
Reporter: Tim Armstrong
Assignee: Bikramjeet Vig


It's possible to write queries with an arbitrarily expensive coordinator 
fragment. One example is a query with a set of unpartitioned analytic functions 
each with a different ORDER BY, which results in many SORT nodes in the 
coordinator fragment.

This is a problem for dedicated coordinators because it makes the resource 
consumption unpredictable.

It would be useful if we could offload that work to executors and guarantee 
that only "lightweight" operators are part of the coordinator fragment, e.g. 
operators that take O(#rows returned * log(#executors)) time and O(#executors) 
memory - regular exchanges, merging exchanges, TOP-N, non-grouping aggregation, 
etc.

I think this can be done in general by inserting an additional exchange on top 
of the "expensive" part of a coordinator fragment, and then ensuring that that 
fragment gets scheduled on an executor.
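
A minimal sketch of that idea in Python, assuming the coordinator fragment can be simplified to a linear chain of operators listed root first (real plans are trees). The operator labels and the function name are hypothetical and do not correspond to Impala's planner classes.

```python
# Operators the issue lists as cheap enough for the coordinator.
LIGHTWEIGHT = {"EXCHANGE", "MERGING-EXCHANGE", "TOP-N", "AGGREGATE"}


def split_coordinator_fragment(operators):
    """Split a root-first operator chain at the first expensive operator.

    Returns (coordinator_ops, executor_ops). An extra EXCHANGE is placed
    on top of the expensive part so it can run on an executor.
    """
    for i, op in enumerate(operators):
        if op not in LIGHTWEIGHT:
            return operators[:i] + ["EXCHANGE"], operators[i:]
    # Nothing expensive: the fragment stays on the coordinator as-is.
    return list(operators), []


plan = ["TOP-N", "ANALYTIC", "SORT", "ANALYTIC", "SORT", "EXCHANGE"]
coordinator_ops, executor_ops = split_coordinator_fragment(plan)
print(coordinator_ops)
print(executor_ops)
```

After the split, the coordinator keeps only the lightweight prefix plus the new exchange, while every SORT and ANALYTIC operator ends up in the fragment that would be scheduled on an executor.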






[jira] [Commented] (IMPALA-7140) Build out support for HDFS tables and views in LocalCatalog

2019-05-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16831810#comment-16831810
 ] 

ASF subversion and git services commented on IMPALA-7140:
-

Commit 65189dd6f7bfaece96a27417f591094e886109c0 in impala's branch 
refs/heads/2.x from Todd Lipcon
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=65189dd ]

IMPALA-7140 (part 7): small fixes to enable most queries on HDFS tables

This is a grab-bag of small fixes necessary to get most queries on HDFS
tables passing with the correct plans:

* Change the loading of tables to check for other table types before
  checking for FS tables, since without specific checks against
  various properties, other tables may look like FS tables and try to
  instantiate LocalFsTable incorrectly.

* Return -1 for extrapolated row count -- the previous 0 value was
  convincing the planner that it had a valid value.

* Fix up the handling of BuiltinsDb so that we don't depend on
  ImpaladCatalog to have been loaded in order to instantiate it.

* Properly handle the case where all partitions are pruned by a
  predicate.
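
To illustrate the row-count item above: a value of 0 looks like a real estimate, whereas -1 can act as an "unknown" sentinel. The sketch below is a simplified Python model with hypothetical names, not the planner's actual logic.

```python
UNKNOWN_ROW_COUNT = -1  # sentinel meaning "no estimate available"


def planner_cardinality(extrapolated_rows, fallback_estimate):
    """Use the extrapolated count only when it is a valid (non-negative) value."""
    if extrapolated_rows >= 0:
        return extrapolated_rows
    return fallback_estimate


# Returning 0 makes the consumer believe the table is empty.
print(planner_cardinality(0, 5000))
# Returning -1 lets it fall back to another estimate.
print(planner_cardinality(UNKNOWN_ROW_COUNT, 5000))
```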

With this change, about half of the tests in PlannerTest pass. The tests
that don't pass all rely on views, HBase tables, etc.

Conflicts:
fe/src/main/java/org/apache/impala/analysis/FunctionName.java
Need to Modify:
fe/src/main/java/org/apache/impala/analysis/CreateFunctionStmtBase.java

Change-Id: I6f603e62b7a013c148c0905ebdec2f4303f9c4e5
Reviewed-on: http://gerrit.cloudera.org:8080/10798
Tested-by: Impala Public Jenkins 
Reviewed-by: Todd Lipcon 
Reviewed-on: http://gerrit.cloudera.org:8080/13114
Reviewed-by: Tim Armstrong 


> Build out support for HDFS tables and views in LocalCatalog
> ---
>
> Key: IMPALA-7140
> URL: https://issues.apache.org/jira/browse/IMPALA-7140
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Catalog, Frontend
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> This subtask tracks the work to build out basic read-only support for HDFS 
> tables and views in the LocalCatalog implementation:
> - loading table schemas
> - loading partitions
> - loading file information from HDFS
> This work will be broken up into a number of patches to keep each piece 
> reviewable. Once this subtask is complete we should be able to plan most 
> simple read-only queries.






[jira] [Commented] (IMPALA-8478) tests/authorization/test_provider.py breaks on CentOS 6 with Python 2.7-ism

2019-05-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16831812#comment-16831812
 ] 

ASF subversion and git services commented on IMPALA-8478:
-

Commit 909bf9320225c1e2dabb405ff591b4386548a9f4 in impala's branch 
refs/heads/master from Fredy Wijaya
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=909bf93 ]

IMPALA-8478: Fix test_provider.py Python 2.6 compatibility

This patch updates test_provider.py to work on Python 2.6 by using a
string formatting syntax that's available on both Python 2.6 and 2.7.
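
The commit message does not say which syntax was chosen, so the following is only a sketch of two options that work on both Python 2.6 and 2.7. The value of `BAD_FLAG` and the message text are made up for the example.

```python
BAD_FLAG = "foobar"  # hypothetical value, for illustration only

# Auto-numbered fields such as "{}" need Python 2.7 or later; on 2.6 they
# raise "ValueError: zero length field name in format".
# Explicitly numbered fields work on 2.6, 2.7, and Python 3.
message = "Invalid provider: {0}".format(BAD_FLAG)

# %-style formatting is another portable option.
legacy = "Invalid provider: %s" % BAD_FLAG

print(message)
```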

Testing:
- Ran test_provider.py on both Python 2.6 and 2.7

Change-Id: Ie7aa4a3149ae07261ecb64e84ba4fad5dd63f131
Reviewed-on: http://gerrit.cloudera.org:8080/13211
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> tests/authorization/test_provider.py breaks on CentOS 6 with Python 2.7-ism
> ---
>
> Key: IMPALA-8478
> URL: https://issues.apache.org/jira/browse/IMPALA-8478
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Laszlo Gaal
>Assignee: Fredy Wijaya
>Priority: Blocker
>  Labels: broken-build
> Fix For: Impala 3.3.0
>
>
> tests/authorization/test_provider.py contains a format string variant not 
> available on Python 2.6.x, which is the default Python version on CentOS 6. 
> This breaks the test on CentOS 6:
> {code}
> 20:31:53 === short test summary info 
> 
> 20:31:53 XFAIL 
> custom_cluster/test_alloc_fail.py::TestAllocFail::()::test_alloc_fail_update[protocol:
>  beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> text/none]
> 20:31:53   IMPALA-2925: the execution is not deterministic so some tests 
> sometimes don't fail as expected
> 20:31:53 FAIL 
> authorization/test_provider.py::TestAuthorizationProvider::()::test_invalid_provider_flag
> 20:31:53 === FAILURES 
> ===
> 20:31:53 _ TestAuthorizationProvider.test_invalid_provider_flag 
> _
> 20:31:53 authorization/test_provider.py:59: in test_invalid_provider_flag
> 20:31:53 .format(TestAuthorizationProvider.BAD_FLAG))
> 20:31:53 E   ValueError: zero length field name in format
> 20:31:53  Captured stderr setup 
> -
> {code}






[jira] [Commented] (IMPALA-8481) test_hbase_col_filter failing on deployed clusters due to permissions error

2019-05-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16831813#comment-16831813
 ] 

ASF subversion and git services commented on IMPALA-8481:
-

Commit f22445fdb266ea6e835892d9a0885b283c84ca0c in impala's branch 
refs/heads/master from David Knupp
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=f22445f ]

IMPALA-8481: Pass username in test_hbase_col_filter

This test will fail if run against a deployed cluster, since the
default user running the test may not have the correct permissions.

Confirmed fix by running the test on a local minicluster build, as
well as on the deployed cluster.

Change-Id: Ib9f6c51b8b30087c56c0499923604e1484239468
Reviewed-on: http://gerrit.cloudera.org:8080/13214
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 


> test_hbase_col_filter failing on deployed clusters due to permissions error
> ---
>
> Key: IMPALA-8481
> URL: https://issues.apache.org/jira/browse/IMPALA-8481
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.3.0
>Reporter: David Knupp
>Assignee: David Knupp
>Priority: Critical
> Fix For: Impala 3.3.0
>
>
> When running test_hbase_queries against a deployed cluster, the default user 
> on the machine running the tests may not have the correct access permission 
> on the cluster, which causes this test to fail.
> {noformat}
> query_test/test_hbase_queries.py:89: in test_hbase_col_filter
> self.run_stmt_in_hive(add_data)
> common/impala_test_suite.py:800: in run_stmt_in_hive
> raise RuntimeError(stderr)
> [...]
> E   INFO  : Query ID = 
> hive_20190501001622_fa3a9f39-7d32-49da-ba1d-084911730a2f
> E   INFO  : Total jobs = 1
> E   INFO  : Starting task [Stage-0:DDL] in serial mode
> E   INFO  : Launching Job 1 out of 1
> E   INFO  : Starting task [Stage-1:MAPRED] in serial mode
> E   INFO  : Number of reduce tasks is set to 0 since there's no reduce 
> operator
> E   ERROR : Job Submission failed with exception 
> 'org.apache.hadoop.security.AccessControlException(Permission denied: 
> user=jenkins, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x...)
> {noformat}






[jira] [Commented] (IMPALA-7892) Explicit cast state lost for numeric literals on rewrite

2019-05-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16831811#comment-16831811
 ] 

ASF subversion and git services commented on IMPALA-7892:
-

Commit e01fc42f16a6a2d6f8e7f8c8e89becb6ad9aa739 in impala's branch 
refs/heads/master from Alex Rodoni
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=e01fc42 ]

[DOCS] Fixed the typo introduced in IMPALA-7892

Change-Id: I46bc34a179a5996f1121a5906255d4906e91ce0c
Reviewed-on: http://gerrit.cloudera.org:8080/13209
Reviewed-by: Alex Rodoni 
Tested-by: Impala Public Jenkins 


> Explicit cast state lost for numeric literals on rewrite
> 
>
> Key: IMPALA-7892
> URL: https://issues.apache.org/jira/browse/IMPALA-7892
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
>
> Consider the following SQL:
> {code:sql}
> SELECT CAST(1 AS INT) as C FROM alltypestiny
> {code}
> The expression contains an explicit cast that cannot be optimized away during 
> rewrites. The cast tells us that the user wants the value 1 as an {{INT}}, 
> not as its “natural” type, {{TINYINT}}.
> The NumericLiteral type has an “explicit cast” flag that is set sometimes. 
> However, the expression rewriter does not set it:
> {code:java}
>   String query = "SELECT CAST(1 AS INT) AS c" +
>   " from functional.alltypestiny";
>   AnalysisContext ctx = createAnalysisCtx();
>   ctx.getQueryOptions().setEnable_expr_rewrites(true);
>   SelectStmt select = (SelectStmt) AnalyzesOk(query, ctx);
>   Expr expr = select.getSelectList().getItems().get(0).getExpr();
>   assertEquals(ScalarType.INT, expr.getType());
>   assertTrue(expr instanceof NumericLiteral);
>   assertTrue(((NumericLiteral) expr).isExplicitCast());
> {code}
> The last test fails because the NumericLiteral is not marked as explicitly 
> cast.
> If the analyzer were to "reset" analysis for the node, we would lose the fact 
> that the user wanted an {{INT}}, and the type would revert to {{TINYINT}}. 
> This causes instability: a small change in the analyzer can produce a 
> different query result.
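
The effect described in the issue can be modelled with a small Python sketch. This is a toy model under stated assumptions, not Impala's Java frontend: the class below only borrows the `NumericLiteral` name, and `natural_type` and `reset_analysis` are hypothetical helpers.

```python
# Integer types from narrowest to widest, with their bit widths.
INT_TYPES = [("TINYINT", 8), ("SMALLINT", 16), ("INT", 32), ("BIGINT", 64)]


def natural_type(value):
    """Return the narrowest integer type that can hold `value`."""
    for name, bits in INT_TYPES:
        if -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
            return name
    raise ValueError("value does not fit in BIGINT")


class NumericLiteral:
    """Toy model of a literal that may carry an explicit cast."""

    def __init__(self, value, type_name, explicit_cast):
        self.value = value
        self.type_name = type_name
        self.explicit_cast = explicit_cast

    def reset_analysis(self):
        # Only literals without an explicit cast revert to the natural type.
        if not self.explicit_cast:
            self.type_name = natural_type(self.value)


kept = NumericLiteral(1, "INT", explicit_cast=True)
kept.reset_analysis()
lost = NumericLiteral(1, "INT", explicit_cast=False)  # flag dropped by a rewrite
lost.reset_analysis()
print(kept.type_name, lost.type_name)
```

When the flag survives, the literal stays `INT` after a reset; when a rewrite drops it, the same literal falls back to `TINYINT`.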






[jira] [Updated] (IMPALA-8430) testCreateDropCreateDatabaseFromImpala failed

2019-05-02 Thread Attila Jeges (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Jeges updated IMPALA-8430:
-
Fix Version/s: Impala 3.3.0

> testCreateDropCreateDatabaseFromImpala failed
> -
>
> Key: IMPALA-8430
> URL: https://issues.apache.org/jira/browse/IMPALA-8430
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.2.0
>Reporter: Attila Jeges
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Blocker
>  Labels: broken-build
> Fix For: Impala 3.3.0
>
>
> FE test  MetastoreEventsProcessorTest.testCreateDropCreateDatabaseFromImpala 
> failed with the following error:
> {code}
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertNotNull(Assert.java:712)
>   at org.junit.Assert.assertNotNull(Assert.java:722)
>   at 
> org.apache.impala.catalog.events.MetastoreEventsProcessorTest.testCreateDropCreateDatabaseFromImpala(MetastoreEventsProcessorTest.java:863)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:272)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:236)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:386)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:323)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:143)
> {code}
> This might be just a temp problem as the subsequent test run succeeded.






[jira] [Updated] (IMPALA-8432) ASF-master data load job fails with "Could not resolve dependencies for project org.apache.impala:impala-frontend"

2019-05-02 Thread Attila Jeges (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Jeges updated IMPALA-8432:
-
Fix Version/s: (was: Impala 3.3.0)

> ASF-master data load job fails with "Could not resolve dependencies for 
> project org.apache.impala:impala-frontend"
> --
>
> Key: IMPALA-8432
> URL: https://issues.apache.org/jira/browse/IMPALA-8432
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.2.0
>Reporter: Attila Jeges
>Assignee: Fredy Wijaya
>Priority: Blocker
>  Labels: broken-build
>
> ASF-master core data load job fails with the following:
> {code}
> 10:14:04 [INFO] BUILD FAILURE
> 10:14:04 [ERROR] Failed to execute goal on project impala-frontend: Could not 
> resolve dependencies for project 
> org.apache.impala:impala-frontend:jar:0.1-SNAPSHOT: The following artifacts 
> could not be resolved: 
> org.apache.ranger:ranger-plugins-common:jar:1.2.0.6.0.99.0-45, 
> org.apache.ranger:ranger-plugins-audit:jar:1.2.0.6.0.99.0-45: Could not find 
> artifact org.apache.ranger:ranger-plugins-common:jar:1.2.0.6.0.99.0-45 in 
> cloudera-mirrors 
> (http://maven.jenkins.cloudera.com:8081/artifactory/cloudera-mirrors) -> 
> [Help 1]
> {code}






[jira] [Updated] (IMPALA-8432) ASF-master data load job fails with "Could not resolve dependencies for project org.apache.impala:impala-frontend"

2019-05-02 Thread Attila Jeges (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Jeges updated IMPALA-8432:
-
Fix Version/s: Impala 3.3.0

> ASF-master data load job fails with "Could not resolve dependencies for 
> project org.apache.impala:impala-frontend"
> --
>
> Key: IMPALA-8432
> URL: https://issues.apache.org/jira/browse/IMPALA-8432
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.2.0
>Reporter: Attila Jeges
>Assignee: Fredy Wijaya
>Priority: Blocker
>  Labels: broken-build
> Fix For: Impala 3.3.0
>
>
> ASF-master core data load job fails with the following:
> {code}
> 10:14:04 [INFO] BUILD FAILURE
> 10:14:04 [ERROR] Failed to execute goal on project impala-frontend: Could not 
> resolve dependencies for project 
> org.apache.impala:impala-frontend:jar:0.1-SNAPSHOT: The following artifacts 
> could not be resolved: 
> org.apache.ranger:ranger-plugins-common:jar:1.2.0.6.0.99.0-45, 
> org.apache.ranger:ranger-plugins-audit:jar:1.2.0.6.0.99.0-45: Could not find 
> artifact org.apache.ranger:ranger-plugins-common:jar:1.2.0.6.0.99.0-45 in 
> cloudera-mirrors 
> (http://maven.jenkins.cloudera.com:8081/artifactory/cloudera-mirrors) -> 
> [Help 1]
> {code}






[jira] [Work started] (IMPALA-8482) Include all ranger-audit-plugins runtime dependencies

2019-05-02 Thread Fredy Wijaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-8482 started by Fredy Wijaya.

> Include all ranger-audit-plugins runtime dependencies
> -
>
> Key: IMPALA-8482
> URL: https://issues.apache.org/jira/browse/IMPALA-8482
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Infrastructure
>Reporter: Fredy Wijaya
>Assignee: Fredy Wijaya
>Priority: Critical
>
> Impala needs to package ranger-audit-plugins runtime dependencies so that 
> ranger-audit-plugins works as expected against various audit providers, e.g. 
> Solr and Kafka.






[jira] [Created] (IMPALA-8482) Include all ranger-audit-plugins runtime dependencies

2019-05-02 Thread Fredy Wijaya (JIRA)
Fredy Wijaya created IMPALA-8482:


 Summary: Include all ranger-audit-plugins runtime dependencies
 Key: IMPALA-8482
 URL: https://issues.apache.org/jira/browse/IMPALA-8482
 Project: IMPALA
  Issue Type: Sub-task
  Components: Infrastructure
Reporter: Fredy Wijaya
Assignee: Fredy Wijaya


Impala needs to package ranger-audit-plugins runtime dependencies so that 
ranger-audit-plugins works as expected against various audit providers, e.g. 
Solr and Kafka.






[jira] [Commented] (IMPALA-5073) Considering bypassing TCMalloc by default for buffer pool

2019-05-02 Thread Ruslan Dautkhanov (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-5073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16831722#comment-16831722
 ] 

Ruslan Dautkhanov commented on IMPALA-5073:
---

Sorry for the noise on some of these JIRAs. It's just some feedback after 
working with Red Hat support on some of the performance issues we're seeing. I 
can share more details from Red Hat if you guys are interested. Thanks for any 
input. 

> Considering bypassing TCMalloc by default for buffer pool
> -
>
> Key: IMPALA-5073
> URL: https://issues.apache.org/jira/browse/IMPALA-5073
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.9.0
>Reporter: Tim Armstrong
>Priority: Minor
>  Labels: resource-management
>
> There would be some advantages to switching from allocating buffers via 
> TCMalloc and instead using mmap directly - e.g. less contention for the page 
> heap lock.
> There are also downsides - virtual memory consumption could increase and we 
> may end up mapping and unmapping memory more frequently.
> We would also need to wire up the MemTrackers so they include this memory in 
> the process estimate.






[jira] [Commented] (IMPALA-5073) Considering bypassing TCMalloc by default for buffer pool

2019-05-02 Thread Ruslan Dautkhanov (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-5073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16831718#comment-16831718
 ] 

Ruslan Dautkhanov commented on IMPALA-5073:
---

https://github.com/cloudera/Impala/blob/cdh5-2.11.0_5.14.4/be/src/runtime/bufferpool/system-allocator.cc#L81

Shouldn't the mmap() call pass MAP_HUGETLB in its flags argument (the 4th 
argument) for huge pages to be used? That call currently doesn't have it, so 
does Impala's mmap()ed memory never use explicit huge pages?

This is described, for example, in 
https://d3s.mff.cuni.cz/legacy/teaching/advanced_operating_systems/slides/10_huge_pages.pdf

Thanks! 

 

> Considering bypassing TCMalloc by default for buffer pool
> -
>
> Key: IMPALA-5073
> URL: https://issues.apache.org/jira/browse/IMPALA-5073
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.9.0
>Reporter: Tim Armstrong
>Priority: Minor
>  Labels: resource-management
>
> There would be some advantages to switching from allocating buffers via 
> TCMalloc and instead using mmap directly - e.g. less contention for the page 
> heap lock.
> There are also downsides - virtual memory consumption could increase and we 
> may end up mapping and unmapping memory more frequently.
> We would also need to wire up the MemTrackers so they include this memory in 
> the process estimate.






[jira] [Commented] (IMPALA-5073) Considering bypassing TCMalloc by default for buffer pool

2019-05-02 Thread Ruslan Dautkhanov (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-5073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16831716#comment-16831716
 ] 

Ruslan Dautkhanov commented on IMPALA-5073:
---

Also somewhat related: it's interesting how this is done in the JVM, with 
pre-touching to move the allocation cost to the startup phase: 

[https://shipilev.net/jvm/anatomy-quarks/2-transparent-huge-pages/]

{quote}
 To shift these costs to the JVM startup that will avoid surprising latency 
hiccups when application is running, you may instruct JVM to touch every single 
page in Java heap with -XX:+AlwaysPreTouch during initialization. It is a good 
idea to enable pre-touch for larger heaps anyway.

And there comes the funny part: enabling -XX:+UseTransparentHugePages actually 
makes -XX:+AlwaysPreTouch faster, because JVM now knows it has to touch the 
heap in larger quanta (say, a byte every 2M), rather than in smaller ones (say, 
a byte every 4K). 
{quote}
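
The pre-touch idea from the quote can be sketched in Python with an anonymous memory mapping. This is only an illustration: it touches one byte per stride so the OS has to back each page up front, and the larger stride merely mimics how a huge-page-aware runtime would need fewer touches. It does not enable huge pages itself, and `pretouch` is a hypothetical helper name.

```python
import mmap


def pretouch(buf, step=mmap.PAGESIZE):
    """Write one byte every `step` bytes; return how many touches were made."""
    touched = 0
    for offset in range(0, len(buf), step):
        buf[offset] = 0  # forces the page containing `offset` to be backed
        touched += 1
    return touched


size = 64 * mmap.PAGESIZE
buf = mmap.mmap(-1, size)  # anonymous mapping, not backed by a file

small_stride = pretouch(buf)                           # one touch per base page
large_stride = pretouch(buf, step=16 * mmap.PAGESIZE)  # one touch per 16 pages
print(small_stride, large_stride)
buf.close()
```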


> Considering bypassing TCMalloc by default for buffer pool
> -
>
> Key: IMPALA-5073
> URL: https://issues.apache.org/jira/browse/IMPALA-5073
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.9.0
>Reporter: Tim Armstrong
>Priority: Minor
>  Labels: resource-management
>
> There would be some advantages to switching from allocating buffers via 
> TCMalloc and instead using mmap directly - e.g. less contention for the page 
> heap lock.
> There are also downsides - virtual memory consumption could increase and we 
> may end up mapping and unmapping memory more frequently.
> We would also need to wire up the MemTrackers so they include this memory in 
> the process estimate.






[jira] [Commented] (IMPALA-5073) Considering bypassing TCMalloc by default for buffer pool

2019-05-02 Thread Ruslan Dautkhanov (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-5073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16831702#comment-16831702
 ] 

Ruslan Dautkhanov commented on IMPALA-5073:
---

[~tarmstrong]

My 2 cents. Would it be possible to preallocate a fixed amount of memory for 
impalad at startup?

This is how it's done in Oracle, through `SGA_AGGREGATE_TARGET` (global shared 
buffers, cached query plans, etc.) and `PGA_AGGREGATE_TARGET` (per-session 
memory demand, such as sorting temporary results). 

This increases startup time, which is okay, but there is then no mmap() 
overhead at the per-session / per-query level. 

For Oracle, the SGA always dominates (global db block buffer caches, etc.), but 
for Impala I guess a PGA-like memory structure would dominate (per-query 
shuffling of results, etc.). 

Another advantage of preallocating memory is that you could optionally use 
hugetlbfs, since the size of the preallocated memory is static and known ahead 
of time. 
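
A minimal sketch of the preallocation idea, assuming a hypothetical `BufferArena` class that is not part of Impala: one anonymous mapping is created at startup and carved into equal-sized buffers, so handing out a buffer later is a free-list operation rather than a new mmap() call.

```python
import mmap


class BufferArena:
    """Hypothetical fixed-size arena: one mapping made up front, split into buffers."""

    def __init__(self, buffer_size, num_buffers):
        self.buffer_size = buffer_size
        # Single anonymous mapping covering every buffer the arena will hand out.
        self._region = mmap.mmap(-1, buffer_size * num_buffers)
        self._view = memoryview(self._region)
        self._free = list(range(num_buffers))

    def allocate(self):
        """Return (index, writable view) for a free buffer; no system call is made."""
        if not self._free:
            raise MemoryError("arena exhausted")
        index = self._free.pop()
        start = index * self.buffer_size
        return index, self._view[start:start + self.buffer_size]

    def release(self, index):
        """Put a buffer back on the free list for reuse."""
        self._free.append(index)


arena = BufferArena(buffer_size=4096, num_buffers=4)
index, buf = arena.allocate()
buf[0] = 1
arena.release(index)
```

Because the total size is fixed when the arena is built, this is also the shape of allocation that could be backed by huge pages, as the comment suggests.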

 

> Considering bypassing TCMalloc by default for buffer pool
> -
>
> Key: IMPALA-5073
> URL: https://issues.apache.org/jira/browse/IMPALA-5073
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.9.0
>Reporter: Tim Armstrong
>Priority: Minor
>  Labels: resource-management
>
> There would be some advantages to switching from allocating buffers via 
> TCMalloc to using mmap directly, e.g. less contention for the page heap 
> lock.
> There are also downsides - virtual memory consumption could increase and we 
> may end up mapping and unmapping memory more frequently.
> We would also need to wire up the MemTrackers so they include this memory in 
> the process estimate.






[jira] [Resolved] (IMPALA-8478) tests/authorization/test_provider.py breaks on CentOS 6 with Python 2.7-ism

2019-05-02 Thread Fredy Wijaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fredy Wijaya resolved IMPALA-8478.
--
   Resolution: Fixed
Fix Version/s: Impala 3.3.0

> tests/authorization/test_provider.py breaks on CentOS 6 with Python 2.7-ism
> ---
>
> Key: IMPALA-8478
> URL: https://issues.apache.org/jira/browse/IMPALA-8478
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Laszlo Gaal
>Assignee: Fredy Wijaya
>Priority: Blocker
>  Labels: broken-build
> Fix For: Impala 3.3.0
>
>
> tests/authorization/test_provider.py contains a format string variant not 
> available on Python 2.6.x, which is the default Python version on CentOS 6. 
> This breaks the test on CentOS 6:
> {code}
> 20:31:53 === short test summary info 
> 
> 20:31:53 XFAIL 
> custom_cluster/test_alloc_fail.py::TestAllocFail::()::test_alloc_fail_update[protocol:
>  beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> text/none]
> 20:31:53   IMPALA-2925: the execution is not deterministic so some tests 
> sometimes don't fail as expected
> 20:31:53 FAIL 
> authorization/test_provider.py::TestAuthorizationProvider::()::test_invalid_provider_flag
> 20:31:53 === FAILURES 
> ===
> 20:31:53 _ TestAuthorizationProvider.test_invalid_provider_flag 
> _
> 20:31:53 authorization/test_provider.py:59: in test_invalid_provider_flag
> 20:31:53 .format(TestAuthorizationProvider.BAD_FLAG))
> 20:31:53 E   ValueError: zero length field name in format
> 20:31:53  Captured stderr setup 
> -
> {code}
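
For context on the error above: auto-numbered `{}` fields in `str.format` were added in Python 2.7 (and 3.1); Python 2.6 requires an explicit index and otherwise raises `ValueError: zero length field name in format`. The sketch below shows the portable spellings. The message text and the `BAD_FLAG` value are made up for illustration and are not taken from test_provider.py.

```python
BAD_FLAG = "foo"  # hypothetical value, for illustration only

# Auto-numbered field: fine on Python 2.7+ and 3.1+, but raises
# "ValueError: zero length field name in format" on Python 2.6.
auto_numbered = "invalid provider: {}".format(BAD_FLAG)

# Explicit positional index: also accepted by Python 2.6.
explicit = "invalid provider: {0}".format(BAD_FLAG)

# %-formatting: another spelling that works on every version.
percent = "invalid provider: %s" % BAD_FLAG

print(explicit)
```

All three produce the same string on interpreters that accept them, so switching to the explicit index is a behavior-preserving change.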






[jira] [Updated] (IMPALA-8481) test_hbase_col_filter failing on deployed clusters due to permissions error

2019-05-02 Thread David Knupp (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp updated IMPALA-8481:

Affects Version/s: Impala 3.3.0

> test_hbase_col_filter failing on deployed clusters due to permissions error
> ---
>
> Key: IMPALA-8481
> URL: https://issues.apache.org/jira/browse/IMPALA-8481
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.3.0
>Reporter: David Knupp
>Assignee: David Knupp
>Priority: Critical
> Fix For: Impala 3.3.0
>
>
> When running test_hbase_queries against a deployed cluster, the default user 
> on the machine running the tests may not have the correct access permission 
> on the cluster, which causes this test to fail.
> {noformat}
> query_test/test_hbase_queries.py:89: in test_hbase_col_filter
> self.run_stmt_in_hive(add_data)
> common/impala_test_suite.py:800: in run_stmt_in_hive
> raise RuntimeError(stderr)
> [...]
> E   INFO  : Query ID = 
> hive_20190501001622_fa3a9f39-7d32-49da-ba1d-084911730a2f
> E   INFO  : Total jobs = 1
> E   INFO  : Starting task [Stage-0:DDL] in serial mode
> E   INFO  : Launching Job 1 out of 1
> E   INFO  : Starting task [Stage-1:MAPRED] in serial mode
> E   INFO  : Number of reduce tasks is set to 0 since there's no reduce 
> operator
> E   ERROR : Job Submission failed with exception 
> 'org.apache.hadoop.security.AccessControlException(Permission denied: 
> user=jenkins, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x...)
> {noformat}






[jira] [Assigned] (IMPALA-8481) test_hbase_col_filter failing on deployed clusters due to permissions error

2019-05-02 Thread David Knupp (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp reassigned IMPALA-8481:
---

Assignee: David Knupp

> test_hbase_col_filter failing on deployed clusters due to permissions error
> ---
>
> Key: IMPALA-8481
> URL: https://issues.apache.org/jira/browse/IMPALA-8481
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Reporter: David Knupp
>Assignee: David Knupp
>Priority: Critical
> Fix For: Impala 3.3.0
>
>
> When running test_hbase_queries against a deployed cluster, the default user 
> on the machine running the tests may not have the correct access permission 
> on the cluster, which causes this test to fail.
> {noformat}
> query_test/test_hbase_queries.py:89: in test_hbase_col_filter
> self.run_stmt_in_hive(add_data)
> common/impala_test_suite.py:800: in run_stmt_in_hive
> raise RuntimeError(stderr)
> [...]
> E   INFO  : Query ID = 
> hive_20190501001622_fa3a9f39-7d32-49da-ba1d-084911730a2f
> E   INFO  : Total jobs = 1
> E   INFO  : Starting task [Stage-0:DDL] in serial mode
> E   INFO  : Launching Job 1 out of 1
> E   INFO  : Starting task [Stage-1:MAPRED] in serial mode
> E   INFO  : Number of reduce tasks is set to 0 since there's no reduce 
> operator
> E   ERROR : Job Submission failed with exception 
> 'org.apache.hadoop.security.AccessControlException(Permission denied: 
> user=jenkins, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x...)
> {noformat}






[jira] [Resolved] (IMPALA-8481) test_hbase_col_filter failing on deployed clusters due to permissions error

2019-05-02 Thread David Knupp (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-8481.
-
Resolution: Fixed

> test_hbase_col_filter failing on deployed clusters due to permissions error
> ---
>
> Key: IMPALA-8481
> URL: https://issues.apache.org/jira/browse/IMPALA-8481
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Reporter: David Knupp
>Priority: Critical
> Fix For: Impala 3.3.0
>
>
> When running test_hbase_queries against a deployed cluster, the default user 
> on the machine running the tests may not have the correct access permission 
> on the cluster, which causes this test to fail.
> {noformat}
> query_test/test_hbase_queries.py:89: in test_hbase_col_filter
> self.run_stmt_in_hive(add_data)
> common/impala_test_suite.py:800: in run_stmt_in_hive
> raise RuntimeError(stderr)
> [...]
> E   INFO  : Query ID = 
> hive_20190501001622_fa3a9f39-7d32-49da-ba1d-084911730a2f
> E   INFO  : Total jobs = 1
> E   INFO  : Starting task [Stage-0:DDL] in serial mode
> E   INFO  : Launching Job 1 out of 1
> E   INFO  : Starting task [Stage-1:MAPRED] in serial mode
> E   INFO  : Number of reduce tasks is set to 0 since there's no reduce 
> operator
> E   ERROR : Job Submission failed with exception 
> 'org.apache.hadoop.security.AccessControlException(Permission denied: 
> user=jenkins, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x...)
> {noformat}





[jira] [Updated] (IMPALA-8473) Refactor lineage publication mechanism to allow for different consumers

2019-05-02 Thread radford nguyen (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

radford nguyen updated IMPALA-8473:
---
Description: 
Impetus for this change is to allow lineage to be consumed by Atlas via Kafka.
h3. Design Proposal

Move lineage logging from be to fe, where we can make use of the same plugin 
approach as {{authorization_provider}} to allow a downstream user to provide 
their own lineage consumers as runtime dependencies.

[~mad...@apache.org] has provided a fe patch (attached) with suggested 
mechanism for allowing multiple hooks to be registered with the fe.  Hooks 
would be invoked from the be at appropriate places, e.g. 
[https://github.com/apache/impala/blob/c1b0a073938c144e9bf33901bd4df6dcda0f09ec/be/src/service/impala-server.cc#L466].
  The hooks should all be executed asynchronously, so the current thinking is 
that this execution should happen in the fe, since the be does not know about 
what hooks are registered.  IOW, the {{ImpalaPostExecHookFactory.executeHooks}} 
method (see patch) should probably make use of a thread-pool executor service 
(or something similar) in order to execute all hooks in parallel and in a 
non-blocking manner, returning to the be asap.

  was:
Impetus for this change is to allow lineage to be consumed by Atlas via Kafka.

Approach should be similar to that of choosing authorization provider, where 
the publication strategy can be chosen at runtime via configuration flag(s).

Scope of this ticket is to move lineage publication to the fe and add 
appropriate hooks that a user can implement.

[~mad...@apache.org] has provided a fe patch (attached) with suggested 
mechanism for allowing multiple hooks to be registered with a singleton.  
Singleton would be invoked from the be at appropriate places, e.g. 
[https://github.com/apache/impala/blob/c1b0a073938c144e9bf33901bd4df6dcda0f09ec/be/src/service/impala-server.cc#L466].
  The hooks should all be executed asynchronously, so the current thinking is 
that this execution should happen in the fe, since the be does not know about 
what hooks are registered.  IOW, the {{ImpalaPostExecHookFactory.executeHooks}} 
method (see patch) should probably make use of a thread-pool executor service 
(or something similar) in order to execute all hooks in parallel and in a 
non-blocking manner, returning to the be asap.


> Refactor lineage publication mechanism to allow for different consumers
> ---
>
> Key: IMPALA-8473
> URL: https://issues.apache.org/jira/browse/IMPALA-8473
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend, Frontend
>Reporter: radford nguyen
>Assignee: radford nguyen
>Priority: Critical
> Attachments: ImpalaPostExecHook-infra.patch
>
>
> Impetus for this change is to allow lineage to be consumed by Atlas via Kafka.
> h3. Design Proposal
> Move lineage logging from be to fe, where we can make use of the same plugin 
> approach as {{authorization_provider}} to allow a downstream user to provide 
> their own lineage consumers as runtime dependencies.
> [~mad...@apache.org] has provided a fe patch (attached) with suggested 
> mechanism for allowing multiple hooks to be registered with the fe.  Hooks 
> would be invoked from the be at appropriate places, e.g. 
> [https://github.com/apache/impala/blob/c1b0a073938c144e9bf33901bd4df6dcda0f09ec/be/src/service/impala-server.cc#L466].
>   The hooks should all be executed asynchronously, so the current thinking is 
> that this execution should happen in the fe, since the be does not know about 
> what hooks are registered.  IOW, the 
> {{ImpalaPostExecHookFactory.executeHooks}} method (see patch) should probably 
> make use of a thread-pool executor service (or something similar) in order to 
> execute all hooks in parallel and in a non-blocking manner, returning to the 
> be asap.
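
The non-blocking hook execution described above can be sketched as follows. This is an illustration in Python rather than the Java frontend code, and it is not the attached patch: `PostExecHookRunner`, `register`, and `execute_hooks` are hypothetical names chosen for the example. Each registered hook is submitted to a thread pool, and the caller gets control back immediately.

```python
import logging
from concurrent.futures import ThreadPoolExecutor


class PostExecHookRunner:
    """Hypothetical runner that executes registered hooks on a thread pool."""

    def __init__(self, max_workers=4):
        self._hooks = []
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def register(self, hook):
        """Register a callable that takes the post-execution context."""
        self._hooks.append(hook)

    def execute_hooks(self, context):
        """Submit every hook and return the futures without waiting for them."""
        return [self._pool.submit(self._run_one, hook, context)
                for hook in self._hooks]

    @staticmethod
    def _run_one(hook, context):
        # A failing hook is logged and must not affect the other hooks.
        try:
            return hook(context)
        except Exception:
            logging.exception("post-exec hook failed")
            return None

    def shutdown(self):
        """Wait for in-flight hooks and release the worker threads."""
        self._pool.shutdown(wait=True)


runner = PostExecHookRunner(max_workers=2)
runner.register(lambda ctx: "published lineage for %s" % ctx["query_id"])
futures = runner.execute_hooks({"query_id": "q1"})
print([f.result() for f in futures])
runner.shutdown()
```

Catching exceptions inside each task keeps one misbehaving consumer from blocking or failing the rest, which seems to be the main reason for running the hooks asynchronously.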






[jira] [Updated] (IMPALA-8473) Refactor lineage publication mechanism to allow for different consumers

2019-05-02 Thread radford nguyen (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

radford nguyen updated IMPALA-8473:
---
Attachment: ImpalaPostExecHook-infra.patch

> Refactor lineage publication mechanism to allow for different consumers
> ---
>
> Key: IMPALA-8473
> URL: https://issues.apache.org/jira/browse/IMPALA-8473
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend, Frontend
>Reporter: radford nguyen
>Assignee: radford nguyen
>Priority: Critical
> Attachments: ImpalaPostExecHook-infra.patch
>
>
> Impetus for this change is to allow lineage to be consumed by Atlas via Kafka.
> Approach should be similar to that of choosing authorization provider, where 
> the publication strategy can be chosen at runtime via configuration flag(s).
> Scope of this ticket is to move lineage publication to the fe and add 
> appropriate hooks that a user can implement.
> [~mad...@apache.org] has provided a fe patch (attached) with suggested 
> mechanism for allowing multiple hooks to be registered with a singleton.  
> Singleton would be invoked from the be at appropriate places, e.g. 
> [https://github.com/apache/impala/blob/c1b0a073938c144e9bf33901bd4df6dcda0f09ec/be/src/service/impala-server.cc#L466].
>   The hooks should all be executed asynchronously, so the current thinking is 
> that this execution should happen in the fe, since the be does not know about 
> what hooks are registered.  IOW, the 
> {{ImpalaPostExecHookFactory.executeHooks}} method (see patch) should probably 
> make use of a thread-pool executor service (or something similar) in order to 
> execute all hooks in parallel and in a non-blocking manner, returning to the 
> be asap.






[jira] [Updated] (IMPALA-8473) Refactor lineage publication mechanism to allow for different consumers

2019-05-02 Thread radford nguyen (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

radford nguyen updated IMPALA-8473:
---
Description: 
Impetus for this change is to allow lineage to be consumed by Atlas via Kafka.

Approach should be similar to that of choosing authorization provider, where 
the publication strategy can be chosen at runtime via configuration flag(s).

Scope of this ticket is to move lineage publication to the fe and add 
appropriate hooks that a user can implement.

[~mad...@apache.org] has provided a fe patch (attached) with suggested 
mechanism for allowing multiple hooks to be registered with a singleton.  
Singleton would be invoked from the be at appropriate places, e.g. 
[https://github.com/apache/impala/blob/c1b0a073938c144e9bf33901bd4df6dcda0f09ec/be/src/service/impala-server.cc#L466].
  The hooks should all be executed asynchronously, so the current thinking is 
that this execution should happen in the fe, since the be does not know about 
what hooks are registered.  IOW, the {{ImpalaPostExecHookFactory.executeHooks}} 
method (see patch) should probably make use of a thread-pool executor service 
(or something similar) in order to execute all hooks in parallel and in a 
non-blocking manner, returning to the be asap.

  was:
Impetus for this change is to allow lineage to be consumed by Atlas via Kafka.

Approach should be similar to that of choosing authorization provider, where 
the publication strategy can be chosen at runtime via configuration flag(s).

Scope of this ticket is to move lineage publication to the fe and add 
appropriate hooks that a user can implement


> Refactor lineage publication mechanism to allow for different consumers
> ---
>
> Key: IMPALA-8473
> URL: https://issues.apache.org/jira/browse/IMPALA-8473
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend, Frontend
>Reporter: radford nguyen
>Assignee: radford nguyen
>Priority: Critical
>
> Impetus for this change is to allow lineage to be consumed by Atlas via Kafka.
> Approach should be similar to that of choosing authorization provider, where 
> the publication strategy can be chosen at runtime via configuration flag(s).
> Scope of this ticket is to move lineage publication to the fe and add 
> appropriate hooks that a user can implement.
> [~mad...@apache.org] has provided a fe patch (attached) with suggested 
> mechanism for allowing multiple hooks to be registered with a singleton.  
> Singleton would be invoked from the be at appropriate places, e.g. 
> [https://github.com/apache/impala/blob/c1b0a073938c144e9bf33901bd4df6dcda0f09ec/be/src/service/impala-server.cc#L466].
>   The hooks should all be executed asynchronously, so the current thinking is 
> that this execution should happen in the fe, since the be does not know about 
> what hooks are registered.  IOW, the 
> {{ImpalaPostExecHookFactory.executeHooks}} method (see patch) should probably 
> make use of a thread-pool executor service (or something similar) in order to 
> execute all hooks in parallel and in a non-blocking manner, returning to the 
> be asap.






[jira] [Work started] (IMPALA-8473) Refactor lineage publication mechanism to allow for different consumers

2019-05-02 Thread radford nguyen (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-8473 started by radford nguyen.
--
> Refactor lineage publication mechanism to allow for different consumers
> ---
>
> Key: IMPALA-8473
> URL: https://issues.apache.org/jira/browse/IMPALA-8473
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend, Frontend
>Reporter: radford nguyen
>Assignee: radford nguyen
>Priority: Critical
>
> Impetus for this change is to allow lineage to be consumed by Atlas via Kafka.
> Approach should be similar to that of choosing authorization provider, where 
> the publication strategy can be chosen at runtime via configuration flag(s).
> Scope of this ticket is to move lineage publication to the fe and add 
> appropriate hooks that a user can implement


