[Impala-ASF-CR] IMPALA-3552: Make incremental stats max serialized size configurable
Yonghyun Hwang has posted comments on this change. Change subject: IMPALA-3552: Make incremental stats max serialized size configurable .. Patch Set 5: (1 comment) agreed w/ bharath. Types.thrift is not good place for tbackendconfig. please propose other place. unless, let's live w/ it for now. http://gerrit.cloudera.org:8080/#/c/4867/2/be/src/catalog/catalog.cc File be/src/catalog/catalog.cc: PS2, Line 73: ABORT_IF_ERROR(GetCatalogConfigBytes(jni_env, &cfg_bytes)); : : jobject catalog = jni_env->NewObject(catalog_class_, catalog_ctor_, cfg_bytes); : EXIT_IF_EXC(jni_env); : ABORT_IF_ERROR(JniUtil::LocalToGlobalRef(jni_env, catalog, &catalog_)); : } : : Status Catalog::GetCatalogObject(const TCatalogObject& req, : TCatalogObject* resp) { : return JniUtil::CallJniMethod(catalog_, get_cata > Can you factor all this into a method, GetBackendConfig() or something simi Done -- To view, visit http://gerrit.cloudera.org:8080/4867 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I33684725a61eabc67237503e61178305d37d3cb5 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yonghyun Hwang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Yonghyun Hwang Gerrit-Reviewer: Yonghyun Hwang Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3552: Make incremental stats max serialized size configurable
Yonghyun Hwang has posted comments on this change. Change subject: IMPALA-3552: Make incremental stats max serialized size configurable .. Patch Set 4: (10 comments) http://gerrit.cloudera.org:8080/#/c/4867/4//COMMIT_MSG Commit Message: PS4, Line 21: , set of : gflags, to frontend, JniCatalog and JniFrontend > to Catalog and Frontend. Done PS4, Line 23: entertain > use? Done http://gerrit.cloudera.org:8080/#/c/4867/4/be/src/catalog/catalog.h File be/src/catalog/catalog.h: Line 104: Status GetCatalogConfigBytes(JNIEnv* jni_env, jbyteArray& cfg_bytes) const; > nit: We generally use the return objects to be pointers and pass others by Done http://gerrit.cloudera.org:8080/#/c/4867/4/be/src/service/frontend.h File be/src/service/frontend.h: Line 176: Status GetFrontendConfigBytes(JNIEnv* jni_env, jbyteArray& cfg_bytes) const; > nit: same as the comment in catalog.h Done http://gerrit.cloudera.org:8080/#/c/4867/4/fe/src/main/java/org/apache/impala/common/RuntimeEnv.java File fe/src/main/java/org/apache/impala/common/RuntimeEnv.java: Line 39: private boolean computeLineage_; > Remove. Done PS4, Line 46: computeLineage_ = BackendConfig.INSTANCE.getComputeLineage(); > Remove. Update the callers to use BackendConfig. Done Line 60: public boolean computeLineage() { return computeLineage_; } > Remove. Done http://gerrit.cloudera.org:8080/#/c/4867/4/fe/src/main/java/org/apache/impala/service/BackendConfig.java File fe/src/main/java/org/apache/impala/service/BackendConfig.java: Line 38: private boolean allowAuthToLocalRules_ = false; > Remove. Done Line 45: private long incStatMaxSize_ = 0; > Directly have a reference to TBackendConfig object here. Done PS4, Line 63: public long getIncStatMaxSize() { return incStatMaxSize_; } : public boolean isAuthToLocalEnabled() { return allowAuthToLocalRule > return tBackendConfigObj. directly. No need to have separate variab Done -- To view, visit http://gerrit.cloudera.org:8080/4867 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I33684725a61eabc67237503e61178305d37d3cb5 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yonghyun Hwang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Yonghyun Hwang Gerrit-Reviewer: Yonghyun Hwang Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3552: Make incremental stats max serialized size configurable
Yonghyun Hwang has uploaded a new patch set (#5). Change subject: IMPALA-3552: Make incremental stats max serialized size configurable .. IMPALA-3552: Make incremental stats max serialized size configurable The fix "IMPALA-2648/IMPALA-2664" introduced a conservative limitation on the maximum serialized size of incremental stats. As a side effect, some users with very large tables are experiencing regressions especially when they upgrade impala and the serialized size goes beyond 200MB. To mitigate the issue, the change introduces a new gflag, 'inc_stats_size_limit_bytes' to make the max serialized size configurable, which allows impala users to specify their own maximum serialized size. Default value for inc_stats_size_limit_bytes is 200MB. The change includes TBackendConfig to pass backend configs, set of gflags, to frontend, Catalog and Frontend. This also revamps existing query options to use the TBackendConfig. Change-Id: I33684725a61eabc67237503e61178305d37d3cb5 --- M be/src/catalog/catalog.cc M be/src/catalog/catalog.h M be/src/common/global-flags.cc M be/src/service/fe-support.cc M be/src/service/frontend.cc M be/src/service/frontend.h M common/thrift/Frontend.thrift M common/thrift/Types.thrift M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java M fe/src/main/java/org/apache/impala/authorization/User.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/common/RuntimeEnv.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/FeSupport.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java 19 files changed, 155 insertions(+), 128 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/4867/5 -- To view, visit http://gerrit.cloudera.org:8080/4867 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I33684725a61eabc67237503e61178305d37d3cb5 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yonghyun Hwang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Yonghyun Hwang Gerrit-Reviewer: Yonghyun Hwang
[Impala-ASF-CR] IMPALA-3552: Make incremental stats max serialized size configurable
Yonghyun Hwang has uploaded a new patch set (#4). Change subject: IMPALA-3552: Make incremental stats max serialized size configurable .. IMPALA-3552: Make incremental stats max serialized size configurable The fix "IMPALA-2648/IMPALA-2664" introduced a conservative limitation on the maximum serialized size of incremental stats. As a side effect, some users with very large tables are experiencing regressions especially when they upgrade impala and the serialized size goes beyond 200MB. To mitigate the issue, the change introduces a new gflag, 'inc_stats_size_limit_bytes' to make the max serialized size configurable, which allows impala users to specify their own maximum serialized size. Default value for inc_stats_size_limit_bytes is 200MB. The change includes TBackendConfig to pass backend configs, set of gflags, to frontend, JniCatalog and JniFrontend. This also revamps existing query options to entertain the TBackendConfig. Change-Id: I33684725a61eabc67237503e61178305d37d3cb5 --- M be/src/catalog/catalog.cc M be/src/catalog/catalog.h M be/src/common/global-flags.cc M be/src/service/fe-support.cc M be/src/service/frontend.cc M be/src/service/frontend.h M common/thrift/Frontend.thrift M common/thrift/Types.thrift M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/authorization/User.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/common/RuntimeEnv.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/FeSupport.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java 16 files changed, 163 insertions(+), 120 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/4867/4 -- To view, visit http://gerrit.cloudera.org:8080/4867 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I33684725a61eabc67237503e61178305d37d3cb5 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yonghyun Hwang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Yonghyun Hwang Gerrit-Reviewer: Yonghyun Hwang
[Impala-ASF-CR] IMPALA-3552: Make incremental stats max serialized size configurable
Yonghyun Hwang has posted comments on this change. Change subject: IMPALA-3552: Make incremental stats max serialized size configurable .. Patch Set 2: (16 comments) http://gerrit.cloudera.org:8080/#/c/4867/2//COMMIT_MSG Commit Message: PS2, Line 7: make > nit: Make Done PS2, Line 9: fox > typo Done PS2, Line 11: are experiencing : regressions especially when upgrading. > Please expand a little on what the regression is. Done Line 19: The change also includes TBackendConfig to pass backend config > Please add a note on revamp of existing query options to use the new TBacke Done http://gerrit.cloudera.org:8080/#/c/4867/2/be/src/catalog/catalog.cc File be/src/catalog/catalog.cc: PS2, Line 73: cfg.load_catalog_in_background = FLAGS_load_catalog_in_background; : cfg.num_metadata_loading_threads = FLAGS_num_metadata_loading_threads; : cfg.sentry_config.assign(FLAGS_sentry_config); : // auth_to_local rules are read if --load_auth_to_local_rules is set to true : // and impala is kerberized. : cfg.auth_to_local = FLAGS_load_auth_to_local_rules && !FLAGS_principal.empty(); : cfg.principal.assign(FLAGS_principal); : cfg.impala_log_lvl = FlagToTLogLevel(FLAGS_v); : cfg.other_log_lvl = FlagToTLogLevel(FLAGS_non_impala_java_vlog); : cfg.inc_stat_max_size = FLAGS_inc_stat_max_size; > Can you factor all this into a method, GetBackendConfig() or something simi Done http://gerrit.cloudera.org:8080/#/c/4867/2/be/src/common/global-flags.cc File be/src/common/global-flags.cc: Line 119: DEFINE_uint64(inc_stat_max_size, 200 * (1LL<<20), "Maximum size (in bytes) of incremental " > long line. Done PS2, Line 119: inc_stat_max_size > Rename to incremental_stats_size_limit_bytes or something? That way its mor Done http://gerrit.cloudera.org:8080/#/c/4867/2/be/src/service/frontend.cc File be/src/service/frontend.cc: PS2, Line 104: TBackendConfig cfg; : cfg.authorization_policy_file.assign(FLAGS_authorization_policy_file); : cfg.server_name.assign(FLAGS_server_name); : cfg.sentry_config.assign(FLAGS_sentry_config); : // auth_to_local rules are read if --load_auth_to_local_rules is set to true : // and impala is kerberized. : cfg.auth_to_local = FLAGS_load_auth_to_local_rules && !FLAGS_principal.empty(); : cfg.authorization_policy_provider_class.assign(FLAGS_authorization_policy_provider_class); : cfg.kudu_master_hosts.assign(FLAGS_kudu_master_hosts); : cfg.impala_log_lvl = FlagToTLogLevel(FLAGS_v); : cfg.other_log_lvl = FlagToTLogLevel(FLAGS_non_impala_java_vlog); : cfg.inc_stat_max_size = FLAGS_inc_stat_max_size; : : jbyteArray cfg_bytes; : JniLocalFrame jni_frame; : ABORT_IF_ERROR(jni_frame.push(jni_env)); : ABORT_IF_ERROR(SerializeThriftMsg(jni_env, &cfg, &cfg_bytes)); > Same as catalog.cc, please factor this into a method. Done http://gerrit.cloudera.org:8080/#/c/4867/2/common/thrift/Types.thrift File common/thrift/Types.thrift: Line 233: struct TBackendConfig { > Please document the usage of this thrift struct and how it is shared betwee Done PS2, Line 236: required > Some of these are only used by Catalog and not fe (and vice versa). Shouldn Done Line 245: 12: required string kudu_master_hosts > While you are at this, can you merge TStartupOptions (Frontend.thrift) with Let's do this as a separate commit http://gerrit.cloudera.org:8080/#/c/4867/2/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java File fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java: PS2, Line 288: incStatMaxSize in bytes > --incremental_stats_size_limit_bytes Done http://gerrit.cloudera.org:8080/#/c/4867/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: PS2, Line 1591: MAX_INCREMENTAL_STATS_SIZE_BYTES > update Done http://gerrit.cloudera.org:8080/#/c/4867/2/fe/src/main/java/org/apache/impala/service/BackendConfig.java File fe/src/main/java/org/apache/impala/service/BackendConfig.java: PS2, Line 33: : // This is overriden by JniFrontend/JniCatalog classes with user set configuration. : // TODO: Read this from backend instead of using static variables. : private static boolean allowAuthToLocalRules_ = false; : : // Maximum size (in bytes) of incremental stats the catalog is : // allowed to serialize per table. This limit is set as a safety : // check, to prevent the JVM from hitting a maximum array limit of : // 1GB (or OOM) while building the thrift objects
[Impala-ASF-CR] IMPALA-3552: Make incremental stats max serialized size configurable
Yonghyun Hwang has uploaded a new patch set (#3). Change subject: IMPALA-3552: Make incremental stats max serialized size configurable .. IMPALA-3552: Make incremental stats max serialized size configurable The fix "IMPALA-2648/IMPALA-2664" introduced a conservative limitation on the maximum serialized size of incremental stats. As a side effect, some users with very large tables are experiencing regressions especially when they upgrade impala and the serialized size goes beyond 200MB. To mitigate the issue, the change introduces a new gflag, 'inc_stats_size_limit_bytes' to make the max serialized size configurable, which allows impala users to specify their own maximum serialized size. Default value for inc_stats_size_limit_bytes is 200MB. The change includes TBackendConfig to pass backend configs, set of gflags, to frontend, JniCatalog and JniFrontend. This also revamps existing query options to entertain the TBackendConfig. Change-Id: I33684725a61eabc67237503e61178305d37d3cb5 --- M be/src/catalog/catalog.cc M be/src/catalog/catalog.h M be/src/common/global-flags.cc M be/src/service/frontend.cc M be/src/service/frontend.h M common/thrift/Types.thrift M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java 11 files changed, 144 insertions(+), 64 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/4867/3 -- To view, visit http://gerrit.cloudera.org:8080/4867 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I33684725a61eabc67237503e61178305d37d3cb5 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yonghyun Hwang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Yonghyun Hwang Gerrit-Reviewer: Yonghyun Hwang
[Impala-ASF-CR] IMPALA-3552: make incremental stats max serialized size configurable
Yonghyun Hwang has uploaded a new patch set (#2). Change subject: IMPALA-3552: make incremental stats max serialized size configurable .. IMPALA-3552: make incremental stats max serialized size configurable The fix fox "IMPALA-2648/IMPALA-2664" introduced a conservative limitation on the maximum serialized size of incremental stats. As a side effect, some users with very large tables are experiencing regressions especially when upgrading. To mitigate the issue, the change introduces a new gflag, 'inc_stat_max_size' to make the max serialized size configurable, which allows impala users to specify their own maximum serialized size. Default value for inc_stat_max_size is 200MB. The change also includes TBackendConfig to pass backend config to frontend. Change-Id: I33684725a61eabc67237503e61178305d37d3cb5 --- M be/src/catalog/catalog.cc M be/src/common/global-flags.cc M be/src/service/frontend.cc M common/thrift/Types.thrift M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java 9 files changed, 109 insertions(+), 59 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/4867/2 -- To view, visit http://gerrit.cloudera.org:8080/4867 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I33684725a61eabc67237503e61178305d37d3cb5 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yonghyun Hwang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Yonghyun Hwang Gerrit-Reviewer: Yonghyun Hwang
[Impala-ASF-CR] IMPALA-3552: make incremental stats max serialized size configurable
Yonghyun Hwang has posted comments on this change. Change subject: IMPALA-3552: make incremental stats max serialized size configurable .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/4867/1/be/src/catalog/catalog.cc File be/src/catalog/catalog.cc: PS1, Line 78: catalog_class_, catalog_ctor_, : load_in_background, num_metadata_loading_threads, sentry_config, : FlagToTLogLevel(FLAGS_v), FlagToTLogLevel(FLAGS_non_impala_java_vlog), : auth_to_local, principal > No for both. Then, let's include it here. :) http://gerrit.cloudera.org:8080/#/c/4867/1/fe/src/main/java/org/apache/impala/service/JniCatalog.java File fe/src/main/java/org/apache/impala/service/JniCatalog.java: PS1, Line 83: boolean loadInBackground, int numMetadataLoadingThreads, : String sentryServiceConfig, int impalaLogLevel, int otherLogLevel, : boolean allowAuthToLocal, String kerberosPrincipal > Its a good ramp-up task and I think we should do it here. This clean-up has Agreed. However, I think current fix contains enough changes. I will create a new jira ticket and take care of the clean up right after this commit. I want to minimize the scope of changes for one commit. -- To view, visit http://gerrit.cloudera.org:8080/4867 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I33684725a61eabc67237503e61178305d37d3cb5 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yonghyun Hwang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Yonghyun Hwang Gerrit-Reviewer: Yonghyun Hwang Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3552: make incremental stats max serialized size configurable
Yonghyun Hwang has posted comments on this change. Change subject: IMPALA-3552: make incremental stats max serialized size configurable .. Patch Set 1: (4 comments) http://gerrit.cloudera.org:8080/#/c/4867/1/be/src/catalog/catalog.cc File be/src/catalog/catalog.cc: PS1, Line 78: catalog_class_, catalog_ctor_, : load_in_background, num_metadata_loading_threads, sentry_config, : FlagToTLogLevel(FLAGS_v), FlagToTLogLevel(FLAGS_non_impala_java_vlog), : auth_to_local, principal > Since most of the code is shared between Catalog and Frontend, I think its Passing a TBackendConfig to Catalog::Catalog is a good idea. However, two questions. 1) are there any cases where two catalogs get instantiated? 2) do we want to change config even after catalog gets instantiated? http://gerrit.cloudera.org:8080/#/c/4867/1/common/thrift/Types.thrift File common/thrift/Types.thrift: PS1, Line 233: TBackendConfig > I think it was named this way to mean the "configs passed to the Backend". I also think this would not be the best place. please recommend better one. unless, let's keep this here for now. For the name, I think 'TBackendConfig' is good for now, because we use BE/FE and BE takes a config options. http://gerrit.cloudera.org:8080/#/c/4867/1/fe/src/main/java/org/apache/impala/service/JniCatalog.java File fe/src/main/java/org/apache/impala/service/JniCatalog.java: PS1, Line 83: boolean loadInBackground, int numMetadataLoadingThreads, : String sentryServiceConfig, int impalaLogLevel, int otherLogLevel, : boolean allowAuthToLocal, String kerberosPrincipal > How about moving all these into TBackendConfig? Agreed. However, Let's do that in a separate commit. http://gerrit.cloudera.org:8080/#/c/4867/1/fe/src/main/java/org/apache/impala/service/JniFrontend.java File fe/src/main/java/org/apache/impala/service/JniFrontend.java: PS1, Line 120: boolean lazy, String serverName, String authorizationPolicyFile, : String sentryConfigFile, String authPolicyProviderClass, int impalaLogLevel, : int otherLogLevel, boolean allowAuthToLocal, String defaultKuduMasterHosts > Same, how about we make TBackendConfig more generic and refactor all the co Sure. Let's do that in a separate commit. -- To view, visit http://gerrit.cloudera.org:8080/4867 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I33684725a61eabc67237503e61178305d37d3cb5 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yonghyun Hwang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Yonghyun Hwang Gerrit-Reviewer: Yonghyun Hwang Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3552: make incremental stats max serialized size configurable
Yonghyun Hwang has posted comments on this change. Change subject: IMPALA-3552: make incremental stats max serialized size configurable .. Patch Set 1: I still have two questions on the fix. 1. currently, tbackendconfig is introduced in types.thrift. is there a better place? (i.e., create a new file solely for this?) 2. tbackendconfig is passed in Catalog::Catalog and ExecEnv::ExecEnv. is it a good place? Please let me have feedbacks on those two. -- To view, visit http://gerrit.cloudera.org:8080/4867 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I33684725a61eabc67237503e61178305d37d3cb5 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yonghyun Hwang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Yonghyun Hwang Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3552: make incremental stats max serialized size configurable
Yonghyun Hwang has abandoned this change. Change subject: IMPALA-3552: make incremental stats max serialized size configurable .. Abandoned after the 1st review, major change was made for the fix. hence, a new review started for this issue. -- To view, visit http://gerrit.cloudera.org:8080/4772 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: abandon Gerrit-Change-Id: I4d189ffc1c56e7a4e27e6233e9119250a2395a19 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yonghyun Hwang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Yonghyun Hwang
[Impala-ASF-CR] IMPALA-3552: make incremental stats max serialized size configurable
Yonghyun Hwang has uploaded a new change for review. http://gerrit.cloudera.org:8080/4867 Change subject: IMPALA-3552: make incremental stats max serialized size configurable .. IMPALA-3552: make incremental stats max serialized size configurable The fix fox "IMPALA-2648/IMPALA-2664" introduced a conservative limitation on the maximum serialized size of incremental stats. As a side effect, some users with very large tables are experiencing regressions especially when upgrading. To mitigate the issue, the change introduces a new gflag, 'inc_stat_max_size' to make the max serialized size configurable, which allows impala users to specify their own maximum serialized size. Default value for inc_stat_max_size is 200MB. The change also includes TBackendConfig to pass backend config to frontend. Change-Id: I33684725a61eabc67237503e61178305d37d3cb5 --- M be/src/catalog/catalog-server.cc M be/src/catalog/catalog.cc M be/src/catalog/catalog.h M be/src/common/global-flags.cc M be/src/runtime/exec-env.cc M be/src/service/frontend.cc M be/src/service/frontend.h M common/thrift/Types.thrift M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java 13 files changed, 77 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/4867/1 -- To view, visit http://gerrit.cloudera.org:8080/4867 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I33684725a61eabc67237503e61178305d37d3cb5 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yonghyun Hwang
[Impala-ASF-CR] IMPALA-3552: make incremental stats max serialized size configurable
Yonghyun Hwang has posted comments on this change. Change subject: IMPALA-3552: make incremental stats max serialized size configurable .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/4772/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: PS1, Line 169: MAX_INCREMENTAL_STATS_SIZE_BYTES > I think he can try to use runtimeEnv as well. or have this in catalog not i To my best understanding, MAX_INC_*_BYTES is used in both of impalad and catalogd. So, RuntimeEnv solely could not be used. (libfesupport.so would complain during runtime.) Initially, I also thought BackendConfig is the best option for this because we can remove MAX_INC_*_BYTES from HdfsTable and consolidate BE options in one place. :) Alex/Huaisi/Bharath, if we decide to use BackendConfig, I will abandon this change and re-do the work, which would take few days. Please let me have your inputs. -- To view, visit http://gerrit.cloudera.org:8080/4772 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4d189ffc1c56e7a4e27e6233e9119250a2395a19 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yonghyun Hwang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Yonghyun Hwang Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3552: make incremental stats max serialized size configurable
Yonghyun Hwang has posted comments on this change. Change subject: IMPALA-3552: make incremental stats max serialized size configurable .. Patch Set 1: (7 comments) http://gerrit.cloudera.org:8080/#/c/4772/1/be/src/catalog/catalog.cc File be/src/catalog/catalog.cc: PS1, Line 33: inc_stat_max_size > could you be explicit on the unit? thanks. I will add the unit in option description. (in DEFINE_uint64(...) http://gerrit.cloudera.org:8080/#/c/4772/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: Line 164: SentryConfig sentryConfig, TUniqueId catalogServiceId, String kerberosPrincipal, long incStatMaxSize) { > long line I will take care of this. http://gerrit.cloudera.org:8080/#/c/4772/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: PS1, Line 176: 0 > sorry I meant should not be a public static.. Actually, HdfsTable.MAX_INCREMENTAL_STATS_SIZE_BYTES is set in CatalogServiceCatalog. Let me make this as a private var and create a method to set it. http://gerrit.cloudera.org:8080/#/c/4772/1/fe/src/main/java/org/apache/impala/common/RuntimeEnv.java File fe/src/main/java/org/apache/impala/common/RuntimeEnv.java: PS1, Line 71: incStatMaxSize > getIncStatMaxSize Sure. :) Do you recommend me to update computeLineage() as well? http://gerrit.cloudera.org:8080/#/c/4772/1/fe/src/main/java/org/apache/impala/service/JniCatalog.java File fe/src/main/java/org/apache/impala/service/JniCatalog.java: Line 84: boolean allowAuthToLocal, String kerberosPrincipal, long incStatMaxSize) throws InternalException { > long line will take care of it. Line 101: numMetadataLoadingThreads, sentryConfig, getServiceId(), kerberosPrincipal, incStatMaxSize); > long will take care of it. http://gerrit.cloudera.org:8080/#/c/4772/1/fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java File fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java: Line 34: super(loadInBackground, numLoadingThreads, sentryConfig, catalogServiceId, null, 200*1024*1024); > long will take care of it. thanks. :) -- To view, visit http://gerrit.cloudera.org:8080/4772 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4d189ffc1c56e7a4e27e6233e9119250a2395a19 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yonghyun Hwang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Yonghyun Hwang Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3552: make incremental stats max serialized size configurable
Yonghyun Hwang has posted comments on this change. Change subject: IMPALA-3552: make incremental stats max serialized size configurable .. Patch Set 1: local tests as well as jenkins are passing. http://sandbox.jenkins.cloudera.com/job/impala-private-build-and-test/4447/ -- To view, visit http://gerrit.cloudera.org:8080/4772 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4d189ffc1c56e7a4e27e6233e9119250a2395a19 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yonghyun Hwang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Yonghyun Hwang Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3552: make incremental stats max serialized size configurable
Yonghyun Hwang has uploaded a new change for review. http://gerrit.cloudera.org:8080/4772 Change subject: IMPALA-3552: make incremental stats max serialized size configurable .. IMPALA-3552: make incremental stats max serialized size configurable The fix fox "IMPALA-2648/IMPALA-2664" introduced a conservative limitation on the maximum serialized size of incremental stats. As a side effect, some users with very large tables are experiencing regressions especially when upgrading. To mitigate the issue, the change introduces a new gflag, 'inc_stat_max_size' to make the max serialized size configurable, which allows impala users to specify their own maximum serialized size. Default value for inc_stat_max_size is 200MB. Change-Id: I4d189ffc1c56e7a4e27e6233e9119250a2395a19 --- M be/src/catalog/catalog.cc M be/src/common/global-flags.cc M be/src/service/fe-support.cc M common/thrift/Frontend.thrift M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/common/RuntimeEnv.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java 10 files changed, 33 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/4772/1 -- To view, visit http://gerrit.cloudera.org:8080/4772 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I4d189ffc1c56e7a4e27e6233e9119250a2395a19 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yonghyun Hwang
[Impala-ASF-CR] IMPALA-3342: Adding new timer to accurately measure the TotalCpuTime
Yonghyun Hwang has posted comments on this change. Change subject: IMPALA-3342: Adding new timer to accurately measure the TotalCpuTime .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/4633/1/be/src/util/stopwatch.h File be/src/util/stopwatch.h: PS1, Line 237: int64_t utime_diff = (end.ru_stime.tv_sec - start_.ru_stime.tv_sec) * 1000L * 1000L * 1000L : + (end.ru_stime.tv_usec - start_.ru_stime.tv_usec); just 2 simple questions. 1. are we considering "system CPU time" only (excluding "user CPU time")? 2. by any chance, (someone resets resource usage?) can utime_diff be negative? -- To view, visit http://gerrit.cloudera.org:8080/4633 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifa88aa6f3371fa42d11ecc122f43c7d83623c300 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: anujphadke Gerrit-Reviewer: Yonghyun Hwang Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4042: Preserve root types when substituting grouping exprs
Yonghyun Hwang has posted comments on this change. Change subject: IMPALA-4042: Preserve root types when substituting grouping exprs .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/4600/3//COMMIT_MSG Commit Message: Line 7: IMPALA-4042: count(distinct NULL) fails on a view > This should describe the fix, not the problem. In particular, it should not Done http://gerrit.cloudera.org:8080/#/c/4600/3/fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java File fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java: Line 339: // preserveRootType is set to true because groupingExprs_ would > shorter: Done http://gerrit.cloudera.org:8080/#/c/4600/3/testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test File testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test: Line 1066: # IMPALA-4042: count(distinct NULL) fails on a view > move into distinct.test Done -- To view, visit http://gerrit.cloudera.org:8080/4600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Icf3b4511234e473e5b9548fbf3e97f333c9980f1 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yonghyun Hwang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Yonghyun Hwang Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4042: Preserve root types when substituting grouping exprs
Yonghyun Hwang has uploaded a new patch set (#4). Change subject: IMPALA-4042: Preserve root types when substituting grouping exprs .. IMPALA-4042: Preserve root types when substituting grouping exprs In case of count(distinct), FunctionCallExpr.analyze() changes type for "NULL" into "BOOLEAN" to make sure that BE doesn't see any "NULL_TYPE" exprs. In the meantime, Expr substitution, happening in Expr.substituteImpl() reverts this change back to original type, "NULL_TYPE". This causes an issue when AggregateInfo.checkConsistency() performs precondition check where slot types from AggregateInfo.outputTupleDesc_ should be matched with the types from AggregateInfo.groupingExpr_. The slot type shows "BOOLEAN" while type from groupingExpr_ is "NULL_TYPE", which makes the precondition fail and throws an exception. To resolve the issue, preserveRootType is set to true when Expr.substituteList() gets called in AggregateInfo.substitute() Change-Id: Icf3b4511234e473e5b9548fbf3e97f333c9980f1 (cherry picked from commit b17785b4890bedd1c825140ce3c48cd7d9734295) --- M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java M testdata/workloads/functional-planner/queries/PlannerTest/distinct.test 2 files changed, 25 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/4600/4 -- To view, visit http://gerrit.cloudera.org:8080/4600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Icf3b4511234e473e5b9548fbf3e97f333c9980f1 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yonghyun Hwang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Yonghyun Hwang
[Impala-ASF-CR] IMPALA-4042: count(distinct NULL) fails on a view
Yonghyun Hwang has posted comments on this change. Change subject: IMPALA-4042: count(distinct NULL) fails on a view .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/4600/2/fe/src/main/java/org/apache/impala/analysis/Expr.java File fe/src/main/java/org/apache/impala/analysis/Expr.java: Line 734: if (smap == null || this instanceof NullLiteral) return result; > Agreed w/ the design for Expr. I am updating the fix to use preserveRootTyp Done -- To view, visit http://gerrit.cloudera.org:8080/4600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Icf3b4511234e473e5b9548fbf3e97f333c9980f1 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yonghyun Hwang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Yonghyun Hwang Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4042: count(distinct NULL) fails on a view
Yonghyun Hwang has uploaded a new patch set (#3). Change subject: IMPALA-4042: count(distinct NULL) fails on a view .. IMPALA-4042: count(distinct NULL) fails on a view In case of count(distinct), FunctionCallExpr.analyze() changes type for "NULL" into "BOOLEAN" to make sure that BE doesn't see any "NULL_TYPE" exprs. In the meantime, Expr substitution, happening in Expr.substituteImpl() reverts this change back to original type, "NULL_TYPE". This causes an issue when AggregateInfo.checkConsistency() performs precondition check where slot types from AggregateInfo.outputTupleDesc_ should be matched with the types from AggregateInfo.groupingExpr_. The slot type shows "BOOLEAN" while type from groupingExpr_ is "NULL_TYPE", which makes the precondition fail and throws an exception. To resolve the issue, preserveRootType is set to true when Expr.substituteList() gets called in AggregateInfo.substitute() Change-Id: Icf3b4511234e473e5b9548fbf3e97f333c9980f1 (cherry picked from commit b17785b4890bedd1c825140ce3c48cd7d9734295) --- M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test 2 files changed, 27 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/4600/3 -- To view, visit http://gerrit.cloudera.org:8080/4600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Icf3b4511234e473e5b9548fbf3e97f333c9980f1 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yonghyun Hwang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Yonghyun Hwang
[Impala-ASF-CR] IMPALA-4042: count(distinct NULL) fails on a view
Yonghyun Hwang has uploaded a new patch set (#2). Change subject: IMPALA-4042: count(distinct NULL) fails on a view .. IMPALA-4042: count(distinct NULL) fails on a view In case of count(distinct), FunctionCallExpr.analyze() changes type for "NULL" into "BOOLEAN" to make sure that BE doesn't see any "NULL_TYPE" exprs. In the meantime, Expr substitution, happening in Expr.substituteImpl() reverts this change back to original type, "NULL_TYPE". This causes an issue when AggregateInfo.checkConsistency() performs precondition check where slot types from AggregateInfo.outputTupleDesc_ should be matched with the types from AggregateInfo.groupingExpr_. The slot type shows "BOOLEAN" while type from groupingExpr_ is "NULL_TYPE", which makes the precondition fail and throws an exception. To resolve the issue, Expr.trySubstitute() is updated not to substitue NullLiteral. Change-Id: Icf3b4511234e473e5b9548fbf3e97f333c9980f1 --- M fe/src/main/java/org/apache/impala/analysis/Expr.java M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test 2 files changed, 23 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/4600/2 -- To view, visit http://gerrit.cloudera.org:8080/4600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Icf3b4511234e473e5b9548fbf3e97f333c9980f1 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yonghyun Hwang
[Impala-ASF-CR] IMPALA-4042: count(distinct NULL) fails on a view
Yonghyun Hwang has uploaded a new change for review. http://gerrit.cloudera.org:8080/4600 Change subject: IMPALA-4042: count(distinct NULL) fails on a view .. IMPALA-4042: count(distinct NULL) fails on a view In case of count(distinct), FunctionCallExpr.analyze() changes type for "NULL" into "BOOLEAN" to make sure that BE doesn't see any "NULL_TYPE" exprs. In the meantime, Expr substitution, happening in Expr.substituteImpl() reverts this change back to original type, "NULL_TYPE". This causes an issue when AggregateInfo.checkConsistency() performs precondition check where slot types from AggregateInfo.outputTupleDesc_ should be matched with the types from AggregateInfo.groupingExpr_. The slot type shows "BOOLEAN" while type from groupingExpr_ is "NULL_TYPE", which makes the precondition fail and throws an exception. To resolve the issue, Expr.trySubstitute() is updated not to substitue NullLiteral. Change-Id: Icf3b4511234e473e5b9548fbf3e97f333c9980f1 --- M fe/src/main/java/com/cloudera/impala/analysis/Expr.java M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test 2 files changed, 23 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/4600/1 -- To view, visit http://gerrit.cloudera.org:8080/4600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Icf3b4511234e473e5b9548fbf3e97f333c9980f1 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yonghyun Hwang