IMPALA-2021: S3: Flaky tests: impala-s3 job sometimes encounters I/O error 255
Through empirical analysis, it was determined that setting the maximum
number of connections to S3 to 1500 was optimal for functionality and
performance. The Hadoop default of 15 connections could lead to
deadlocks, because our Parquet scanner requires multiple concurrent open
connections, proportional to the number of columns being scanned.
Setting it to this high a value does not seem to have any negative
implications.

This has also been found to fix the Error(255): Unknown errors.

Change-Id: Ide6f1326d5155b2e5f4da3a3f23df3f3d40c5a8d
Reviewed-on: http://gerrit.cloudera.org:8080/3114
Reviewed-by: Sailesh Mukil <[email protected]>
Tested-by: Internal Jenkins

Project: http://git-wip-us.apache.org/repos/asf/incubator-impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-impala/commit/d2c3c871
Tree: http://git-wip-us.apache.org/repos/asf/incubator-impala/tree/d2c3c871
Diff: http://git-wip-us.apache.org/repos/asf/incubator-impala/diff/d2c3c871

Branch: refs/heads/master
Commit: d2c3c8711b7fc6783044572e1f7259b1bc769dd4
Parents: f7501d2
Author: Sailesh Mukil <[email protected]>
Authored: Thu May 5 11:42:19 2016 -0700
Committer: Tim Armstrong <[email protected]>
Committed: Mon May 23 08:40:19 2016 -0700

----------------------------------------------------------------------
 .../node_templates/common/etc/hadoop/conf/core-site.xml.tmpl | 5 +++++
 1 file changed, 5 insertions(+)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/d2c3c871/testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl
----------------------------------------------------------------------
diff --git a/testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl b/testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl
index ab86878..2d5cb09 100644
--- a/testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl
+++ b/testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl
@@ -78,6 +78,11 @@ DEFAULT</value>
     <value>${AWS_SECRET_ACCESS_KEY}</value>
   </property>
 
+  <property>
+    <name>fs.s3a.connection.maximum</name>
+    <value>1500</value>
+  </property>
+
   <!-- Location of the KMS key provider -->
   <property>
     <name>hadoop.security.key.provider.path</name>
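
To confirm that a rendered core-site.xml actually carries the new setting, a check along the following lines may help. This is a minimal sketch and not part of the commit: the helper names `read_hadoop_property` and `s3a_pool_size` and the `SAMPLE_CORE_SITE` string are hypothetical, the sample stands in for a file already rendered from the template, and the fallback of 15 is the Hadoop default cited in the commit message above.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a core-site.xml rendered from the template.
SAMPLE_CORE_SITE = """<configuration>
  <property>
    <name>fs.s3a.connection.maximum</name>
    <value>1500</value>
  </property>
</configuration>"""


def read_hadoop_property(xml_text, name):
    """Return the <value> text of the first <property> with this <name>, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


def s3a_pool_size(xml_text, hadoop_default=15):
    """Return the configured S3A connection limit.

    Falls back to hadoop_default (15, per the commit message) when the
    property is not set in the given XML.
    """
    raw = read_hadoop_property(xml_text, "fs.s3a.connection.maximum")
    return int(raw) if raw is not None else hadoop_default


if __name__ == "__main__":
    print(s3a_pool_size(SAMPLE_CORE_SITE))
```

Running the same check against a configuration without the property returns the default of 15, which is the value the commit message associates with the scanner deadlocks.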
