[jira] [Commented] (HADOOP-15229) Add FileSystem builder-based openFile() API to match createFile() + S3 Select
[ https://issues.apache.org/jira/browse/HADOOP-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16734767#comment-16734767 ] Sameer Choudhary commented on HADOOP-15229: --- [~ste...@apache.org] {quote}OK. here's a question: what is logged in the AWS S3 logs on a request, and does it include any of the SQL statement? I ask as in the security bit of the docs I've added the words "GDPR" alongside security, and how if the statements include PII then they'd better not be printed. {quote} Personally Identifiable Information (PII) is not printed in the logs. The logs contain only IonSql keywords from the query planner; all column names and literals are masked. Here is a sample: {quote}*Query:* select * from S3Object s; *Log:* select (project (list (project_all))) (from (as str0 (id str1 case_insensitive))) {quote} > Add FileSystem builder-based openFile() API to match createFile() + S3 Select > - > > Key: HADOOP-15229 > URL: https://issues.apache.org/jira/browse/HADOOP-15229 > Project: Hadoop Common > Issue Type: New Feature > Components: fs, fs/azure, fs/s3 >Affects Versions: 3.0.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Major > Attachments: HADOOP-15229-001.patch, HADOOP-15229-002.patch, > HADOOP-15229-003.patch, HADOOP-15229-004.patch, HADOOP-15229-004.patch, > HADOOP-15229-005.patch, HADOOP-15229-006.patch, HADOOP-15229-007.patch, > HADOOP-15229-009.patch, HADOOP-15229-010.patch, HADOOP-15229-011.patch, > HADOOP-15229-012.patch, HADOOP-15229-013.patch, HADOOP-15229-014.patch, > HADOOP-15229-015.patch, HADOOP-15229-016.patch > > > Replicate HDFS-1170 and HADOOP-14365 with an API to open files. > A key requirement of this is not HDFS, it's to put in the fadvise policy for > working with object stores, where getting the decision to do a full GET and > TCP abort on seek vs smaller GETs is fundamentally different: the wrong > option can cost you minutes. 
S3A and Azure both have adaptive policies now > (first backward seek), but they still don't do it that well. > Columnar formats (ORC, Parquet) should be able to say "fs.input.fadvise" > "random" as an option when they open files; I can imagine other options too. > The Builder model of [~eddyxu] is the one to mimic, method for method. > Ideally with as much code reuse as possible -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
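To make the shape of the proposal concrete, here is a minimal, framework-free sketch of a builder-based open call that collects per-open options such as the fadvise policy. All class and method names here are illustrative stand-ins, not the actual Hadoop API under discussion; the real builder would return a stream (or a future of one) rather than the raw option map.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch of a builder-based openFile() call chain,
 *  mirroring the createFile() builder style. Hypothetical names only. */
public class OpenFileBuilderSketch {

    /** Collects string options for a single open of one path. */
    public static class OpenFileBuilder {
        private final String path;
        private final Map<String, String> options = new HashMap<>();

        public OpenFileBuilder(String path) {
            this.path = path;
        }

        /** opt(): a hint the filesystem may ignore; a real API would
         *  typically also offer must() for options it may not ignore. */
        public OpenFileBuilder opt(String key, String value) {
            options.put(key, value);
            return this;
        }

        /** In the real API this would open the stream; here we just
         *  expose the collected options for inspection. */
        public Map<String, String> build() {
            return options;
        }
    }

    public static void main(String[] args) {
        // A columnar reader (ORC/Parquet) asking for random-IO behaviour.
        Map<String, String> opts =
                new OpenFileBuilder("s3a://bucket/data.parquet")
                        .opt("fs.input.fadvise", "random")
                        .build();
        System.out.println(opts);
    }
}
```

The point of the chained opt() calls is that each format can state its access pattern at open time, instead of the store guessing from seek behaviour.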
[GitHub] lqjack opened a new pull request #456: remove the task attempt id from earlier failed map
lqjack opened a new pull request #456: remove the task attempt id from earlier failed map URL: https://github.com/apache/hadoop/pull/456 This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (HADOOP-15229) Add FileSystem builder-based openFile() API to match createFile() + S3 Select
[ https://issues.apache.org/jira/browse/HADOOP-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16734734#comment-16734734 ] Hadoop QA commented on HADOOP-15229:

(x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 20s | Docker mode activated. |
|| Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 24 new or modified test files. |
|| trunk Compile Tests ||
| 0 | mvndep | 0m 21s | Maven dependency ordering for branch |
| +1 | mvninstall | 20m 10s | trunk passed |
| +1 | compile | 15m 39s | trunk passed |
| +1 | checkstyle | 3m 44s | trunk passed |
| +1 | mvnsite | 5m 22s | trunk passed |
| +1 | shadedclient | 22m 17s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 8m 21s | trunk passed |
| +1 | javadoc | 3m 51s | trunk passed |
|| Patch Compile Tests ||
| 0 | mvndep | 0m 18s | Maven dependency ordering for patch |
| +1 | mvninstall | 4m 7s | the patch passed |
| +1 | compile | 14m 54s | the patch passed |
| +1 | javac | 14m 54s | root generated 0 new + 1488 unchanged - 2 fixed = 1488 total (was 1490) |
| -0 | checkstyle | 3m 44s | root: The patch generated 28 new + 1097 unchanged - 3 fixed = 1125 total (was 1100) |
| +1 | mvnsite | 5m 17s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 173 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| -1 | whitespace | 0m 5s | The patch has 1 line(s) with tabs. |
| +1 | xml | 0m 3s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 12m 21s | patch has no errors when building and testing our client artifacts. |
| -1 | findbugs | 0m 59s | hadoop-tools/hadoop-aws generated 3 new + 0 unchanged - 0 fixed = 3 total (was 0) |
| +1 | javadoc | 3m 53s | the patch passed |
|| Other Tests ||
| -1 | unit | 8m 21s | hadoop-common in the patch failed. |
| +1 | unit | 1m 55s | hadoop-hdfs-client in the patch passed. |
| -1 | unit | 98m 57s | hadoop-hdfs in the patch failed. |
| +1 | unit | 4m 49s | hadoop-mapreduce-client-core in the patch passed. |
| -1 | unit | 6m 50s | hadoop-streaming in the patch failed. |
| +1 | unit | 4m 56s | hadoop-aws in the patch passed. |
[jira] [Updated] (HADOOP-15996) Plugin interface to support more complex usernames in Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-15996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated HADOOP-15996: --- Release Note: This patch enables "Hadoop" and "MIT" as options for "hadoop.security.auth_to_local.mechanism" and defaults to 'hadoop'. This should be backward compatible with pre-HADOOP-12751. This is basically HADOOP-12751 plus configurable + extended tests. was: This patch enables "Hadoop" and "MIT" as options for "hadoop.security.auth_to_local.mechanism" and defaults to 'legacy'. This should be backward compatible with pre-HADOOP-12751. This is basically HADOOP-12751 plus configurable + extended tests. > Plugin interface to support more complex usernames in Hadoop > > > Key: HADOOP-15996 > URL: https://issues.apache.org/jira/browse/HADOOP-15996 > Project: Hadoop Common > Issue Type: New Feature > Components: security >Reporter: Eric Yang >Assignee: Bolke de Bruin >Priority: Major > Fix For: 3.2.0, 3.3.0, 3.1.2 > > Attachments: 0001-HADOOP-15996-Make-auth-to-local-configurable.patch, > 0001-Make-allowing-or-configurable.patch, > 0001-Simple-trial-of-using-krb5.conf-for-auth_to_local-ru.patch, > 0002-HADOOP-15996-Make-auth-to-local-configurable.patch, > 0003-HADOOP-15996-Make-auth-to-local-configurable.patch, > 0004-HADOOP-15996-Make-auth-to-local-configurable.patch, > 0005-HADOOP-15996-Make-auth-to-local-configurable.patch, > HADOOP-15996.0005.patch, HADOOP-15996.0006.patch, HADOOP-15996.0007.patch, > HADOOP-15996.0008.patch, HADOOP-15996.0009.patch, HADOOP-15996.0010.patch, > HADOOP-15996.0011.patch, HADOOP-15996.0012.patch > > > Hadoop does not allow the @ character in usernames, following a recent security mailing-list vote to revert HADOOP-12751. A Hadoop auth_to_local rule must match in order to authorize a user to log in to the Hadoop cluster. This design does not work well in a multi-realm environment, where identical usernames in two realms do not map to the same user. 
There is also the possibility that a lossy regex can incorrectly map users. In the interest of supporting multiple realms, it may be preferred to pass the principal name through without rewriting it, so that users remain uniquely distinguishable. This jira is to revisit whether Hadoop can support full principal names without rewrite and provide a plugin to override Hadoop's default implementation of auth_to_local for the multi-realm use case.
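The collision the description warns about can be shown with a toy mapping. The code below is a deliberately simplified stand-in for Hadoop's KerberosName rule engine, not the real implementation: a realm-stripping rule maps two distinct principals to the same short name, while a pass-through (full-principal) mode keeps them apart.

```java
/** Why a lossy auth_to_local rule is risky in multi-realm setups:
 *  stripping the realm maps distinct principals to the same short name.
 *  Simplified illustration only; not Hadoop's KerberosName engine. */
public class AuthToLocalSketch {

    /** Shortening rule: drop everything from '@' onward. */
    public static String shorten(String principal) {
        int at = principal.indexOf('@');
        return at < 0 ? principal : principal.substring(0, at);
    }

    /** Full-principal mode: pass the name through unchanged,
     *  so users from different realms stay distinguishable. */
    public static String passThrough(String principal) {
        return principal;
    }

    public static void main(String[] args) {
        String a = "alice@CORP.EXAMPLE.COM";
        String b = "alice@PARTNER.EXAMPLE.COM";
        // Two different users collide under the lossy rule...
        System.out.println(shorten(a) + " == " + shorten(b));
        // ...but remain distinct when the full principal is kept.
        System.out.println(passThrough(a) + " != " + passThrough(b));
    }
}
```

Both realms' "alice" collapse to the same local user under the shortening rule, which is exactly the authorization hazard that motivates a pluggable, full-principal mapping.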
[jira] [Commented] (HADOOP-15922) DelegationTokenAuthenticationFilter get wrong doAsUser since it does not decode URL
[ https://issues.apache.org/jira/browse/HADOOP-15922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16734679#comment-16734679 ] Eric Yang commented on HADOOP-15922: [~hexiaoqiao] HADOOP-15996 is committed. This means the doAs parameter may contain a username with the @ character when a multi-realm username cannot be shortened. Can you adjust the test case to cover hadoop.security.auth_to_local.mechanism=mit and auth_to_local without a DEFAULT rule? > DelegationTokenAuthenticationFilter get wrong doAsUser since it does not > decode URL > --- > > Key: HADOOP-15922 > URL: https://issues.apache.org/jira/browse/HADOOP-15922 > Project: Hadoop Common > Issue Type: Bug > Components: common, kms >Reporter: He Xiaoqiao >Assignee: He Xiaoqiao >Priority: Major > Fix For: 3.3.0, 3.1.2, 3.2.1 > > Attachments: HADOOP-15922.001.patch, HADOOP-15922.002.patch, > HADOOP-15922.003.patch, HADOOP-15922.004.patch, HADOOP-15922.005.patch, > HADOOP-15922.006.patch > > > DelegationTokenAuthenticationFilter gets the wrong doAsUser when the proxy user from > the client is a complete Kerberos name (e.g., user/hostn...@realm.com, which is actually > acceptable), because DelegationTokenAuthenticationFilter does not decode the > DOAS parameter in the URL, which is encoded by {{URLEncoder}} at the client. > e.g. taking KMS as an example: > a. KMSClientProvider creates a connection to the KMS server using > DelegationTokenAuthenticatedURL#openConnection. > b. If KMSClientProvider is a doAsUser, KMSClientProvider will put {{doas}} > with the URL-encoded user as one parameter of the HTTP request. > {code:java} > // proxyuser > if (doAs != null) { > extraParams.put(DO_AS, URLEncoder.encode(doAs, "UTF-8")); > } > {code} > c. When the KMS server receives the request, it does not decode the proxy user. > As a result, the KMS server will get the wrong proxy user if this proxy user is a > complete Kerberos name or includes some special characters. Authentication and > authorization exceptions then follow. 
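The encode/decode mismatch described above can be shown with plain JDK calls. This is an illustrative sketch, not the KMS or filter code itself: the client URL-encodes the doas value, so a server that reads the raw parameter without decoding sees the escaped form of the principal.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Sketch of the bug's mechanics with plain JDK calls:
 *  the client encodes the doas value; the server must decode it
 *  or it will see the escaped form of the principal. */
public class DoAsEncodingSketch {

    /** What the client side does to the doas parameter. */
    public static String encode(String doAs) {
        return URLEncoder.encode(doAs, StandardCharsets.UTF_8);
    }

    /** The decoding step the server side must not skip. */
    public static String decode(String raw) {
        return URLDecoder.decode(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String principal = "user/host@REALM.COM";
        String wire = encode(principal);
        System.out.println(wire);          // user%2Fhost%40REALM.COM
        System.out.println(decode(wire));  // user/host@REALM.COM
    }
}
```

A full Kerberos principal contains '/' and '@', both of which URLEncoder escapes, so skipping the decode step yields a proxy-user string that matches no real user.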
[jira] [Commented] (HADOOP-15959) revert HADOOP-12751
[ https://issues.apache.org/jira/browse/HADOOP-15959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16734674#comment-16734674 ] Eric Yang commented on HADOOP-15959: [~ste...@apache.org] Based on the commit of HADOOP-15996, the test must choose between Hadoop mode and MIT mode. If the test is intended for Hadoop mode, it must apply the DEFAULT rule. If the test is intended for MIT mode, it must set hadoop.security.auth_to_local.mechanism to mit in the test conf object. > revert HADOOP-12751 > --- > > Key: HADOOP-15959 > URL: https://issues.apache.org/jira/browse/HADOOP-15959 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 3.2.0, 3.1.1, 2.9.2, 3.0.3, 2.7.7, 2.8.5 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Minor > Fix For: 3.2.0, 2.7.8, 3.0.4, 3.1.2, 2.8.6, 2.9.3 > > Attachments: HADOOP-15959-001.patch, HADOOP-15959-branch-2-002.patch, > HADOOP-15959-branch-2.7-003.patch > > > HADOOP-12751 doesn't quite work right. Revert. > (this patch is so jenkins can do the test runs)
[jira] [Commented] (HADOOP-15996) Plugin interface to support more complex usernames in Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-15996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16734666#comment-16734666 ] Hudson commented on HADOOP-15996: - SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15707 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/15707/]) HADOOP-15996. Improved Kerberos username mapping strategy in Hadoop. (eyang: rev d43af8b3db4743b4b240751b6f29de6c20cfd6e5) * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java * (edit) hadoop-common-project/hadoop-common/src/site/markdown/SecureMode.md * (edit) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestKDiag.java * (edit) hadoop-common-project/hadoop-auth/src/test/java/org/apache/hadoop/security/authentication/util/TestKerberosName.java * (edit) hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/util/KerberosName.java * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/KDiag.java * (edit) hadoop-common-project/hadoop-common/src/main/resources/core-default.xml * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/http/HttpServer2.java * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/HadoopKerberosName.java * (edit) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestUserGroupInformation.java * (edit) hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/server/KerberosAuthenticationHandler.java * (edit) hadoop-common-project/hadoop-auth/src/test/java/org/apache/hadoop/security/authentication/client/TestKerberosAuthenticator.java * (edit) hadoop-common-project/hadoop-auth/src/test/java/org/apache/hadoop/security/authentication/server/TestKerberosAuthenticationHandler.java > Plugin interface to support more complex usernames in Hadoop > > > Key: HADOOP-15996 > URL: 
https://issues.apache.org/jira/browse/HADOOP-15996
[jira] [Updated] (HADOOP-15996) Plugin interface to support more complex usernames in Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-15996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated HADOOP-15996: --- Resolution: Fixed Fix Version/s: 3.1.2 3.3.0 3.2.0 Status: Resolved (was: Patch Available) [~bolke] Thank you for the patch. There have been no objections in the past 3 days. I committed this change to branch-3.1, branch-3.2, and trunk. [~sunilg] [~wangda] Please make sure the 3.2.0 and 3.1.2 releases include this change. Thanks
[jira] [Updated] (HADOOP-15229) Add FileSystem builder-based openFile() API to match createFile() + S3 Select
[ https://issues.apache.org/jira/browse/HADOOP-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-15229: Summary: Add FileSystem builder-based openFile() API to match createFile() + S3 Select (was: Add FileSystem builder-based openFile() API to match createFile())
[jira] [Updated] (HADOOP-15229) Add FileSystem builder-based openFile() API to match createFile() + S3 Select
[ https://issues.apache.org/jira/browse/HADOOP-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-15229: Attachment: HADOOP-15229-016.patch
[jira] [Updated] (HADOOP-15229) Add FileSystem builder-based openFile() API to match createFile() + S3 Select
[ https://issues.apache.org/jira/browse/HADOOP-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-15229: Status: Patch Available (was: Open)
[jira] [Updated] (HADOOP-15229) Add FileSystem builder-based openFile() API to match createFile() + S3 Select
[ https://issues.apache.org/jira/browse/HADOOP-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-15229: Status: Open (was: Patch Available)
[jira] [Commented] (HADOOP-15229) Add FileSystem builder-based openFile() API to match createFile()
[ https://issues.apache.org/jira/browse/HADOOP-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16734576#comment-16734576 ] Yuzhou Sun commented on HADOOP-15229: - Thanks [~ste...@apache.org], "have the codec return MAX_INT" makes sense. By the way, there is a typo in WriteOperationHelper: "Owniung filesystem."
[jira] [Updated] (HADOOP-15898) 1 - 1.5 TB Data size fails to run with the following error
[ https://issues.apache.org/jira/browse/HADOOP-15898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Srinivas updated HADOOP-15898: -- Description: There is a business-impacting MR job which runs every day at 2.00 PM PST; its data size is about 1 - 1.5 TB (depending on the business day). The ideal elapsed time of this job is 4 hrs, but multiple mappers of this job fail simultaneously with the following error, so the job sometimes takes 11 or even 13 hours. Steps taken to prevent this problem: 1. Migrated the environment to YARN. 2. Increased the ulimit. 3. Added extra nodes to the cluster. 4. Disk replacements taking place regularly. 5. Monitoring the cluster and terminating other jobs which impact this job. A few of the values that we tried increasing without any benefit: 1. open files 2. dfs.datanode.handler.count 3. dfs.datanode.max.xcievers 4. dfs.datanode.max.transfer.threads. But no luck. org.apache.hadoop.hdfs.DFSClient: Error Recovery for block BP-854530680-69.194.253.58-1430267558563:blk_4683766046_1108754130089 in pipeline DatanodeInfoWithStorage[10.0.1.37:50010,DS-ed333d2e-839a-4029-a1c9-b6615c322ed2,DISK], DatanodeInfoWithStorage[74.120.143.19:50010,DS-5d10576e-adc3-474f-bc9d-f0d6fb3ae4c3,DISK], DatanodeInfoWithStorage[74.120.143.6:50010,DS-a5299d68-2858-46c3-8e37-d2559895f979,DISK]: bad datanode DatanodeInfoWithStorage[10.0.1.37:50010,DS-ed333d2e-839a-4029-a1c9-b6615c322ed2,DISK] org.apache.hadoop.hdfs.DFSClient: Error Recovery for block BP-854530680-69.194.253.58-1430267558563:blk_4683766046_1108754130089 in pipeline DatanodeInfoWithStorage[74.120.143.19:50010,DS-5d10576e-adc3-474f-bc9d-f0d6fb3ae4c3,DISK], DatanodeInfoWithStorage[74.120.143.6:50010,DS-a5299d68-2858-46c3-8e37-d2559895f979,DISK]: bad datanode DatanodeInfoWithStorage[74.120.143.19:50010,DS-5d10576e-adc3-474f-bc9d-f0d6fb3ae4c3,DISK] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.IOException: java.io.IOException: All datanodes DatanodeInfoWithStorage[74.120.143.6:50010,DS-a5299d68-2858-46c3-8e37-d2559895f979,DISK] are bad. Aborting... 
at > 1 - 1.5 TB Data size fails to run with the following error > --- > > Key: HADOOP-15898 > URL: https://issues.apache.org/jira/browse/HADOOP-15898 > Project: Hadoop Common > Issue Type: Improvement > Components: performance >Affects Versions: 2.6.0 > Environment: Hadoop 2.6.0-cdh5.5.1 Express edition. > > >Reporter: Srinivas >Priority: Major > Labels: performance > Fix For: 2.6.0 > > Original Estimate: 96h > Remaining Estimate: 96h > > There is a business-critical MR job that runs every day at 2:00 PM PST on about > 1 - 1.5 TB of data (varying with the business day). Its ideal elapsed time is 4 hours, > but multiple mappers fail simultaneously with the error below, stretching the job to > 11 or even 13 hours
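For reference, the datanode-side settings the reporter tried live in hdfs-site.xml. A hedged sketch of those keys follows; the values are illustrative examples only, not recommendations from this issue, and dfs.datanode.max.transfer.threads is the current name for the deprecated dfs.datanode.max.xcievers:

```xml
<!-- hdfs-site.xml: datanode limits mentioned in the report above.
     Values here are illustrative examples, not tuning advice. -->
<configuration>
  <property>
    <!-- RPC handler threads on each datanode -->
    <name>dfs.datanode.handler.count</name>
    <value>64</value>
  </property>
  <property>
    <!-- Max concurrent block transfer threads; supersedes the
         deprecated dfs.datanode.max.xcievers key -->
    <name>dfs.datanode.max.transfer.threads</name>
    <value>8192</value>
  </property>
</configuration>
```

Raising these limits only helps when pipeline failures are caused by thread exhaustion on the datanodes, which is consistent with, but not proven by, the "All datanodes ... are bad" errors above.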
[jira] [Commented] (HADOOP-16023) Support system /etc/krb5.conf for auth_to_local rules
[ https://issues.apache.org/jira/browse/HADOOP-16023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16734439#comment-16734439 ] Bolke de Bruin commented on HADOOP-16023: - This is the bug id for the JDK: [JDK-8216173|http://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-8216173] > Support system /etc/krb5.conf for auth_to_local rules > - > > Key: HADOOP-16023 > URL: https://issues.apache.org/jira/browse/HADOOP-16023 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Bolke de Bruin >Assignee: Bolke de Bruin >Priority: Major > Labels: security > > Hadoop has long maintained its own configuration for Kerberos' auth_to_local > rules. To the user this is counterintuitive, and it increases the complexity of > maintaining a secure system, since auth_to_local rules are normally configured in > the site-wide krb5.conf, usually /etc/krb5.conf. > With HADOOP-15996 there is now support for configuring how Hadoop should > evaluate auth_to_local rules. A "system" mechanism should be added. > It should be investigated how to parse krb5.conf properly. The JDK seems to be > lacking here, as it is unable to obtain auth_to_local rules due to a bug in its > parser. Apache Kerby has an implementation that could be used. A native (C) > version is also a possibility. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
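The site-wide rules in question live under [realms] in /etc/krb5.conf. A sketch of the syntax a "system" mechanism would need to parse; the realm name and the rules themselves are illustrative examples, not values from this issue:

```ini
# /etc/krb5.conf -- illustrative auth_to_local rules for one realm.
[realms]
    EXAMPLE.COM = {
        # Map nn/<host>@EXAMPLE.COM service principals to the local user "hdfs"
        auth_to_local = RULE:[2:$1/$2@$0](nn/.*@EXAMPLE\.COM)s/.*/hdfs/
        # Strip the realm from ordinary single-component user principals
        auth_to_local = RULE:[1:$1@$0](.*@EXAMPLE\.COM)s/@.*//
        auth_to_local = DEFAULT
    }
```

Each RULE builds a candidate string from the principal's components ([n:format]), filters it with a regexp, and applies a sed-style substitution; DEFAULT falls back to the standard realm-stripping behaviour.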
[jira] [Commented] (HADOOP-16028) Fix NetworkTopology chooseRandom function to support excluded nodes
[ https://issues.apache.org/jira/browse/HADOOP-16028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16734429#comment-16734429 ] Hudson commented on HADOOP-16028: - SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15705 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/15705/]) HADOOP-16028. Fix NetworkTopology chooseRandom function to support (inigoiri: rev f4e18242bd8117a5c506ec6d3f25c85011fa82d0) * (edit) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/TestClusterTopology.java * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetworkTopology.java > Fix NetworkTopology chooseRandom function to support excluded nodes > --- > > Key: HADOOP-16028 > URL: https://issues.apache.org/jira/browse/HADOOP-16028 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.9.2 >Reporter: Sihai Ke >Assignee: Sihai Ke >Priority: Major > Fix For: 3.3.0 > > Attachments: 0001-add-UT-for-NetworkTopology.patch, > 0001-fix-NetworkTopology.java-chooseRandom-bug.patch, HDFS-14181.01.patch, > HDFS-14181.02.patch, HDFS-14181.03.patch, HDFS-14181.04.patch, > HDFS-14181.05.patch, HDFS-14181.06.patch, HDFS-14181.07.patch, > HDFS-14181.08.patch, HDFS-14181.09.patch, image-2018-12-29-15-02-19-415.png > > > While reading Hadoop's NetworkTopology.java, I suspect there is a bug in the > function chooseRandom (line 498, Hadoop version 2.9.2-RC0). > {color:#f79232}Counting with "~" + excludedScope does not give the available > nodes under the scope node; I also added a unit test for this and it raises an > exception.{color} > The buggy code is in the else branch: 
> {code:java} > // code placeholder > if (excludedScope == null) { > availableNodes = countNumOfAvailableNodes(scope, excludedNodes); > } else { > availableNodes = > countNumOfAvailableNodes("~" + excludedScope, excludedNodes); > }{code} > Source code: > {code:java} > // code placeholder > protected Node chooseRandom(final String scope, String excludedScope, > final Collection excludedNodes) { > if (excludedScope != null) { > if (scope.startsWith(excludedScope)) { > return null; > } > if (!excludedScope.startsWith(scope)) { > excludedScope = null; > } > } > Node node = getNode(scope); > if (!(node instanceof InnerNode)) { > return excludedNodes != null && excludedNodes.contains(node) ? > null : node; > } > InnerNode innerNode = (InnerNode)node; > int numOfDatanodes = innerNode.getNumOfLeaves(); > if (excludedScope == null) { > node = null; > } else { > node = getNode(excludedScope); > if (!(node instanceof InnerNode)) { > numOfDatanodes -= 1; > } else { > numOfDatanodes -= ((InnerNode)node).getNumOfLeaves(); > } > } > if (numOfDatanodes <= 0) { > LOG.debug("Failed to find datanode (scope=\"{}\" excludedScope=\"{}\")." > + " numOfDatanodes={}", > scope, excludedScope, numOfDatanodes); > return null; > } > final int availableNodes; > if (excludedScope == null) { > availableNodes = countNumOfAvailableNodes(scope, excludedNodes); > } else { > availableNodes = > countNumOfAvailableNodes("~" + excludedScope, excludedNodes); > } > LOG.debug("Choosing random from {} available nodes on node {}," > + " scope={}, excludedScope={}, excludeNodes={}. numOfDatanodes={}.", > availableNodes, innerNode, scope, excludedScope, excludedNodes, > numOfDatanodes); > Node ret = null; > if (availableNodes > 0) { > ret = chooseRandom(innerNode, node, excludedNodes, numOfDatanodes, > availableNodes); > } > LOG.debug("chooseRandom returning {}", ret); > return ret; > } > {code} > > > Add Unit Test in TestClusterTopology.java, but get exception. 
> > {code:java} > // code placeholder > @Test > public void testChooseRandom1() { > // create the topology > NetworkTopology cluster = NetworkTopology.getInstance(new Configuration()); > NodeElement node1 = getNewNode("node1", "/a1/b1/c1"); > cluster.add(node1); > NodeElement node2 = getNewNode("node2", "/a1/b1/c1"); > cluster.add(node2); > NodeElement node3 = getNewNode("node3", "/a1/b1/c2"); > cluster.add(node3); > NodeElement node4 = getNewNode("node4", "/a1/b2/c3"); > cluster.add(node4); > Node node = cluster.chooseRandom("/a1/b1", "/a1/b1/c1", null); > assertSame(node.getName(), "node3"); > } > {code} > > Exception: > {code:java} > // code placeholder > java.lang.IllegalArgumentException: 1 should >= 2, and both should be > positive. > at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88) > at > org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:567) > at > org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:544) > at org.apache.hadoop.net.TestClusterTopology.testChooseRandom1(TestClusterTopology.java:198) > {code} > > {color:#f79232}!image-2018-12-29-15-02-19-415.png!{color} > > > [~vagarychen] this change was introduced in PR HDFS-11577, could you help to > check whether this is a bug?
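The mismatch behind that exception can be shown without Hadoop: counting every node outside "~" + excludedScope is not the same as counting the nodes under scope minus those under excludedScope. A minimal self-contained Java sketch follows; ScopeCount, its method names, and the leaf paths mirror the unit test's topology but are illustrative, not code from NetworkTopology:

```java
import java.util.List;

// Illustration of the counting mismatch described above; not NetworkTopology code.
public class ScopeCount {

    // What chooseRandom needs: leaves under 'scope', minus those under 'excludedScope'.
    public static long availableUnderScope(List<String> leaves, String scope,
                                           String excludedScope) {
        return leaves.stream()
                .filter(p -> p.startsWith(scope))
                .filter(p -> !p.startsWith(excludedScope))
                .count();
    }

    // What counting "~" + excludedScope effectively does: every leaf outside
    // excludedScope, including leaves that are not under 'scope' at all.
    public static long availableOutsideExcluded(List<String> leaves,
                                                String excludedScope) {
        return leaves.stream()
                .filter(p -> !p.startsWith(excludedScope))
                .count();
    }

    public static void main(String[] args) {
        // Same topology as the unit test: two leaves under /a1/b1/c1,
        // one under /a1/b1/c2, one under /a1/b2/c3.
        List<String> leaves = List.of(
                "/a1/b1/c1/node1", "/a1/b1/c1/node2",
                "/a1/b1/c2/node3", "/a1/b2/c3/node4");

        // chooseRandom("/a1/b1", "/a1/b1/c1", null): only node3 qualifies.
        System.out.println(availableUnderScope(leaves, "/a1/b1", "/a1/b1/c1")); // 1
        // The "~" form also counts node4, which lies entirely outside /a1/b1,
        // so availableNodes (2) exceeds numOfDatanodes (1): "1 should >= 2".
        System.out.println(availableOutsideExcluded(leaves, "/a1/b1/c1")); // 2
    }
}
```

The second count exceeding the first reproduces the precondition failure reported in the test above.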
[jira] [Commented] (HADOOP-16028) Fix NetworkTopology chooseRandom function to support excluded nodes
[ https://issues.apache.org/jira/browse/HADOOP-16028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16734413#comment-16734413 ] Íñigo Goiri commented on HADOOP-16028: -- Thanks [~sihai] for the fix. Committed it to trunk. > Fix NetworkTopology chooseRandom function to support excluded nodes > --- > > Key: HADOOP-16028 > URL: https://issues.apache.org/jira/browse/HADOOP-16028 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.9.2 >Reporter: Sihai Ke >Assignee: Sihai Ke >Priority: Major > Attachments: 0001-add-UT-for-NetworkTopology.patch, > 0001-fix-NetworkTopology.java-chooseRandom-bug.patch, HDFS-14181.01.patch, > HDFS-14181.02.patch, HDFS-14181.03.patch, HDFS-14181.04.patch, > HDFS-14181.05.patch, HDFS-14181.06.patch, HDFS-14181.07.patch, > HDFS-14181.08.patch, HDFS-14181.09.patch, image-2018-12-29-15-02-19-415.png > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HADOOP-16028) Fix NetworkTopology chooseRandom function to support excluded nodes
[ https://issues.apache.org/jira/browse/HADOOP-16028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated HADOOP-16028: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.3.0 Status: Resolved (was: Patch Available) > Fix NetworkTopology chooseRandom function to support excluded nodes > --- > > Key: HADOOP-16028 > URL: https://issues.apache.org/jira/browse/HADOOP-16028 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.9.2 >Reporter: Sihai Ke >Assignee: Sihai Ke >Priority: Major > Fix For: 3.3.0 > > Attachments: 0001-add-UT-for-NetworkTopology.patch, > 0001-fix-NetworkTopology.java-chooseRandom-bug.patch, HDFS-14181.01.patch, > HDFS-14181.02.patch, HDFS-14181.03.patch, HDFS-14181.04.patch, > HDFS-14181.05.patch, HDFS-14181.06.patch, HDFS-14181.07.patch, > HDFS-14181.08.patch, HDFS-14181.09.patch, image-2018-12-29-15-02-19-415.png > > --
[jira] [Assigned] (HADOOP-16028) Fix NetworkTopology chooseRandom function to support excluded nodes
[ https://issues.apache.org/jira/browse/HADOOP-16028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri reassigned HADOOP-16028: Assignee: Sihai Ke > Fix NetworkTopology chooseRandom function to support excluded nodes > --- > > Key: HADOOP-16028 > URL: https://issues.apache.org/jira/browse/HADOOP-16028 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.9.2 >Reporter: Sihai Ke >Assignee: Sihai Ke >Priority: Major > Attachments: 0001-add-UT-for-NetworkTopology.patch, > 0001-fix-NetworkTopology.java-chooseRandom-bug.patch, HDFS-14181.01.patch, > HDFS-14181.02.patch, HDFS-14181.03.patch, HDFS-14181.04.patch, > HDFS-14181.05.patch, HDFS-14181.06.patch, HDFS-14181.07.patch, > HDFS-14181.08.patch, HDFS-14181.09.patch, image-2018-12-29-15-02-19-415.png > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To
[jira] [Commented] (HADOOP-15992) JSON License is included in the transitive dependency of aliyun-sdk-oss 3.0.0
[ https://issues.apache.org/jira/browse/HADOOP-15992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16734403#comment-16734403 ] Hadoop QA commented on HADOOP-15992: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 17m 32s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 22s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 14m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 9s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 23s{color} | {color:green} hadoop-project in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 29s{color} | {color:green} hadoop-aliyun in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 38s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 93m 58s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | HADOOP-15992 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12953771/HADOOP-15992.02.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle | | uname | Linux
[jira] [Assigned] (HADOOP-16028) Fix NetworkTopology chooseRandom function to support excluded nodes
[ https://issues.apache.org/jira/browse/HADOOP-16028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri reassigned HADOOP-16028: Assignee: (was: Sihai Ke) Affects Version/s: (was: 2.9.2) 2.9.2 Component/s: (was: namenode) (was: hdfs) Key: HADOOP-16028 (was: HDFS-14181) Project: Hadoop Common (was: Hadoop HDFS) > Fix NetworkTopology chooseRandom function to support excluded nodes > --- > > Key: HADOOP-16028 > URL: https://issues.apache.org/jira/browse/HADOOP-16028 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.9.2 >Reporter: Sihai Ke >Priority: Major > Attachments: 0001-add-UT-for-NetworkTopology.patch, > 0001-fix-NetworkTopology.java-chooseRandom-bug.patch, HDFS-14181.01.patch, > HDFS-14181.02.patch, HDFS-14181.03.patch, HDFS-14181.04.patch, > HDFS-14181.05.patch, HDFS-14181.06.patch, HDFS-14181.07.patch, > HDFS-14181.08.patch, HDFS-14181.09.patch, image-2018-12-29-15-02-19-415.png > >
[jira] [Commented] (HADOOP-15229) Add FileSystem builder-based openFile() API to match createFile()
[ https://issues.apache.org/jira/browse/HADOOP-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16734396#comment-16734396 ] Steve Loughran commented on HADOOP-15229: - OK. here's a question: what is logged in the AWS S3 logs on a request, and does it include any of the SQL statement? I ask as in the security bit of the docs I've added the words "GDPR" alongside security, and how if the statements include PII then they'd better not be printed > Add FileSystem builder-based openFile() API to match createFile() > - > > Key: HADOOP-15229 > URL: https://issues.apache.org/jira/browse/HADOOP-15229 > Project: Hadoop Common > Issue Type: New Feature > Components: fs, fs/azure, fs/s3 >Affects Versions: 3.0.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Major > Attachments: HADOOP-15229-001.patch, HADOOP-15229-002.patch, > HADOOP-15229-003.patch, HADOOP-15229-004.patch, HADOOP-15229-004.patch, > HADOOP-15229-005.patch, HADOOP-15229-006.patch, HADOOP-15229-007.patch, > HADOOP-15229-009.patch, HADOOP-15229-010.patch, HADOOP-15229-011.patch, > HADOOP-15229-012.patch, HADOOP-15229-013.patch, HADOOP-15229-014.patch, > HADOOP-15229-015.patch > > > Replicate HDFS-1170 and HADOOP-14365 with an API to open files. > A key requirement of this is not HDFS, it's to put in the fadvise policy for > working with object stores, where getting the decision to do a full GET and > TCP abort on seek vs smaller GETs is fundamentally different: the wrong > option can cost you minutes. S3A and Azure both have adaptive policies now > (first backward seek), but they still don't do it that well. > Columnar formats (ORC, Parquet) should be able to say "fs.input.fadvise" > "random" as an option when they open files; I can imagine other options too. > The Builder model of [~eddyxu] is the one to mimic, method for method. 
> Ideally with as much code reuse as possible -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
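The builder pattern proposed in that description can be mimicked with a tiny self-contained toy; `OpenFileBuilder`, `opt()`, and the option key below are illustrative assumptions modelled on the comment, not the committed Hadoop API:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the createFile()-style builder proposed for openFile().
// OpenFileBuilder, opt() and the option key are illustrative assumptions
// based on the description above, not the committed Hadoop API.
public class OpenFileBuilderSketch {

    static class OpenFileBuilder {
        private final String path;
        private final Map<String, String> options = new HashMap<>();

        OpenFileBuilder(String path) {
            this.path = path;
        }

        // opt(): an optional hint the filesystem is free to ignore
        OpenFileBuilder opt(String key, String value) {
            options.put(key, value);
            return this;
        }

        // A real store would pick its GET strategy from the hint here;
        // this sketch just echoes the resolved policy.
        String build() {
            return path + " fadvise=" + options.getOrDefault("fs.input.fadvise", "normal");
        }
    }

    static String open(String path, String fadvise) {
        return new OpenFileBuilder(path).opt("fs.input.fadvise", fadvise).build();
    }

    public static void main(String[] args) {
        // A columnar reader (ORC/Parquet) would ask for random access:
        System.out.println(open("s3a://bucket/table/part-0.parquet", "random"));
    }
}
```

The design point mirrors createFile(): optional hints let an object store choose between one long GET and many small ranged GETs, while filesystems that don't care (HDFS) can ignore them unchanged.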
[jira] [Commented] (HADOOP-16027) [DOC] Effective use of FS instances during S3A integration tests
[ https://issues.apache.org/jira/browse/HADOOP-16027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16734321#comment-16734321 ] Adam Antal commented on HADOOP-16027: - Thanks for the patch [~gabor.bota], it looks good, as [~ste...@apache.org] said. I'll add one minor thing: I'd rather add the word manually here {code:java} Do NOT add manually `FileSystem` instances (...) to the cache {code} (instead of {code:java} Do NOT add `FileSystem` instances (...) to the cache {code} ) because many FS instances are actually added to the cache - but as soon as they're closed, they get removed from it. > [DOC] Effective use of FS instances during S3A integration tests > > > Key: HADOOP-16027 > URL: https://issues.apache.org/jira/browse/HADOOP-16027 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Major > Attachments: HADOOP-16027.001.patch > > > While fixing HADOOP-15819 we found that a closed fs got into the static fs > cache during testing, which caused other tests to fail when the tests were > running sequentially. > We should document some best practices in the testing section on the s3 docs > with the following: > {panel} > Tests using FileSystems are fastest if they can recycle the existing FS > instance from the same JVM. If you do that, you MUST NOT close or do unique > configuration on them. If you want a guarantee of 100% isolation or an > instance with unique config, create a new instance > which you MUST close in the teardown to avoid leakage of resources. > Do not add FileSystem instances (with e.g > org.apache.hadoop.fs.FileSystem#addFileSystemForTesting) to the cache that > will be modified or closed during the test runs. This can cause other tests > to fail when using the same modified or closed FS instance. For more details > see HADOOP-15819. 
> {panel} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
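The failure mode the proposed doc text warns about can be shown with a minimal stand-in for the shared FileSystem cache; `FakeFs` and the `CACHE` map below are illustrative, not Hadoop's `FileSystem.CACHE`:

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for the shared FileSystem cache, showing why closing a cached
// instance in one test breaks the next. FakeFs and CACHE are illustrative,
// not Hadoop's FileSystem.CACHE.
public class FsCacheSketch {

    static class FakeFs {
        private boolean closed;

        void close() {
            closed = true;
        }

        String read() {
            if (closed) {
                throw new IllegalStateException("FileSystem is closed!");
            }
            return "data";
        }
    }

    static final Map<String, FakeFs> CACHE = new HashMap<>();

    // like FileSystem.get(): recycles the cached instance for a URI
    static FakeFs get(String uri) {
        return CACHE.computeIfAbsent(uri, u -> new FakeFs());
    }

    public static void main(String[] args) {
        FakeFs fs = get("s3a://bucket");  // test A recycles the shared instance
        fs.close();                       // test A wrongly closes it in teardown
        try {
            get("s3a://bucket").read();   // test B receives the same, dead, instance
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // prints "FileSystem is closed!"
        }
    }
}
```

Isolated tests avoid this by creating an uncached instance with unique config and closing that instance themselves in teardown, which is exactly the rule the patch documents.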
[jira] [Commented] (HADOOP-14445) Use DelegationTokenIssuer to create KMS delegation tokens that can authenticate to all KMS instances
[ https://issues.apache.org/jira/browse/HADOOP-14445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16734316#comment-16734316 ] Hudson commented on HADOOP-14445: - SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15703 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/15703/]) HADOOP-15997. KMS client uses wrong UGI after HADOOP-14445. Contributed (sunilg: rev 51427cbdfb39cb6f5774b7b70009d7ee4388edfc) * (edit) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/crypto/key/kms/TestLoadBalancingKMSClientProvider.java * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/kms/KMSClientProvider.java > Use DelegationTokenIssuer to create KMS delegation tokens that can > authenticate to all KMS instances > > > Key: HADOOP-14445 > URL: https://issues.apache.org/jira/browse/HADOOP-14445 > Project: Hadoop Common > Issue Type: Bug > Components: kms >Affects Versions: 2.8.0, 3.0.0-alpha1 > Environment: CDH5.7.4, Kerberized, SSL, KMS-HA, at rest encryption >Reporter: Wei-Chiu Chuang >Assignee: Xiao Chen >Priority: Major > Fix For: 3.2.0, 3.0.4, 3.1.2 > > Attachments: HADOOP-14445-branch-2.8.002.patch, > HADOOP-14445-branch-2.8.patch, HADOOP-14445.002.patch, > HADOOP-14445.003.patch, HADOOP-14445.004.patch, HADOOP-14445.05.patch, > HADOOP-14445.06.patch, HADOOP-14445.07.patch, HADOOP-14445.08.patch, > HADOOP-14445.09.patch, HADOOP-14445.10.patch, HADOOP-14445.11.patch, > HADOOP-14445.12.patch, HADOOP-14445.13.patch, HADOOP-14445.14.patch, > HADOOP-14445.15.patch, HADOOP-14445.16.patch, HADOOP-14445.17.patch, > HADOOP-14445.18.patch, HADOOP-14445.19.patch, HADOOP-14445.20.patch, > HADOOP-14445.addemdum.patch, HADOOP-14445.branch-2.000.precommit.patch, > HADOOP-14445.branch-2.001.precommit.patch, HADOOP-14445.branch-2.01.patch, > HADOOP-14445.branch-2.02.patch, HADOOP-14445.branch-2.03.patch, > HADOOP-14445.branch-2.04.patch, HADOOP-14445.branch-2.05.patch, > HADOOP-14445.branch-2.06.patch, 
HADOOP-14445.branch-2.8.003.patch, > HADOOP-14445.branch-2.8.004.patch, HADOOP-14445.branch-2.8.005.patch, > HADOOP-14445.branch-2.8.006.patch, HADOOP-14445.branch-2.8.revert.patch, > HADOOP-14445.branch-3.0.001.patch, HADOOP-14445.compat.patch, > HADOOP-14445.revert.patch > > > As discovered in HADOOP-14441, KMS HA using LoadBalancingKMSClientProvider do > not share delegation tokens. (a client uses KMS address/port as the key for > delegation token) > {code:title=DelegationTokenAuthenticatedURL#openConnection} > if (!creds.getAllTokens().isEmpty()) { > InetSocketAddress serviceAddr = new InetSocketAddress(url.getHost(), > url.getPort()); > Text service = SecurityUtil.buildTokenService(serviceAddr); > dToken = creds.getToken(service); > {code} > But KMS doc states: > {quote} > Delegation Tokens > Similar to HTTP authentication, KMS uses Hadoop Authentication for delegation > tokens too. > Under HA, A KMS instance must verify the delegation token given by another > KMS instance, by checking the shared secret used to sign the delegation > token. To do this, all KMS instances must be able to retrieve the shared > secret from ZooKeeper. > {quote} > We should either update the KMS documentation, or fix this code to share > delegation tokens. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
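The token-sharing problem in the quoted `openConnection` snippet comes from keying tokens by host and port. A hedged sketch shows why two HA KMS instances never share a token under that scheme; `buildService()` is an illustrative stand-in for `SecurityUtil.buildTokenService`, not Hadoop's implementation:

```java
import java.net.URI;

// Why host:port-keyed delegation tokens are not shared across HA KMS
// instances. buildService() is an illustrative stand-in for
// SecurityUtil.buildTokenService(), not Hadoop's implementation.
public class TokenServiceKeySketch {

    static String buildService(String url) throws Exception {
        URI u = new URI(url);
        return u.getHost() + ":" + u.getPort();
    }

    public static void main(String[] args) throws Exception {
        // A token obtained from kms1 is stored under kms1's service key...
        String issued = buildService("https://kms1.example.com:9600/kms");
        // ...but a request routed to kms2 looks up a different key and misses.
        String lookedUp = buildService("https://kms2.example.com:9600/kms");
        System.out.println(issued + " != " + lookedUp);
    }
}
```

This is the motivation for the DelegationTokenIssuer approach in the issue title: issue one token that every KMS instance can validate, instead of one per address.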
[jira] [Commented] (HADOOP-15997) KMS client uses wrong UGI after HADOOP-14445
[ https://issues.apache.org/jira/browse/HADOOP-15997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16734315#comment-16734315 ] Hudson commented on HADOOP-15997: - SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15703 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/15703/]) HADOOP-15997. KMS client uses wrong UGI after HADOOP-14445. Contributed (sunilg: rev 51427cbdfb39cb6f5774b7b70009d7ee4388edfc) * (edit) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/crypto/key/kms/TestLoadBalancingKMSClientProvider.java * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/kms/KMSClientProvider.java > KMS client uses wrong UGI after HADOOP-14445 > > > Key: HADOOP-15997 > URL: https://issues.apache.org/jira/browse/HADOOP-15997 > Project: Hadoop Common > Issue Type: Bug > Components: kms >Affects Versions: 3.2.0, 3.0.4, 3.1.2 > Environment: Hadoop 3.0.x (CDH6.x), Kerberized, HDFS at-rest > encryption, multiple KMS >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang >Priority: Blocker > Fix For: 3.2.0, 3.3.0, 3.1.2 > > Attachments: HADOOP-15997.001.patch, HADOOP-15997.02.patch > > > After HADOOP-14445, KMS client always authenticates itself using the > credentials from login user, rather than current user. > {noformat} > 2018-12-07 15:58:30,663 DEBUG [main] > org.apache.hadoop.crypto.key.kms.KMSClientProvider: Using loginUser when > Kerberos is enabled but the actual user does not have either KMS Delegation > Token or Kerberos Credentials > {noformat} > The log message {{"Using loginUser when Kerberos is enabled but the actual > user does not have either KMS Delegation Token or Kerberos Credentials"}} is > printed because {{KMSClientProvider#containsKmsDt()}} is null when it > definitely has the kms delegation token. 
> In fact, {{KMSClientProvider#containsKmsDt()}} should select delegation token > using {{clientTokenProvider.selectDelegationToken(creds)}} rather than > checking if its dtService is in the user credentials. > This is done correctly in {{KMSClientProvider#createAuthenticatedURL}} though. > We found this bug when it broke Cloudera's Backup and Disaster Recovery tool. > > [~daryn] [~xiaochen] mind taking a look? HADOOP-14445 is a huge patch but it > is almost perfect except for this bug. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
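The two lookups contrasted in that report can be sketched self-containedly: an exact-service check misses a token that a kind-based selector would find. The service strings and method names below are illustrative, not KMSClientProvider's real types:

```java
import java.util.HashMap;
import java.util.Map;

// Exact-service lookup vs selector lookup, as contrasted in the report above.
// The service strings and methods are illustrative, not Hadoop's token API.
public class TokenLookupSketch {

    // credentials holding one KMS delegation token, keyed by its service
    static final Map<String, String> CREDS = new HashMap<>();
    static {
        CREDS.put("kms://host1:9600", "token-bytes"); // issued by KMS #1
    }

    // containsKmsDt-style check: only an exact service match counts
    static boolean byExactService(String service) {
        return CREDS.containsKey(service);
    }

    // selector-style check: any token of the right kind qualifies
    static boolean bySelector(String kind) {
        for (String service : CREDS.keySet()) {
            if (service.startsWith(kind + "://")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Talking to KMS #2: the exact check misses a perfectly valid token,
        System.out.println(byExactService("kms://host2:9600")); // prints "false"
        // while a selector sees that a KMS token of the right kind is present.
        System.out.println(bySelector("kms"));                  // prints "true"
    }
}
```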
[jira] [Updated] (HADOOP-16025) Update the year to 2019
[ https://issues.apache.org/jira/browse/HADOOP-16025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil Govindan updated HADOOP-16025: Fix Version/s: (was: 3.2.1) 3.2.0 > Update the year to 2019 > --- > > Key: HADOOP-16025 > URL: https://issues.apache.org/jira/browse/HADOOP-16025 > Project: Hadoop Common > Issue Type: Task > Components: build >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Fix For: 3.2.0, 2.7.8, 3.0.4, 2.8.6, 2.9.3, 3.1.3 > > Attachments: HADOOP-16025-01.patch > > > Update the year to 2019 from 2018. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15997) KMS client uses wrong UGI after HADOOP-14445
[ https://issues.apache.org/jira/browse/HADOOP-15997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16734301#comment-16734301 ] Sunil Govindan commented on HADOOP-15997: - Thanks. Committing this shortly > KMS client uses wrong UGI after HADOOP-14445 > > > Key: HADOOP-15997 > URL: https://issues.apache.org/jira/browse/HADOOP-15997 > Project: Hadoop Common > Issue Type: Bug > Components: kms >Affects Versions: 3.2.0, 3.0.4, 3.1.2 > Environment: Hadoop 3.0.x (CDH6.x), Kerberized, HDFS at-rest > encryption, multiple KMS >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang >Priority: Blocker > Attachments: HADOOP-15997.001.patch, HADOOP-15997.02.patch > > > After HADOOP-14445, KMS client always authenticates itself using the > credentials from login user, rather than current user. > {noformat} > 2018-12-07 15:58:30,663 DEBUG [main] > org.apache.hadoop.crypto.key.kms.KMSClientProvider: Using loginUser when > Kerberos is enabled but the actual user does not have either KMS Delegation > Token or Kerberos Credentials > {noformat} > The log message {{"Using loginUser when Kerberos is enabled but the actual > user does not have either KMS Delegation Token or Kerberos Credentials"}} is > printed because {{KMSClientProvider#containsKmsDt()}} is null when it > definitely has the kms delegation token. > In fact, {{KMSClientProvider#containsKmsDt()}} should select delegation token > using {{clientTokenProvider.selectDelegationToken(creds)}} rather than > checking if its dtService is in the user credentials. > This is done correctly in {{KMSClientProvider#createAuthenticatedURL}} though. > We found this bug when it broke Cloudera's Backup and Disaster Recovery tool. > > [~daryn] [~xiaochen] mind taking a look? HADOOP-14445 is a huge patch but it > is almost perfect except for this bug. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15992) JSON License is included in the transitive dependency of aliyun-sdk-oss 3.0.0
[ https://issues.apache.org/jira/browse/HADOOP-15992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16734300#comment-16734300 ] Sunil Govindan commented on HADOOP-15992: - Hi [~ajisakaa] Attaching a new revert from top of trunk after fixing the conflict. Could you please help to review and commit. Thanks > JSON License is included in the transitive dependency of aliyun-sdk-oss 3.0.0 > - > > Key: HADOOP-15992 > URL: https://issues.apache.org/jira/browse/HADOOP-15992 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.9.2 >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka >Priority: Blocker > Attachments: HADOOP-15992.01.patch, HADOOP-15992.02.patch > > > This is the output of {{mvn dependency:tree}} > {noformat} > [INFO] +- com.aliyun.oss:aliyun-sdk-oss:jar:3.0.0:compile > [INFO] | +- org.jdom:jdom:jar:1.1:compile > [INFO] | +- com.sun.jersey:jersey-json:jar:1.19:compile > [INFO] | | +- org.codehaus.jettison:jettison:jar:1.1:compile > [INFO] | | +- com.sun.xml.bind:jaxb-impl:jar:2.2.3-1:compile > [INFO] | | +- org.codehaus.jackson:jackson-core-asl:jar:1.9.13:compile > [INFO] | | +- org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13:compile > [INFO] | | +- org.codehaus.jackson:jackson-jaxrs:jar:1.9.13:compile > [INFO] | | \- org.codehaus.jackson:jackson-xc:jar:1.9.13:compile > [INFO] | +- com.aliyun:aliyun-java-sdk-core:jar:3.4.0:compile > [INFO] | | \- org.json:json:jar:20170516:compile > {noformat} > The license of org.json:json:jar:20170516:compile is JSON License, which > cannot be included. > https://www.apache.org/legal/resolved.html#json -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15997) KMS client uses wrong UGI after HADOOP-14445
[ https://issues.apache.org/jira/browse/HADOOP-15997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil Govindan updated HADOOP-15997: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.1.2 3.3.0 3.2.0 Status: Resolved (was: Patch Available) Thanks [~jojochuang] Committed to trunk/branch-3.2/branch-3.1 Cherrypick is failing for branch-3.0. Could you please help. Thanks > KMS client uses wrong UGI after HADOOP-14445 > > > Key: HADOOP-15997 > URL: https://issues.apache.org/jira/browse/HADOOP-15997 > Project: Hadoop Common > Issue Type: Bug > Components: kms >Affects Versions: 3.2.0, 3.0.4, 3.1.2 > Environment: Hadoop 3.0.x (CDH6.x), Kerberized, HDFS at-rest > encryption, multiple KMS >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang >Priority: Blocker > Fix For: 3.2.0, 3.3.0, 3.1.2 > > Attachments: HADOOP-15997.001.patch, HADOOP-15997.02.patch > > > After HADOOP-14445, KMS client always authenticates itself using the > credentials from login user, rather than current user. > {noformat} > 2018-12-07 15:58:30,663 DEBUG [main] > org.apache.hadoop.crypto.key.kms.KMSClientProvider: Using loginUser when > Kerberos is enabled but the actual user does not have either KMS Delegation > Token or Kerberos Credentials > {noformat} > The log message {{"Using loginUser when Kerberos is enabled but the actual > user does not have either KMS Delegation Token or Kerberos Credentials"}} is > printed because {{KMSClientProvider#containsKmsDt()}} is null when it > definitely has the kms delegation token. > In fact, {{KMSClientProvider#containsKmsDt()}} should select delegation token > using {{clientTokenProvider.selectDelegationToken(creds)}} rather than > checking if its dtService is in the user credentials. > This is done correctly in {{KMSClientProvider#createAuthenticatedURL}} though. > We found this bug when it broke Cloudera's Backup and Disaster Recovery tool. > > [~daryn] [~xiaochen] mind taking a look? 
HADOOP-14445 is a huge patch but it > is almost perfect except for this bug. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15992) JSON License is included in the transitive dependency of aliyun-sdk-oss 3.0.0
[ https://issues.apache.org/jira/browse/HADOOP-15992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil Govindan updated HADOOP-15992: Attachment: HADOOP-15992.02.patch > JSON License is included in the transitive dependency of aliyun-sdk-oss 3.0.0 > - > > Key: HADOOP-15992 > URL: https://issues.apache.org/jira/browse/HADOOP-15992 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.9.2 >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka >Priority: Blocker > Attachments: HADOOP-15992.01.patch, HADOOP-15992.02.patch > > > This is the output of {{mvn dependency:tree}} > {noformat} > [INFO] +- com.aliyun.oss:aliyun-sdk-oss:jar:3.0.0:compile > [INFO] | +- org.jdom:jdom:jar:1.1:compile > [INFO] | +- com.sun.jersey:jersey-json:jar:1.19:compile > [INFO] | | +- org.codehaus.jettison:jettison:jar:1.1:compile > [INFO] | | +- com.sun.xml.bind:jaxb-impl:jar:2.2.3-1:compile > [INFO] | | +- org.codehaus.jackson:jackson-core-asl:jar:1.9.13:compile > [INFO] | | +- org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13:compile > [INFO] | | +- org.codehaus.jackson:jackson-jaxrs:jar:1.9.13:compile > [INFO] | | \- org.codehaus.jackson:jackson-xc:jar:1.9.13:compile > [INFO] | +- com.aliyun:aliyun-java-sdk-core:jar:3.4.0:compile > [INFO] | | \- org.json:json:jar:20170516:compile > {noformat} > The license of org.json:json:jar:20170516:compile is JSON License, which > cannot be included. > https://www.apache.org/legal/resolved.html#json -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-16027) [DOC] Effective use of FS instances during S3A integration tests
[ https://issues.apache.org/jira/browse/HADOOP-16027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16734295#comment-16734295 ] Hadoop QA commented on HADOOP-16027: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 33m 28s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 41s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 49m 24s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | HADOOP-16027 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12953767/HADOOP-16027.001.patch | | Optional Tests | dupname asflicense mvnsite | | uname | Linux 5603d804cd2b 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 8c6978c | | maven | version: Apache Maven 3.3.9 | | Max. process+thread count | 306 (vs. ulimit of 1) | | modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws | | Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/15723/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > [DOC] Effective use of FS instances during S3A integration tests > > > Key: HADOOP-16027 > URL: https://issues.apache.org/jira/browse/HADOOP-16027 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Major > Attachments: HADOOP-16027.001.patch > > > While fixing HADOOP-15819 we found that a closed fs got into the static fs > cache during testing, which caused other tests to fail when the tests were > running sequentially. > We should document some best practices in the testing section on the s3 docs > with the following: > {panel} > Tests using FileSystems are fastest if they can recycle the existing FS > instance from the same JVM. If you do that, you MUST NOT close or do unique > configuration on them. If you want a guarantee of 100% isolation or an > instance with unique config, create a new instance > which you MUST close in the teardown to avoid leakage of resources. 
> Do not add FileSystem instances (with e.g > org.apache.hadoop.fs.FileSystem#addFileSystemForTesting) to the cache that > will be modified or closed during the test runs. This can cause other tests > to fail when using the same modified or closed FS instance. For more details > see HADOOP-15819. > {panel} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-16027) [DOC] Effective use of FS instances during S3A integration tests
[ https://issues.apache.org/jira/browse/HADOOP-16027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Bota updated HADOOP-16027: Attachment: HADOOP-16027.001.patch > [DOC] Effective use of FS instances during S3A integration tests > > > Key: HADOOP-16027 > URL: https://issues.apache.org/jira/browse/HADOOP-16027 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Major > Attachments: HADOOP-16027.001.patch > > > While fixing HADOOP-15819 we found that a closed fs got into the static fs > cache during testing, which caused other tests to fail when the tests were > running sequentially. > We should document some best practices in the testing section on the s3 docs > with the following: > {panel} > Tests using FileSystems are fastest if they can recycle the existing FS > instance from the same JVM. If you do that, you MUST NOT close or do unique > configuration on them. If you want a guarantee of 100% isolation or an > instance with unique config, create a new instance > which you MUST close in the teardown to avoid leakage of resources. > Do not add FileSystem instances (with e.g > org.apache.hadoop.fs.FileSystem#addFileSystemForTesting) to the cache that > will be modified or closed during the test runs. This can cause other tests > to fail when using the same modified or closed FS instance. For more details > see HADOOP-15819. > {panel} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-16027) [DOC] Effective use of FS instances during S3A integration tests
[ https://issues.apache.org/jira/browse/HADOOP-16027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Bota updated HADOOP-16027: Status: Patch Available (was: In Progress) Submitted patch v1. Just added a few lines to the doc, no need for an additional test (justification). > [DOC] Effective use of FS instances during S3A integration tests > > > Key: HADOOP-16027 > URL: https://issues.apache.org/jira/browse/HADOOP-16027 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Major > Attachments: HADOOP-16027.001.patch > > > While fixing HADOOP-15819 we found that a closed fs got into the static fs > cache during testing, which caused other tests to fail when the tests were > running sequentially. > We should document some best practices in the testing section on the s3 docs > with the following: > {panel} > Tests using FileSystems are fastest if they can recycle the existing FS > instance from the same JVM. If you do that, you MUST NOT close or do unique > configuration on them. If you want a guarantee of 100% isolation or an > instance with unique config, create a new instance > which you MUST close in the teardown to avoid leakage of resources. > Do not add FileSystem instances (with e.g > org.apache.hadoop.fs.FileSystem#addFileSystemForTesting) to the cache that > will be modified or closed during the test runs. This can cause other tests > to fail when using the same modified or closed FS instance. For more details > see HADOOP-15819. > {panel} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Work started] (HADOOP-16027) [DOC] Effective use of FS instances during S3A integration tests
[ https://issues.apache.org/jira/browse/HADOOP-16027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HADOOP-16027 started by Gabor Bota. --- > [DOC] Effective use of FS instances during S3A integration tests > > > Key: HADOOP-16027 > URL: https://issues.apache.org/jira/browse/HADOOP-16027 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Major > > While fixing HADOOP-15819 we found that a closed fs got into the static fs > cache during testing, which caused other tests to fail when the tests were > running sequentially. > We should document some best practices in the testing section on the s3 docs > with the following: > {panel} > Tests using FileSystems are fastest if they can recycle the existing FS > instance from the same JVM. If you do that, you MUST NOT close or do unique > configuration on them. If you want a guarantee of 100% isolation or an > instance with unique config, create a new instance > which you MUST close in the teardown to avoid leakage of resources. > Do not add FileSystem instances (with e.g > org.apache.hadoop.fs.FileSystem#addFileSystemForTesting) to the cache that > will be modified or closed during the test runs. This can cause other tests > to fail when using the same modified or closed FS instance. For more details > see HADOOP-15819. > {panel} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Work started] (HADOOP-15999) [s3a] Better support for out-of-band operations
[ https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HADOOP-15999 started by Gabor Bota. --- > [s3a] Better support for out-of-band operations > --- > > Key: HADOOP-15999 > URL: https://issues.apache.org/jira/browse/HADOOP-15999 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.1.0 >Reporter: Sean Mackrory >Assignee: Gabor Bota >Priority: Major > Attachments: out-of-band-operations.patch > > > S3Guard was initially done on the premise that a new MetadataStore would be > the source of truth, and that it wouldn't provide guarantees if updates were > done without using S3Guard. > I've been seeing increased demand for better support for scenarios where > operations are done on the data that can't reasonably be done with S3Guard > involved. For example: > * A file is deleted using S3Guard, and replaced by some other tool. S3Guard > can't tell the difference between the new file and delete / list > inconsistency and continues to treat the file as deleted. > * An S3Guard-ed file is overwritten by a longer file by some other tool. When > reading the file, only the length of the original file is read. > We could possibly have smarter behavior here by querying both S3 and the > MetadataStore (even in cases where we may currently only query the > MetadataStore in getFileStatus) and use whichever one has the higher modified > time. > This kills the performance boost we currently get in some workloads with the > short-circuited getFileStatus, but we could keep it with authoritative mode > which should give a larger performance boost. At least we'd get more > correctness without authoritative mode and a clear declaration of when we can > make the assumptions required to short-circuit the process. If we can't > consider S3Guard the source of truth, we need to defer to S3 more. 
> We'd need to be extra sure of any locality / time zone issues if we start > relying on mod_time more directly, but currently we're tracking the > modification time as returned by S3 anyway. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
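The "query both sources and trust whichever entry is newer" idea floated in that issue can be sketched as a small reconciliation function; `Entry` and `pick()` are illustrative, not S3Guard's MetadataStore API:

```java
// Sketch of the "trust whichever source is newer" reconciliation floated
// above. Entry and pick() are illustrative, not S3Guard's MetadataStore API.
public class OutOfBandSketch {

    static class Entry {
        final long modTime;
        final long length;

        Entry(long modTime, long length) {
            this.modTime = modTime;
            this.length = length;
        }
    }

    // null means "not found" in that source; otherwise prefer the newer entry
    static Entry pick(Entry s3, Entry store) {
        if (s3 == null) {
            return store;
        }
        if (store == null) {
            return s3;
        }
        return s3.modTime > store.modTime ? s3 : store;
    }

    public static void main(String[] args) {
        Entry s3 = new Entry(200L, 4096L);     // file overwritten out of band
        Entry store = new Entry(100L, 1024L);  // stale S3Guard record
        System.out.println(pick(s3, store).length); // prints "4096"
    }
}
```

As the comment notes, doing this on every getFileStatus costs the short-circuit speedup, so the trade-off only disappears in authoritative mode, where the MetadataStore may again be trusted alone.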
[jira] [Created] (HADOOP-16027) [DOC] Effective use of FS instances during S3A integration tests
Gabor Bota created HADOOP-16027: --- Summary: [DOC] Effective use of FS instances during S3A integration tests Key: HADOOP-16027 URL: https://issues.apache.org/jira/browse/HADOOP-16027 Project: Hadoop Common Issue Type: Sub-task Components: fs/s3 Reporter: Gabor Bota Assignee: Gabor Bota While fixing HADOOP-15819 we found that a closed fs got into the static fs cache during testing, which caused other tests to fail when the tests were running sequentially. We should document some best practices in the testing section on the s3 docs with the following: {panel} Tests using FileSystems are fastest if they can recycle the existing FS instance from the same JVM. If you do that, you MUST NOT close or do unique configuration on them. If you want a guarantee of 100% isolation or an instance with unique config, create a new instance which you MUST close in the teardown to avoid leakage of resources. Do not add FileSystem instances (with e.g org.apache.hadoop.fs.FileSystem#addFileSystemForTesting) to the cache that will be modified or closed during the test runs. This can cause other tests to fail when using the same modified or closed FS instance. For more details see HADOOP-15819. {panel} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15819) S3A integration test failures: FileSystem is closed! - without parallel test run
[ https://issues.apache.org/jira/browse/HADOOP-15819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16734141#comment-16734141 ] Gabor Bota commented on HADOOP-15819: - Created HADOOP-16027 to add this to docs. > S3A integration test failures: FileSystem is closed! - without parallel test > run > > > Key: HADOOP-15819 > URL: https://issues.apache.org/jira/browse/HADOOP-15819 > Project: Hadoop Common > Issue Type: Bug > Components: fs/s3 >Affects Versions: 3.1.1 >Reporter: Gabor Bota >Assignee: Adam Antal >Priority: Critical > Attachments: HADOOP-15819.000.patch, HADOOP-15819.001.patch, > HADOOP-15819.002.patch, S3ACloseEnforcedFileSystem.java, > S3ACloseEnforcedFileSystem.java, closed_fs_closers_example_5klines.log.zip > > > Running the integration tests for hadoop-aws {{mvn -Dscale verify}} against > Amazon AWS S3 (eu-west-1, us-west-1, with no s3guard) we see a lot of these > failures: > {noformat} > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.408 > s <<< FAILURE! - in > org.apache.hadoop.fs.s3a.commit.staging.integration.ITDirectoryCommitMRJob > [ERROR] > testMRJob(org.apache.hadoop.fs.s3a.commit.staging.integration.ITDirectoryCommitMRJob) > Time elapsed: 0.027 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 4.345 > s <<< FAILURE! - in > org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJob > [ERROR] > testStagingDirectory(org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJob) > Time elapsed: 0.021 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] > testMRJob(org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJob) > Time elapsed: 0.022 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! 
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.489 > s <<< FAILURE! - in > org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJobBadDest > [ERROR] > testMRJob(org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJobBadDest) > Time elapsed: 0.023 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.695 > s <<< FAILURE! - in org.apache.hadoop.fs.s3a.commit.magic.ITMagicCommitMRJob > [ERROR] testMRJob(org.apache.hadoop.fs.s3a.commit.magic.ITMagicCommitMRJob) > Time elapsed: 0.039 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.015 > s <<< FAILURE! - in org.apache.hadoop.fs.s3a.commit.ITestS3ACommitterFactory > [ERROR] > testEverything(org.apache.hadoop.fs.s3a.commit.ITestS3ACommitterFactory) > Time elapsed: 0.014 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > {noformat} > The big issue is that the tests are running in a serial manner - no test is > running on top of another - so we should not see the tests failing > like this. The issue could be in how we handle > org.apache.hadoop.fs.FileSystem#CACHE - the tests use the same > S3AFileSystem, so if test A uses a FileSystem and closes it in teardown, then test B > will get the same FileSystem object from the cache and try to use it, > but it is closed. > We see this a lot in our downstream testing too. It's not possible to tell > whether a failed regression test result is an implementation issue in the > runtime code or a test implementation problem. > I've checked when and what closes the S3AFileSystem with a slightly modified > version of S3AFileSystem which logs the closers of the fs if an error should > occur. I'll attach this modified java file for reference. 
See the next > example of the result when it's running: > {noformat} > 2018-10-04 00:52:25,596 [Thread-4201] ERROR s3a.S3ACloseEnforcedFileSystem > (S3ACloseEnforcedFileSystem.java:checkIfClosed(74)) - Use after close(): > java.lang.RuntimeException: Using closed FS!. > at > org.apache.hadoop.fs.s3a.S3ACloseEnforcedFileSystem.checkIfClosed(S3ACloseEnforcedFileSystem.java:73) > at > org.apache.hadoop.fs.s3a.S3ACloseEnforcedFileSystem.mkdirs(S3ACloseEnforcedFileSystem.java:474) > at > org.apache.hadoop.fs.contract.AbstractFSContractTestBase.mkdirs(AbstractFSContractTestBase.java:338) > at > org.apache.hadoop.fs.contract.AbstractFSContractTestBase.setup(AbstractFSContractTestBase.java:193) > at >
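The isolation rule this points at (create a unique instance, close it in teardown, never share it via the static cache) can be sketched in the same self-contained style; IsolatedFs is a hypothetical stand-in, not a Hadoop class:

```java
import java.io.Closeable;
import java.io.IOException;

// Toy per-test file system: unlike a cached instance, each test owns one.
class IsolatedFs implements Closeable {
    private boolean closed;
    void mkdirs(String path) throws IOException {
        if (closed) {
            throw new IOException("FileSystem is closed!");
        }
    }
    @Override
    public void close() { closed = true; }
}

public class IsolationSketch {
    public static void main(String[] args) throws IOException {
        // try-with-resources plays the role of create-in-setup, close-in-teardown
        try (IsolatedFs fs = new IsolatedFs()) {
            fs.mkdirs("/test");   // safe: no other test holds this instance
        }
        // a second test gets its own fresh instance; the closed one is gone
        try (IsolatedFs fs = new IsolatedFs()) {
            fs.mkdirs("/test2");
        }
        System.out.println("both tests passed");
    }
}
```

Because each test closes only what it created, a closed instance can never leak into another test's setup, which is exactly the failure mode in the logs above.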
[jira] [Commented] (HADOOP-15999) [s3a] Better support for out-of-band operations
[ https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16734126#comment-16734126 ] Gabor Bota commented on HADOOP-15999: - I'm going to fix getFileStatus: if S3Guard is not in authoritative mode, we should check S3 for the file status as well. If the metadata from S3 is more recent than what we have on the MS, we should update the MS and return the fresher metadata from S3. I will extend this with inconsistency detection metrics in HADOOP-15779. Also, if we are not running in authoritative mode, all operations will be slower. If MS is authoritative, then we don't have to read S3. Adding more config knobs to this would make this thing over-complicated. There should be an fsck op to sync things (you are right [~ste...@apache.org]), so if a customer runs with S3Guard auth mode and let's say they made some out-of-band operation like a delete without ms, there's a way to sync with the tool. Note: if there's an out-of-band operation, right now it's easier to remove the whole ms with prune than to handle ms ddb records one by one. > [s3a] Better support for out-of-band operations > --- > > Key: HADOOP-15999 > URL: https://issues.apache.org/jira/browse/HADOOP-15999 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.1.0 >Reporter: Sean Mackrory >Assignee: Gabor Bota >Priority: Major > Attachments: out-of-band-operations.patch > > > S3Guard was initially done on the premise that a new MetadataStore would be > the source of truth, and that it wouldn't provide guarantees if updates were > done without using S3Guard. > I've been seeing increased demand for better support for scenarios where > operations are done on the data that can't reasonably be done with S3Guard > involved. For example: > * A file is deleted using S3Guard, and replaced by some other tool. 
S3Guard > can't tell the difference between the new file and delete / list > inconsistency and continues to treat the file as deleted. > * An S3Guard-ed file is overwritten by a longer file by some other tool. When > reading the file, only the length of the original file is read. > We could possibly have smarter behavior here by querying both S3 and the > MetadataStore (even in cases where we may currently only query the > MetadataStore in getFileStatus) and use whichever one has the higher modified > time. > This kills the performance boost we currently get in some workloads with the > short-circuited getFileStatus, but we could keep it with authoritative mode > which should give a larger performance boost. At least we'd get more > correctness without authoritative mode and a clear declaration of when we can > make the assumptions required to short-circuit the process. If we can't > consider S3Guard the source of truth, we need to defer to S3 more. > We'd need to be extra sure of any locality / time zone issues if we start > relying on mod_time more directly, but currently we're tracking the > modification time as returned by S3 anyway.
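The "use whichever one has the higher modified time" idea can be sketched as plain Java. Status and pickFresher are hypothetical names for illustration, and the real fix would also write the fresher S3 entry back into the MetadataStore:

```java
// Minimal model of a file status carrying the two fields the comparison needs.
class Status {
    final long modTime;
    final long length;
    Status(long modTime, long length) {
        this.modTime = modTime;
        this.length = length;
    }
}

public class FreshnessSketch {
    // Prefer the entry with the higher modification time; fall back to
    // whichever store actually has an entry.
    static Status pickFresher(Status fromMetadataStore, Status fromS3) {
        if (fromMetadataStore == null) return fromS3;
        if (fromS3 == null) return fromMetadataStore;
        return fromS3.modTime > fromMetadataStore.modTime ? fromS3 : fromMetadataStore;
    }

    public static void main(String[] args) {
        Status ms = new Status(1000L, 100L); // stale MS entry: original short file
        Status s3 = new Status(2000L, 500L); // out-of-band overwrite, longer file
        Status result = pickFresher(ms, s3);
        System.out.println(result.length);   // prints 500: the fresher entry wins
    }
}
```

This captures why the overwrite scenario in the description stops truncating reads: the longer, newer S3 entry beats the stale MetadataStore record, at the cost of an extra S3 HEAD per getFileStatus when not running authoritative.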
[jira] [Comment Edited] (HADOOP-16019) ZKDelegationTokenSecretManager won't log exception message occured in function setJaasConfiguration
[ https://issues.apache.org/jira/browse/HADOOP-16019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16734060#comment-16734060 ] Steve Loughran edited comment on HADOOP-16019 at 1/4/19 11:41 AM: -- checkstyle is line too long, please break I'd prefer you just use + {code} ("Could not Load ZK acls or aut: " + ex, ex) {code} Why? Not all exceptions have a getMessage value (e.g. NullPointerException). was (Author: ste...@apache.org): checkstyle is line too long, please break Given the exception is being logged, no need for the +ex.getMessage() on the string. If you really want it, I'd prefer ex.toString (not all exceptions have a message, with NPE being the key example). You can go {code} LOG.warn("text {}", e, e) {code} and have the toString invoked only on demand. > ZKDelegationTokenSecretManager won't log exception message occured in > function setJaasConfiguration > --- > > Key: HADOOP-16019 > URL: https://issues.apache.org/jira/browse/HADOOP-16019 > Project: Hadoop Common > Issue Type: Improvement > Components: common >Affects Versions: 3.1.0 >Reporter: luhuachao >Assignee: luhuachao >Priority: Minor > Attachments: HADOOP-16019.1.patch, HADOOP-16019.2.patch > > > * when the config ZK_DTSM_ZK_KERBEROS_KEYTAB or > ZK_DTSM_ZK_KERBEROS_PRINCIPAL are not set, the IllegalArgumentException > message cannot be logged.
[jira] [Assigned] (HADOOP-16019) ZKDelegationTokenSecretManager won't log exception message occured in function setJaasConfiguration
[ https://issues.apache.org/jira/browse/HADOOP-16019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran reassigned HADOOP-16019: --- Assignee: luhuachao > ZKDelegationTokenSecretManager won't log exception message occured in > function setJaasConfiguration > --- > > Key: HADOOP-16019 > URL: https://issues.apache.org/jira/browse/HADOOP-16019 > Project: Hadoop Common > Issue Type: Improvement > Components: common >Affects Versions: 3.1.0 >Reporter: luhuachao >Assignee: luhuachao >Priority: Minor > Attachments: HADOOP-16019.1.patch, HADOOP-16019.2.patch > > > * when the config ZK_DTSM_ZK_KERBEROS_KEYTAB or > ZK_DTSM_ZK_KERBEROS_PRINCIPAL are not set, the IllegalArgumentException > message cannot be logged.
[jira] [Commented] (HADOOP-16019) ZKDelegationTokenSecretManager won't log exception message occured in function setJaasConfiguration
[ https://issues.apache.org/jira/browse/HADOOP-16019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16734060#comment-16734060 ] Steve Loughran commented on HADOOP-16019: - checkstyle is line too long, please break Given the exception is being logged, no need for the +ex.getMessage() on the string. If you really want it, I'd prefer ex.toString (not all exceptions have a message, with NPE being the key example). You can go {code} LOG.warn("text {}", e, e) {code} and have the toString invoked only on demand. > ZKDelegationTokenSecretManager won't log exception message occured in > function setJaasConfiguration > --- > > Key: HADOOP-16019 > URL: https://issues.apache.org/jira/browse/HADOOP-16019 > Project: Hadoop Common > Issue Type: Improvement > Components: common >Affects Versions: 3.1.0 >Reporter: luhuachao >Priority: Minor > Attachments: HADOOP-16019.1.patch, HADOOP-16019.2.patch > > > * when the config ZK_DTSM_ZK_KERBEROS_KEYTAB or > ZK_DTSM_ZK_KERBEROS_PRINCIPAL are not set, the IllegalArgumentException > message cannot be logged.
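The getMessage vs toString point above is easy to demonstrate with plain Java: an exception constructed without a message returns null from getMessage(), while toString() always yields at least the class name. (The SLF4J LOG.warn("text {}", e, e) form fills the placeholder via toString() and attaches the stack trace via the trailing throwable, deferring the formatting until the message is actually logged.)

```java
public class MessageVsToString {
    public static void main(String[] args) {
        Exception npe = new NullPointerException();              // no message set
        Exception iae = new IllegalArgumentException("bad value");

        System.out.println("NPE getMessage: " + npe.getMessage());
        // NPE getMessage: null  <- this is what makes +ex.getMessage() unsafe
        System.out.println("NPE toString:   " + npe);
        // NPE toString:   java.lang.NullPointerException
        System.out.println("IAE getMessage: " + iae.getMessage());
        // IAE getMessage: bad value
        System.out.println("IAE toString:   " + iae);
        // IAE toString:   java.lang.IllegalArgumentException: bad value
    }
}
```

So logging "text: " + ex (i.e. toString()) never produces a misleading "text: null", whereas concatenating getMessage() does for any exception raised without a message.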