[jira] [Updated] (HADOOP-15565) ViewFileSystem.close doesn't close child filesystems and causes FileSystem objects leak.
[ https://issues.apache.org/jira/browse/HADOOP-15565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jinglun updated HADOOP-15565: - Attachment: HADOOP-15565.0001.patch Status: Patch Available (was: Open) > ViewFileSystem.close doesn't close child filesystems and causes FileSystem > objects leak. > > > Key: HADOOP-15565 > URL: https://issues.apache.org/jira/browse/HADOOP-15565 > Project: Hadoop Common > Issue Type: Bug >Reporter: Jinglun >Priority: Major > Attachments: HADOOP-15565.0001.patch > > > When we create a ViewFileSystem, all its child filesystems are cached in > FileSystem.CACHE. Unless we close these child filesystems, they stay in > FileSystem.CACHE forever. > I think we should let FileSystem.CACHE cache only the ViewFileSystem, and let > the ViewFileSystem cache all its child filesystems. Then we can close a > ViewFileSystem without leaks and without affecting other ViewFileSystems. > I found this problem because I need to re-login to Kerberos and renew my > ViewFileSystem periodically. Because FileSystem.CACHE.Key is based on > UserGroupInformation, which changes every time I re-login, I can't reuse the > cached child filesystems when I create a new ViewFileSystem. And because > ViewFileSystem.close does nothing but remove itself from the cache, I leak all > its child filesystems in the cache. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
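The caching change proposed above can be illustrated with a self-contained toy model; the classes below are plain-Java stand-ins for FileSystem.CACHE and ViewFileSystem (all names hypothetical, not the attached patch). With the current behaviour, close() removes only the view itself and the child entries leak; with the proposed behaviour, close() also closes the children it caches.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy stand-ins for FileSystem.CACHE and ViewFileSystem to illustrate the leak.
public class ViewFsLeakSketch {
    static final Map<String, Fs> CACHE = new HashMap<>();  // plays FileSystem.CACHE

    static class Fs {
        final String key;
        Fs(String key) { this.key = key; }
        void close() { CACHE.remove(key); }                // remove self from the cache
    }

    // A ViewFileSystem holding references to its child filesystems.
    static class ViewFs extends Fs {
        final List<Fs> children = new ArrayList<>();
        final boolean closeChildren;                       // toggles current vs proposed behaviour
        ViewFs(String key, boolean closeChildren) {
            super(key);
            this.closeChildren = closeChildren;
        }
        @Override void close() {
            super.close();
            if (closeChildren) {                           // proposed fix: close children too
                for (Fs child : children) child.close();
            }
        }
    }

    static ViewFs create(String key, boolean closeChildren) {
        ViewFs vfs = new ViewFs(key, closeChildren);
        for (String scheme : new String[]{"hdfs://nn1", "hdfs://nn2"}) {
            Fs child = new Fs(key + "/" + scheme);
            CACHE.put(child.key, child);                   // children land in the cache
            vfs.children.add(child);
        }
        CACHE.put(key, vfs);
        return vfs;
    }

    public static void main(String[] args) {
        create("viewfs-A", false).close();                 // current behaviour
        System.out.println("leaked entries: " + CACHE.size());
        CACHE.clear();
        create("viewfs-B", true).close();                  // proposed behaviour
        System.out.println("remaining entries: " + CACHE.size());
    }
}
```

In real Hadoop, the children are reachable via FileSystem.getChildFileSystems(), so an override of ViewFileSystem.close() along these lines is plausible, but the sketch above makes no claim about the patch's actual implementation.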
[jira] [Created] (HADOOP-15565) ViewFileSystem.close doesn't close child filesystems and causes FileSystem objects leak.
Jinglun created HADOOP-15565: Summary: ViewFileSystem.close doesn't close child filesystems and causes FileSystem objects leak. Key: HADOOP-15565 URL: https://issues.apache.org/jira/browse/HADOOP-15565 Project: Hadoop Common Issue Type: Bug Reporter: Jinglun When we create a ViewFileSystem, all its child filesystems are cached in FileSystem.CACHE. Unless we close these child filesystems, they stay in FileSystem.CACHE forever. I think we should let FileSystem.CACHE cache only the ViewFileSystem, and let the ViewFileSystem cache all its child filesystems. Then we can close a ViewFileSystem without leaks and without affecting other ViewFileSystems. I found this problem because I need to re-login to Kerberos and renew my ViewFileSystem periodically. Because FileSystem.CACHE.Key is based on UserGroupInformation, which changes every time I re-login, I can't reuse the cached child filesystems when I create a new ViewFileSystem. And because ViewFileSystem.close does nothing but remove itself from the cache, I leak all its child filesystems in the cache.
[jira] [Comment Edited] (HADOOP-15559) Clarity on Spark compatibility with hadoop-aws
[ https://issues.apache.org/jira/browse/HADOOP-15559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524469#comment-16524469 ] Nicholas Chammas edited comment on HADOOP-15559 at 6/27/18 2:27 AM: Hi [~ste...@apache.org] and thank you for the thorough response and references. 1. Is [the s3a troubleshooting guide|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/troubleshooting_s3a.md] published anywhere? Or is the GitHub URL the canonical URL? I feel like [S3 Support in Apache Hadoop|https://wiki.apache.org/hadoop/AmazonS3] is the most visible bit of documentation about s3a. It would make sense to link to the troubleshooting guide from there. 2. In my case, I am not adding the AWS SDK individually. By using {{pyspark --packages}} (or {{spark-submit --packages}}) with hadoop-aws, I understand that Spark automatically pulls transitive dependencies for me. So my focus has been to just get the mapping of Spark version to hadoop-aws version correct. Additionally, I am trying really hard to stick to the default release builds of Spark, as opposed to building my own versions of Spark to use with [Flintrock|https://github.com/nchammas/flintrock]. Being able to spin Spark clusters up on EC2 by downloading Spark directly from the Apache mirror network means one less piece of infrastructure I have to maintain myself. So I'm trying not to get into the business of building Spark, though I am aware of {{-Phadoop-cloud}}. Thankfully, it looks like [Spark 2.3.1 built against Hadoop 2.7|http://archive.apache.org/dist/spark/spark-2.3.1/] works with {{--packages "org.apache.hadoop:hadoop-aws:2.7.6"}}, and I suppose according to your comment in SPARK-22919 that is basically the version of hadoop-aws I need to use with these releases as long as Spark is built against Hadoop 2.7. Does that sound about right to you?
> Clarity on Spark compatibility with hadoop-aws > -- > > Key: HADOOP-15559 > URL: https://issues.apache.org/jira/browse/HADOOP-15559 > Project: Hadoop Common > Issue Type: Improvement > Components: documentation, fs/s3 >Reporter: Nicholas Chammas >Priority: Minor > > I'm the maintainer of [Flintrock|https://github.com/nchammas/flintrock], a > command-line tool for launching Apache Spark clusters on AWS. One of the > things I try to do for my users is make it straightforward to use Spark with > {{s3a://}}. I do this by recommending that users start Spark with the > {{hadoop-aws}} package. > For example: > {code:java} > pyspark --packages "org.apache.hadoop:hadoop-aws:2.8.4" > {code} > I'm struggling, however, to understand what versions of {{hadoop-aws}} should > work with what versions of Spark. > Spark releases are [built against Hadoop > 2.7|http://archive.apache.org/dist/spark/spark-2.3.1/]. At the same time, > I've been told that I should be able to use newer versions of Hadoop and > Hadoop libraries with Spark, so for example, running Spark built against > Hadoop 2.7 alongside HDFS 2.8 should work, and
[jira] [Commented] (HADOOP-15559) Clarity on Spark compatibility with hadoop-aws
[ https://issues.apache.org/jira/browse/HADOOP-15559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524469#comment-16524469 ] Nicholas Chammas commented on HADOOP-15559: --- Hi [~ste...@apache.org] and thank you for the thorough response and references. # Is [the s3a troubleshooting guide|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/troubleshooting_s3a.md] published anywhere? Or is the GitHub URL the canonical URL? I feel like [S3 Support in Apache Hadoop|https://wiki.apache.org/hadoop/AmazonS3] is the most visible bit of documentation about s3a. It would make sense to link to the troubleshooting guide from there. # In my case, I am not adding the AWS SDK individually. By using {{pyspark --packages}} (or {{spark-submit --packages}}) with hadoop-aws, I understand that Spark automatically pulls transitive dependencies for me. So my focus has been to just get the mapping of Spark version to hadoop-aws version correct. Additionally, I am trying really hard to stick to the default release builds of Spark, as opposed to building my own versions of Spark to use with [Flintrock|https://github.com/nchammas/flintrock]. Being able to spin Spark clusters up on EC2 by downloading Spark directly from the Apache mirror network means one less piece of infrastructure I have to maintain myself. So I'm trying not to get into the business of building Spark, though I am aware of {{-Phadoop-cloud}}. Thankfully, it looks like [Spark 2.3.1 built against Hadoop 2.7|http://archive.apache.org/dist/spark/spark-2.3.1/] works with {{--packages "org.apache.hadoop:hadoop-aws:2.7.6"}}, and I suppose according to your comment in SPARK-22919 that is basically the version of hadoop-aws I need to use with these releases as long as Spark is built against Hadoop 2.7. Does that sound about right to you?
> Clarity on Spark compatibility with hadoop-aws > -- > > Key: HADOOP-15559 > URL: https://issues.apache.org/jira/browse/HADOOP-15559 > Project: Hadoop Common > Issue Type: Improvement > Components: documentation, fs/s3 >Reporter: Nicholas Chammas >Priority: Minor > > I'm the maintainer of [Flintrock|https://github.com/nchammas/flintrock], a > command-line tool for launching Apache Spark clusters on AWS. One of the > things I try to do for my users is make it straightforward to use Spark with > {{s3a://}}. I do this by recommending that users start Spark with the > {{hadoop-aws}} package. > For example: > {code:java} > pyspark --packages "org.apache.hadoop:hadoop-aws:2.8.4" > {code} > I'm struggling, however, to understand what versions of {{hadoop-aws}} should > work with what versions of Spark. > Spark releases are [built against Hadoop > 2.7|http://archive.apache.org/dist/spark/spark-2.3.1/]. At the same time, > I've been told that I should be able to use newer versions of Hadoop and > Hadoop libraries with Spark, so for example, running Spark built against > Hadoop 2.7 alongside HDFS 2.8 should work, and there is [no need to build > Spark explicitly against Hadoop > 2.8|http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Spark-2-3-1-RC4-tp24087p24092.html]. > I'm having trouble translating this mental model into recommendations for how > to pair Spark with {{hadoop-aws}}. > For example, Spark 2.3.1 built against Hadoop 2.7 works with > {{hadoop-aws:2.7.6}} but not with {{hadoop-aws:2.8.4}}. Trying the latter > yields the following error when I try to access files via {{s3a://}}. > {code:java} > py4j.protocol.Py4JJavaError: An error occurred while calling o35.text. 
> : java.lang.IllegalAccessError: tried to access method > org.apache.hadoop.metrics2.lib.MutableCounterLong.(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V > from class org.apache.hadoop.fs.s3a.S3AInstrumentation > at > org.apache.hadoop.fs.s3a.S3AInstrumentation.streamCounter(S3AInstrumentation.java:194) > at > org.apache.hadoop.fs.s3a.S3AInstrumentation.streamCounter(S3AInstrumentation.java:216) > at > org.apache.hadoop.fs.s3a.S3AInstrumentation.(S3AInstrumentation.java:139) > at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:174) > at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669) > at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94) > at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703) > at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685) > at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373) > at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295) > at > org.apache.spark.sql.execution.streaming.FileStreamSink$.hasMetadata(FileStreamSink.scala:45) > at > org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:354) > at >
[jira] [Commented] (HADOOP-15495) Upgrade commons-lang version to 3.7 in hadoop-common-project and hadoop-tools
[ https://issues.apache.org/jira/browse/HADOOP-15495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524456#comment-16524456 ] Akira Ajisaka commented on HADOOP-15495: LGTM, +1. Thanks [~tasanuma0829]! > Upgrade commons-lang version to 3.7 in hadoop-common-project and hadoop-tools > - > > Key: HADOOP-15495 > URL: https://issues.apache.org/jira/browse/HADOOP-15495 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Takanobu Asanuma >Assignee: Takanobu Asanuma >Priority: Major > Attachments: HADOOP-15495.1.patch, HADOOP-15495.2.patch, > HADOOP-15495.3.patch, HADOOP-15495.4.patch > > > commons-lang 2.6 is widely used. Let's upgrade to 3.7. > This jira is separated from HADOOP-10783.
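For context, moving from commons-lang 2.x to 3.x is a coordinate change as well as a version bump, since the 3.x line lives under new Maven coordinates (org.apache.commons:commons-lang3) and a new package (org.apache.commons.lang3). A hedged pom.xml sketch of the kind of dependency swap involved; exact module placement is an assumption:

```xml
<!-- old (commons-lang 2.6): -->
<!--
<dependency>
  <groupId>commons-lang</groupId>
  <artifactId>commons-lang</artifactId>
  <version>2.6</version>
</dependency>
-->
<!-- new (commons-lang3 3.7): -->
<dependency>
  <groupId>org.apache.commons</groupId>
  <artifactId>commons-lang3</artifactId>
  <version>3.7</version>
</dependency>
```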
[jira] [Commented] (HADOOP-15124) Slow FileSystem.Statistics counters implementation
[ https://issues.apache.org/jira/browse/HADOOP-15124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524425#comment-16524425 ] Igor Dvorzhak commented on HADOOP-15124: If I have time I will test your patch to compare performance, but from looking at it, it seems the AtomicLong-based implementation should be faster because it's lock-free. The main performance bottleneck in my patch right now is access to ThreadLocal, but it seems nothing can be done about that. > Slow FileSystem.Statistics counters implementation > -- > > Key: HADOOP-15124 > URL: https://issues.apache.org/jira/browse/HADOOP-15124 > Project: Hadoop Common > Issue Type: Sub-task > Components: common >Affects Versions: 2.9.0, 2.8.3, 2.7.5, 3.0.0, 3.1.0 >Reporter: Igor Dvorzhak >Assignee: Igor Dvorzhak >Priority: Major > Labels: common, filesystem, fs, statistics > Attachments: HADOOP-15124.001.patch > > > While profiling a 1TB TeraGen job on a Hadoop 2.8.2 cluster (Google Dataproc, 2 > workers, GCS connector) I saw that FileSystem.Statistics code paths Wall time > is 5.58% and CPU time is 26.5% of total execution time. > After switching the FileSystem.Statistics implementation to LongAdder, consumed > Wall time decreased to 0.006% and CPU time to 0.104% of total execution time. > Total job runtime decreased from 66 mins to 61 mins. > These results are not conclusive, because I didn't benchmark multiple times > to average results, but regardless of performance gains, switching to > LongAdder simplifies the code and reduces its complexity.
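The LongAdder approach described in the issue can be sketched in a self-contained way (an illustrative stand-in for a FileSystem.Statistics-style counter, not the attached patch): LongAdder stripes increments across internal cells and only sums them on read, so concurrent writers do not contend on the single CAS loop that AtomicLong uses.

```java
import java.util.concurrent.atomic.LongAdder;

// Minimal stand-in for a FileSystem.Statistics-style counter backed by LongAdder.
public class StatisticsSketch {
    private final LongAdder bytesRead = new LongAdder();

    void incrementBytesRead(long n) { bytesRead.add(n); }  // hot path: striped, low contention
    long getBytesRead() { return bytesRead.sum(); }        // read path: sums the cells

    public static void main(String[] args) throws InterruptedException {
        StatisticsSketch stats = new StatisticsSketch();
        Thread[] workers = new Thread[8];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < 100_000; j++) stats.incrementBytesRead(1);
            });
            workers[i].start();
        }
        for (Thread t : workers) t.join();
        System.out.println("bytesRead = " + stats.getBytesRead());
    }
}
```

The trade-off is that sum() is not a snapshot under concurrent updates, which is usually acceptable for statistics counters like these.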
[jira] [Comment Edited] (HADOOP-15124) Slow FileSystem.Statistics counters implementation
[ https://issues.apache.org/jira/browse/HADOOP-15124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524425#comment-16524425 ] Igor Dvorzhak edited comment on HADOOP-15124 at 6/27/18 1:21 AM: - If I have time I will test your patch to compare performance, but from looking at it, it seems the AtomicLong-based implementation should be faster because it's lock-free. The main performance bottleneck in my patch right now is access to ThreadLocal, but it seems nothing can be done about that. > Slow FileSystem.Statistics counters implementation > -- > > Key: HADOOP-15124 > URL: https://issues.apache.org/jira/browse/HADOOP-15124 > Project: Hadoop Common > Issue Type: Sub-task > Components: common >Affects Versions: 2.9.0, 2.8.3, 2.7.5, 3.0.0, 3.1.0 >Reporter: Igor Dvorzhak >Assignee: Igor Dvorzhak >Priority: Major > Labels: common, filesystem, fs, statistics > Attachments: HADOOP-15124.001.patch > > > While profiling a 1TB TeraGen job on a Hadoop 2.8.2 cluster (Google Dataproc, 2 > workers, GCS connector) I saw that FileSystem.Statistics code paths Wall time > is 5.58% and CPU time is 26.5% of total execution time. > After switching the FileSystem.Statistics implementation to LongAdder, consumed > Wall time decreased to 0.006% and CPU time to 0.104% of total execution time. > Total job runtime decreased from 66 mins to 61 mins. > These results are not conclusive, because I didn't benchmark multiple times > to average results, but regardless of performance gains, switching to > LongAdder simplifies the code and reduces its complexity.
[jira] [Commented] (HADOOP-15564) Classloading Shell should not run a subprocess
[ https://issues.apache.org/jira/browse/HADOOP-15564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524352#comment-16524352 ] Todd Lipcon commented on HADOOP-15564: -- One wrinkle is that Shell is marked as public, so removing the public isSetsidSupported member may be considered a breaking API change. > Classloading Shell should not run a subprocess > -- > > Key: HADOOP-15564 > URL: https://issues.apache.org/jira/browse/HADOOP-15564 > Project: Hadoop Common > Issue Type: Improvement > Components: util >Affects Versions: 3.0.0 >Reporter: Todd Lipcon >Priority: Major > > The 'Shell' class has a static member isSetsidSupported which, in order to > initialize, forks out a subprocess. Various other parts of the code reference > Shell.WINDOWS. For example, the StringUtils class has such a reference. This > means that, during startup, a seemingly fast call like > Configuration.getBoolean() ends up class-loading StringUtils, which > class-loads Shell, which forks out a subprocess. I couldn't measure any big > improvement by fixing this, but it seemed surprising to say the least.
[jira] [Created] (HADOOP-15564) Classloading Shell should not run a subprocess
Todd Lipcon created HADOOP-15564: Summary: Classloading Shell should not run a subprocess Key: HADOOP-15564 URL: https://issues.apache.org/jira/browse/HADOOP-15564 Project: Hadoop Common Issue Type: Improvement Components: util Affects Versions: 3.0.0 Reporter: Todd Lipcon The 'Shell' class has a static member isSetsidSupported which, in order to initialize, forks out a subprocess. Various other parts of the code reference Shell.WINDOWS. For example, the StringUtils class has such a reference. This means that, during startup, a seemingly fast call like Configuration.getBoolean() ends up class-loading StringUtils, which class-loads Shell, which forks out a subprocess. I couldn't measure any big improvement by fixing this, but it seemed surprising to say the least.
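One common idiom for deferring the expensive probe is the lazy-holder pattern, sketched below with a stand-in for the real Shell class (the probe and class names are illustrative, not the actual Hadoop fix). The nested holder class is only initialized when isSetsidSupported() is first called, so merely class-loading the outer class, e.g. to read a cheap constant like WINDOWS, forks nothing, and the public isSetsidSupported() method can keep its signature, addressing the API-compatibility concern.

```java
// Generic lazy-holder idiom: the expensive probe runs on first use,
// not when the outer class is loaded (all names here are illustrative).
public class LazyShellSketch {
    private static boolean probed = false;

    // Plays the role of forking "setsid" to probe for support.
    private static boolean expensiveProbe() {
        probed = true;
        return true;
    }

    // Not initialized until SUPPORTED is first accessed.
    private static final class SetsidHolder {
        static final boolean SUPPORTED = expensiveProbe();
    }

    public static boolean isSetsidSupported() {
        return SetsidHolder.SUPPORTED;      // triggers SetsidHolder init on first call
    }

    public static final boolean WINDOWS = false;  // cheap constant, safe to touch at load time

    public static void main(String[] args) {
        System.out.println("probed before use: " + probed);  // holder untouched so far
        isSetsidSupported();
        System.out.println("probed after use: " + probed);   // probe ran lazily
    }
}
```

The JVM guarantees the holder's static initializer runs exactly once, under the class-init lock, so no extra synchronization is needed.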
[jira] [Updated] (HADOOP-15483) Upgrade jquery to version 3.3.1
[ https://issues.apache.org/jira/browse/HADOOP-15483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil Govindan updated HADOOP-15483: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.1.1 Status: Resolved (was: Patch Available) Committed to 3.1. Thanks [~msingh] > Upgrade jquery to version 3.3.1 > --- > > Key: HADOOP-15483 > URL: https://issues.apache.org/jira/browse/HADOOP-15483 > Project: Hadoop Common > Issue Type: Task >Reporter: Lokesh Jain >Assignee: Lokesh Jain >Priority: Major > Fix For: 3.2.0, 3.1.1 > > Attachments: HADOOP-15483-branch-3.1.001.patch, > HADOOP-15483.001.patch, HADOOP-15483.002.patch, HADOOP-15483.003.patch, > HADOOP-15483.004.patch, HADOOP-15483.005.patch, HADOOP-15483.006.patch, > HADOOP-15483.007.patch, HADOOP-15483.008.patch > > > This Jira aims to upgrade jquery to version 3.3.1.
[jira] [Commented] (HADOOP-15518) Authentication filter calling handler after request already authenticated
[ https://issues.apache.org/jira/browse/HADOOP-15518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524265#comment-16524265 ] Vinod Kumar Vavilapalli commented on HADOOP-15518: -- bq. I have changed getUserPrincipal to getRemoteUser and this change seems to work fine. That was my first solution too. [~kminder], would that work? > Authentication filter calling handler after request already authenticated > - > > Key: HADOOP-15518 > URL: https://issues.apache.org/jira/browse/HADOOP-15518 > Project: Hadoop Common > Issue Type: Bug > Components: security >Affects Versions: 2.7.1 >Reporter: Kevin Minder >Assignee: Kevin Minder >Priority: Major > Attachments: HADOOP-15518-001.patch > > > The hadoop-auth AuthenticationFilter will invoke its handler even if a prior > successful authentication has occurred in the current request. This > primarily affects situations where multiple authentication mechanisms have > been configured. For example, when core-site.xml has > hadoop.http.authentication.type=kerberos and yarn-site.xml has > yarn.timeline-service.http-authentication.type=kerberos, the result is an > attempt to perform two Kerberos authentications for the same request. This > in turn results in Kerberos triggering replay-attack detection. The > javadocs for AuthenticationHandler > ([https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/server/AuthenticationHandler.java)] > indicate for the authenticate method that > {quote}This method is invoked by the AuthenticationFilter only if the HTTP > client request is not yet authenticated. > {quote} > This does not appear to be the case in practice. > I've created a patch and tested it on a limited number of functional use cases > (e.g. the timeline-service issue noted above). If there is general agreement > that the change is valid I'll add unit tests to the patch.
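The behaviour under discussion, skipping the handler when the request already carries an authenticated user, can be sketched without the servlet API; the filter, handler, and getRemoteUser() below are simplified stand-ins, and the actual patch may differ. The point is that the second filter in the chain sees a non-null remote user and never triggers a second Kerberos exchange (and hence no replay-attack detection).

```java
// Simplified stand-ins for AuthenticationFilter and its handler: the filter
// skips the handler when the request is already authenticated.
public class AuthFilterSketch {
    static class Request {
        String remoteUser;                       // plays HttpServletRequest.getRemoteUser()
        String getRemoteUser() { return remoteUser; }
    }

    interface AuthHandler {
        String authenticate(Request req);        // e.g. a Kerberos SPNEGO exchange
    }

    static int handlerInvocations = 0;

    static String doFilter(Request req, AuthHandler handler) {
        if (req.getRemoteUser() != null) {
            return req.getRemoteUser();          // already authenticated: do not re-authenticate
        }
        handlerInvocations++;
        String user = handler.authenticate(req);
        req.remoteUser = user;
        return user;
    }

    public static void main(String[] args) {
        AuthHandler kerberos = req -> "alice";   // pretend the Kerberos exchange succeeded
        Request req = new Request();
        doFilter(req, kerberos);                 // first filter authenticates
        doFilter(req, kerberos);                 // second filter must not run the handler again
        System.out.println("handler invocations: " + handlerInvocations);
    }
}
```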
[jira] [Commented] (HADOOP-15552) Move logging APIs over to slf4j in hadoop-tools - Part2
[ https://issues.apache.org/jira/browse/HADOOP-15552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524250#comment-16524250 ] genericqa commented on HADOOP-15552: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 30 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 54s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 29m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 7m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 19m 0s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 9m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 5m 39s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 19s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 28m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 28m 33s{color} | {color:green} root generated 0 new + 1557 unchanged - 6 fixed = 1557 total (was 1563) {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 7m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 35s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 11m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 5m 57s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 16s{color} | {color:green} hadoop-common in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 5m 50s{color} | {color:green} hadoop-streaming in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 33s{color} | {color:green} hadoop-distcp in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 5s{color} | {color:green} hadoop-archives in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 52s{color} | {color:green} hadoop-archive-logs in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 37s{color} | {color:green} hadoop-rumen in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 15m 17s{color} | {color:green} hadoop-gridmix in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 43s{color} | {color:green} hadoop-datajoin in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 3s{color} | {color:green} hadoop-extras in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} |
[jira] [Commented] (HADOOP-14624) Add GenericTestUtils.DelayAnswer that accept slf4j logger API
[ https://issues.apache.org/jira/browse/HADOOP-14624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524242#comment-16524242 ] genericqa commented on HADOOP-14624:
| (x) *{color:red}-1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 13 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 52s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 28m 8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 29m 19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 4s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 57s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 28m 35s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 28m 35s{color} | {color:red} root generated 1 new + 1561 unchanged - 2 fixed = 1562 total (was 1563) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 18s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 1 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 3s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 57s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 43s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 82m 5s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 45s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}223m 22s{color} | {color:black} {color} |
\\ \\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
| | hadoop.hdfs.server.namenode.TestStartup |
\\ \\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | HADOOP-14624 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12929234/HADOOP-14624.003.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux ac0c9cb096c0 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 238fe00 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 | |
[jira] [Created] (HADOOP-15563) s3guard init and set-capacity to support DDB autoscaling
Steve Loughran created HADOOP-15563: --- Summary: s3guard init and set-capacity to support DDB autoscaling Key: HADOOP-15563 URL: https://issues.apache.org/jira/browse/HADOOP-15563 Project: Hadoop Common Issue Type: Sub-task Components: fs/s3 Affects Versions: 3.1.0 Environment: To keep costs down on DDB, autoscaling is a key feature: you set the max values and when idle, you don't get billed, *at the cost of delayed scale time and risk of not getting the max value when AWS is busy* It can be done from the AWS web UI, but not in the s3guard init and set-capacity calls It can be done [through the API|https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/AutoScaling.HowTo.SDK.html] Usual issues then: wiring up, CLI params, testing. It'll be hard to test. Reporter: Steve Loughran -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15483) Upgrade jquery to version 3.3.1
[ https://issues.apache.org/jira/browse/HADOOP-15483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524079#comment-16524079 ] Sunil Govindan commented on HADOOP-15483: - The whitespace warning is from jquery.js, which is external. The test case failure does not seem related either. Committing this to 3.1. cc/ [~leftnoteasy] > Upgrade jquery to version 3.3.1 > --- > > Key: HADOOP-15483 > URL: https://issues.apache.org/jira/browse/HADOOP-15483 > Project: Hadoop Common > Issue Type: Task >Reporter: Lokesh Jain >Assignee: Lokesh Jain >Priority: Major > Fix For: 3.2.0 > > Attachments: HADOOP-15483-branch-3.1.001.patch, > HADOOP-15483.001.patch, HADOOP-15483.002.patch, HADOOP-15483.003.patch, > HADOOP-15483.004.patch, HADOOP-15483.005.patch, HADOOP-15483.006.patch, > HADOOP-15483.007.patch, HADOOP-15483.008.patch > > > This Jira aims to upgrade jquery to version 3.3.1. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-14624) Add GenericTestUtils.DelayAnswer that accept slf4j logger API
[ https://issues.apache.org/jira/browse/HADOOP-14624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ian Pickering updated HADOOP-14624: --- Attachment: HADOOP-14624.003.patch > Add GenericTestUtils.DelayAnswer that accept slf4j logger API > - > > Key: HADOOP-14624 > URL: https://issues.apache.org/jira/browse/HADOOP-14624 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Wenxin He >Assignee: Wenxin He >Priority: Major > Attachments: HADOOP-14624.001.patch, HADOOP-14624.002.patch, > HADOOP-14624.003.patch > > > Split from HADOOP-14539. > Now GenericTestUtils.DelayAnswer only accepts commons-logging logger API. Now > we are migrating the APIs to slf4j, slf4j logger API should be accepted as > well. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-14624) Add GenericTestUtils.DelayAnswer that accept slf4j logger API
[ https://issues.apache.org/jira/browse/HADOOP-14624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ian Pickering updated HADOOP-14624: --- Attachment: HADOOP-12956.003.patch > Add GenericTestUtils.DelayAnswer that accept slf4j logger API > - > > Key: HADOOP-14624 > URL: https://issues.apache.org/jira/browse/HADOOP-14624 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Wenxin He >Assignee: Wenxin He >Priority: Major > Attachments: HADOOP-14624.001.patch, HADOOP-14624.002.patch, > HADOOP-14624.003.patch > > > Split from HADOOP-14539. > Now GenericTestUtils.DelayAnswer only accepts commons-logging logger API. Now > we are migrating the APIs to slf4j, slf4j logger API should be accepted as > well. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-14624) Add GenericTestUtils.DelayAnswer that accept slf4j logger API
[ https://issues.apache.org/jira/browse/HADOOP-14624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ian Pickering updated HADOOP-14624: --- Attachment: (was: HADOOP-12956.003.patch) > Add GenericTestUtils.DelayAnswer that accept slf4j logger API > - > > Key: HADOOP-14624 > URL: https://issues.apache.org/jira/browse/HADOOP-14624 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Wenxin He >Assignee: Wenxin He >Priority: Major > Attachments: HADOOP-14624.001.patch, HADOOP-14624.002.patch, > HADOOP-14624.003.patch > > > Split from HADOOP-14539. > Now GenericTestUtils.DelayAnswer only accepts commons-logging logger API. Now > we are migrating the APIs to slf4j, slf4j logger API should be accepted as > well. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15546) ABFS: tune imports & javadocs
[ https://issues.apache.org/jira/browse/HADOOP-15546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524061#comment-16524061 ] Steve Loughran commented on HADOOP-15546: - shadedclient is not something I'm looking at right now; we'll have to deal with it at merge time to see if it's a real issue > ABFS: tune imports & javadocs > - > > Key: HADOOP-15546 > URL: https://issues.apache.org/jira/browse/HADOOP-15546 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/azure >Affects Versions: HADOOP-15407 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Major > Attachments: HADOOP-15546-001.patch, > HADOOP-15546-HADOOP-15407-001.patch > > > Followup on HADOOP-15540 with some initial review tuning > * ordering of imports > * rely on azure-auth-keys.xml to store credentials (change imports, > docs,.gitignore) > * log4j -> info > * add a "." to the first sentence of all the javadocs I noticed. > * remove @Public annotations except for some constants (which includes some > commitment to maintain them). > * move the AbstractFS declarations out of the src/test/resources XML file > into core-default.xml for all to use > * other IDE-suggested tweaks > No actual code changes here; just setting things up better for >1 person > editing & testing -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15552) Move logging APIs over to slf4j in hadoop-tools - Part2
[ https://issues.apache.org/jira/browse/HADOOP-15552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523965#comment-16523965 ] Giovanni Matteo Fumarola commented on HADOOP-15552: --- Thanks [~ste...@apache.org] , we should exclude the Azure parts as you suggested in a previous comment. [~iapicker] can you update the patch to remove the changes in Azure package? > Move logging APIs over to slf4j in hadoop-tools - Part2 > --- > > Key: HADOOP-15552 > URL: https://issues.apache.org/jira/browse/HADOOP-15552 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Ian Pickering >Priority: Major > Attachments: HADOOP-15552.v1.patch, HADOOP-15552.v2.patch > > > Some classes in Hadoop-tools were not moved to slf4j > e.g. AliyunOSSInputStream.java, HadoopArchiveLogs.java, > HadoopArchiveLogsRunner.java -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15552) Move logging APIs over to slf4j in hadoop-tools - Part2
[ https://issues.apache.org/jira/browse/HADOOP-15552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523947#comment-16523947 ] Steve Loughran commented on HADOOP-15552: - OK, you are going near Azure, so I'm going to have to insist on the "declare which Azure endpoint you've run all the integration tests on" process, sorry. You'll have to get set up to run those tests & make sure everything still works. > Move logging APIs over to slf4j in hadoop-tools - Part2 > --- > > Key: HADOOP-15552 > URL: https://issues.apache.org/jira/browse/HADOOP-15552 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Ian Pickering >Priority: Major > Attachments: HADOOP-15552.v1.patch, HADOOP-15552.v2.patch > > > Some classes in Hadoop-tools were not moved to slf4j > e.g. AliyunOSSInputStream.java, HadoopArchiveLogs.java, > HadoopArchiveLogsRunner.java -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15562) Add Software Load Balancing support for Hadoop services
[ https://issues.apache.org/jira/browse/HADOOP-15562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523941#comment-16523941 ] Íñigo Goiri commented on HADOOP-15562: -- This should support other services like: * The Router: we should be available to check the state of the Router. * The Observer Namenode: we should be able to redirect to both Active and Standby. > Add Software Load Balancing support for Hadoop services > --- > > Key: HADOOP-15562 > URL: https://issues.apache.org/jira/browse/HADOOP-15562 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Giovanni Matteo Fumarola >Priority: Minor > > We put HA services behind Software Load Balancers (SLBs), so the SLB only > redirects to the Active components (e.g., active Namenode). > SLBs usually rely on the return code of an endpoint to determine if a service > is available to serve requests. > Currently, one can use already existing interfaces like JMX to check the > status of the service. > However, there is a need to do the mapping between the JMX values and the > state of the service. > We should provide an interface (potentially REST) to check for particular JMX > (or a new ones) values and report a particular HTTP code (e.g., 200). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
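The JMX-to-HTTP mapping proposed in this thread could look roughly like the sketch below. All names here (the class, the {{State}} enum, {{healthCode}}) are hypothetical illustrations, not existing Hadoop APIs; the real implementation would read the HA state from the service's JMX beans.

```java
/**
 * Hypothetical sketch of the SLB health endpoint discussed in HADOOP-15562:
 * translate a service's reported HA state into an HTTP status code a
 * software load balancer can act on. Illustrative only -- not Hadoop code.
 */
class SlbHealthCheck {
  /** Possible HA states a Namenode/Router/RM-style service might report. */
  enum State { ACTIVE, STANDBY, OBSERVER, INITIALIZING, STOPPED }

  /**
   * 200 means "route traffic here"; 503 means "healthy check endpoint, but
   * do not route". SLBs typically only look at the status code.
   */
  static int healthCode(State state) {
    switch (state) {
      case ACTIVE:
        return 200;   // the active instance serves requests
      case STANDBY:
      case OBSERVER:
        return 503;   // alive, but not the instance to route to
      default:
        return 503;   // starting up or stopped: drain
    }
  }
}
```

A variant for the Observer Namenode case mentioned above would simply return 200 for both ACTIVE and OBSERVER on a read-path endpoint.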
[jira] [Updated] (HADOOP-15562) Add Software Load Balancing support for Hadoop services
[ https://issues.apache.org/jira/browse/HADOOP-15562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated HADOOP-15562: - Description: We put HA services behind Software Load Balancers (SLBs), so the SLB only redirects to the Active components (e.g., active Namenode). SLBs usually rely on the return code of an endpoint to determine if a service is available to serve requests. Currently, one can use already existing interfaces like JMX to check the status of the service. However, there is a need to do the mapping between the JMX values and the state of the service. We should provide an interface (potentially REST) to check for particular JMX (or a new ones) values and report a particular HTTP code (e.g., 200). was: Software Load Balancers usually rely on the return code of an endpoint to determine if a service is available to serve requests. Currently, one can use interfaces like JMX to check the status of the service. However, there is a need to do the mapping between the JMX values and the state of the service. We should provide an interface (potentially REST) to check for particular JMX (or a new ones) values and report a particular HTTP code (e.g., 200). > Add Software Load Balancing support for Hadoop services > --- > > Key: HADOOP-15562 > URL: https://issues.apache.org/jira/browse/HADOOP-15562 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Giovanni Matteo Fumarola >Priority: Minor > > We put HA services behind Software Load Balancers (SLBs), so the SLB only > redirects to the Active components (e.g., active Namenode). > SLBs usually rely on the return code of an endpoint to determine if a service > is available to serve requests. > Currently, one can use already existing interfaces like JMX to check the > status of the service. > However, there is a need to do the mapping between the JMX values and the > state of the service. 
> We should provide an interface (potentially REST) to check for particular JMX > (or a new ones) values and report a particular HTTP code (e.g., 200). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14624) Add GenericTestUtils.DelayAnswer that accept slf4j logger API
[ https://issues.apache.org/jira/browse/HADOOP-14624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523940#comment-16523940 ] Steve Loughran commented on HADOOP-14624: - I'm happy with a patch going in; looks like the work stuttered here but there's nothing blocking it at a larger scale. Updated patch? Go for it, though Wenxin He will still get the credit for all their work > Add GenericTestUtils.DelayAnswer that accept slf4j logger API > - > > Key: HADOOP-14624 > URL: https://issues.apache.org/jira/browse/HADOOP-14624 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Wenxin He >Assignee: Wenxin He >Priority: Major > Attachments: HADOOP-14624.001.patch, HADOOP-14624.002.patch > > > Split from HADOOP-14539. > Now GenericTestUtils.DelayAnswer only accepts commons-logging logger API. Now > we are migrating the APIs to slf4j, slf4j logger API should be accepted as > well. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15562) Add Software Load Balancing support for Hadoop services
[ https://issues.apache.org/jira/browse/HADOOP-15562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated HADOOP-15562: - Summary: Add Software Load Balancing support for Hadoop services (was: Add Software Load Balancing support for Hadoop/Yarn Process) > Add Software Load Balancing support for Hadoop services > --- > > Key: HADOOP-15562 > URL: https://issues.apache.org/jira/browse/HADOOP-15562 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Giovanni Matteo Fumarola >Priority: Minor > > Software Load Balancers usually rely on the return code of an endpoint to > determine if a service is available to serve requests. > Currently, one can use interfaces like JMX to check the status of the service. > However, there is a need to do the mapping between the JMX values and the > state of the service. > We should provide an interface (potentially REST) to check for particular JMX > (or a new ones) values and report a particular HTTP code (e.g., 200). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15562) Add Software Load Balancing support for Hadoop/Yarn Process
[ https://issues.apache.org/jira/browse/HADOOP-15562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated HADOOP-15562: - Description: Software Load Balancers usually rely on the return code of an endpoint to determine if a service is available to serve requests. Currently, one can use interfaces like JMX to check the status of the service. However, there is a need to do the mapping between the JMX values and the state of the service. We should provide an interface (potentially REST) to check for particular JMX (or a new ones) values and report a particular HTTP code (e.g., 200). was: Software Load Balancers usually rely on the return code of an endpoint to determine if a service is available to serve requests. Currently, one can use interfaces like JMX to check the status of the service. However, there is a need to do the mapping between the JMX values and the state of the service. We should provide an interface (potentially REST) to check for particular JMX (or a new ones) values and report a particular HTTP code (e.g., 202). > Add Software Load Balancing support for Hadoop/Yarn Process > --- > > Key: HADOOP-15562 > URL: https://issues.apache.org/jira/browse/HADOOP-15562 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Giovanni Matteo Fumarola >Priority: Minor > > Software Load Balancers usually rely on the return code of an endpoint to > determine if a service is available to serve requests. > Currently, one can use interfaces like JMX to check the status of the service. > However, there is a need to do the mapping between the JMX values and the > state of the service. > We should provide an interface (potentially REST) to check for particular JMX > (or a new ones) values and report a particular HTTP code (e.g., 200). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15562) Add Software Load Balancing support for Hadoop/Yarn Process
[ https://issues.apache.org/jira/browse/HADOOP-15562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523936#comment-16523936 ] Íñigo Goiri commented on HADOOP-15562: -- This would apply to the YARN Resource Manager and the HDFS NameNode. This JIRA should create the generic HTTP method that checks JMX in a generic way. Then we need two more JIRAs for the RM and the NN. > Add Software Load Balancing support for Hadoop/Yarn Process > --- > > Key: HADOOP-15562 > URL: https://issues.apache.org/jira/browse/HADOOP-15562 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Giovanni Matteo Fumarola >Priority: Minor > > Software Load Balancers usually rely on the return code of an endpoint to > determine if a service is available to serve requests. > Currently, one can use interfaces like JMX to check the status of the service. > However, there is a need to do the mapping between the JMX values and the > state of the service. > We should provide an interface (potentially REST) to check for particular JMX > (or a new ones) values and report a particular HTTP code (e.g., 202). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15124) Slow FileSystem.Statistics counters implementation
[ https://issues.apache.org/jira/browse/HADOOP-15124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523933#comment-16523933 ] Todd Lipcon commented on HADOOP-15124: -- Before seeing this JIRA I also happened to have spent some time on this same perf issue. My approach was just to micro-optimize the existing stats implementation:
- use a simple array to iterate over the FS stats instead of iterating over a HashSet (the latter involves a much more complex iterator)
- out-of-line the unlikely path for threadlocal (improve inlining)
- get rid of the visitor abstraction for visiting stats objects (it wasn't getting escape-analyzed out or inlined, and was also causing actual boxing of Longs)
In my teragen tests this also reduced the statistics to a small fraction of the profile. I didn't compare vs a LongAdder approach, though. My patch is at: https://github.com/toddlipcon/hadoop-common/commit/e5bedddabbb9e8729b2f58165f0849c30e2be346 > Slow FileSystem.Statistics counters implementation > -- > > Key: HADOOP-15124 > URL: https://issues.apache.org/jira/browse/HADOOP-15124 > Project: Hadoop Common > Issue Type: Sub-task > Components: common >Affects Versions: 2.9.0, 2.8.3, 2.7.5, 3.0.0, 3.1.0 >Reporter: Igor Dvorzhak >Assignee: Igor Dvorzhak >Priority: Major > Labels: common, filesystem, fs, statistics > Attachments: HADOOP-15124.001.patch > > > While profiling 1TB TeraGen job on Hadoop 2.8.2 cluster (Google Dataproc, 2 > workers, GCS connector) I saw that FileSystem.Statistics code paths Wall time > is 5.58% and CPU time is 26.5% of total execution time. > After switching FileSystem.Statistics implementation to LongAdder, consumed > Wall time decreased to 0.006% and CPU time to 0.104% of total execution time. > Total job runtime decreased from 66 mins to 61 mins. 
> These results are not conclusive, because I didn't benchmark multiple times > to average results, but regardless of performance gains switching to > LongAdder simplifies code and reduces its complexity. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
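For context on the LongAdder approach discussed in this thread: LongAdder replaces a single contended counter with striped cells that are summed on read. The sketch below is illustrative only, with invented class and method names; it is not the actual FileSystem.Statistics code.

```java
import java.util.concurrent.atomic.LongAdder;

/**
 * Minimal sketch of a LongAdder-based statistics counter, in the spirit of
 * the HADOOP-15124 proposal. Illustrative names, not Hadoop's real API.
 */
class StatisticsSketch {
  private final LongAdder bytesRead = new LongAdder();
  private final LongAdder readOps = new LongAdder();

  /** Hot path: striped, lock-free increments; cheap even under many threads. */
  void incrementBytesRead(long n) {
    bytesRead.add(n);
    readOps.increment();
  }

  /** Cold path: sum() walks the stripes; slightly stale values are fine for stats. */
  long getBytesRead() { return bytesRead.sum(); }
  long getReadOps()   { return readOps.sum(); }
}
```

The design trade-off versus the micro-optimization route above: LongAdder spends a little memory per counter to make the write path scale with thread count, while array iteration and inlining fixes keep the existing layout but still share one cache line per counter.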
[jira] [Updated] (HADOOP-15562) Add Software Load Balancing support for Hadoop/Yarn Process
[ https://issues.apache.org/jira/browse/HADOOP-15562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated HADOOP-15562: - Description: Software Load Balancers usually rely on the return code of an endpoint to determine if a service is available to serve requests. Currently, one can use interfaces like JMX to check the status of the service. However, there is a need to do the mapping between the JMX values and the state of the service. We should provide an interface (potentially REST) to check for particular JMX (or a new ones) values and report a particular HTTP code (e.g., 202). was:Software load balancers usually rely on the return > Add Software Load Balancing support for Hadoop/Yarn Process > --- > > Key: HADOOP-15562 > URL: https://issues.apache.org/jira/browse/HADOOP-15562 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Giovanni Matteo Fumarola >Priority: Minor > > Software Load Balancers usually rely on the return code of an endpoint to > determine if a service is available to serve requests. > Currently, one can use interfaces like JMX to check the status of the service. > However, there is a need to do the mapping between the JMX values and the > state of the service. > We should provide an interface (potentially REST) to check for particular JMX > (or a new ones) values and report a particular HTTP code (e.g., 202). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Created] (HADOOP-15562) Add Software Load Balancing support for Hadoop/Yarn Process
Giovanni Matteo Fumarola created HADOOP-15562: - Summary: Add Software Load Balancing support for Hadoop/Yarn Process Key: HADOOP-15562 URL: https://issues.apache.org/jira/browse/HADOOP-15562 Project: Hadoop Common Issue Type: Improvement Reporter: Giovanni Matteo Fumarola -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15562) Add Software Load Balancing support for Hadoop/Yarn Process
[ https://issues.apache.org/jira/browse/HADOOP-15562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated HADOOP-15562: - Description: Software load balancers usually rely on the return > Add Software Load Balancing support for Hadoop/Yarn Process > --- > > Key: HADOOP-15562 > URL: https://issues.apache.org/jira/browse/HADOOP-15562 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Giovanni Matteo Fumarola >Priority: Minor > > Software load balancers usually rely on the return -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-12739) Deadlock with OrcInputFormat split threads and Jets3t connections, since, NativeS3FileSystem does not release connections with seek()
[ https://issues.apache.org/jira/browse/HADOOP-12739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-12739: Resolution: Won't Fix Status: Resolved (was: Patch Available) > Deadlock with OrcInputFormat split threads and Jets3t connections, since, > NativeS3FileSystem does not release connections with seek() > - > > Key: HADOOP-12739 > URL: https://issues.apache.org/jira/browse/HADOOP-12739 > Project: Hadoop Common > Issue Type: Bug > Components: fs/s3 >Affects Versions: 2.6.0, 2.7.0 >Reporter: Pavan Srinivas >Assignee: Pavan Srinivas >Priority: Major > Attachments: 11600.txt, HADOOP-12739.patch > > > Recently, we came across a deadlock situation with OrcInputFormat while > computing splits. > - In Orc, for split computation, it needs file listing and file sizes. > - Multiple threads are invoked for listing the files and if the data is > located in S3, NativeS3FileSystem is used. > - NativeS3FileSystem in turn uses JetS3t Lib to talk to AWS and maintain > connection pool. > - When # of threads from OrcInputFormat exceeds JetS3t's max # of > connections, a deadlock occurs. 
stack trace: > {code} > "ORC_GET_SPLITS #5" daemon prio=10 tid=0x7f8568108800 nid=0x1e29 in > Object.wait() [0x7f8565696000] >java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > - waiting on <0xdf9ed450> (a > org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool) > at > org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.doGetConnection(MultiThreadedHttpConnectionManager.java:518) > - locked <0xdf9ed450> (a > org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool) > at > org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.getConnectionWithTimeout(MultiThreadedHttpConnectionManager.java:416) > at > org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:153) > at > org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397) > at > org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323) > at > org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:370) > at > org.jets3t.service.impl.rest.httpclient.RestStorageService.performRestGet(RestStorageService.java:929) > at > org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectImpl(RestStorageService.java:2007) > at > org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectImpl(RestStorageService.java:1944) > at org.jets3t.service.S3Service.getObject(S3Service.java:2625) > at > org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieve(Jets3tNativeFileSystemStore.java:254) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) > at org.apache.hadoop.fs.s3native.$Proxy12.retrieve(Unknown Source) > at > org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.reopen(NativeS3FileSystem.java:269) > - locked <0xdb01eec0> (a > org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream) > at > org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.seek(NativeS3FileSystem.java:258) > - locked <0xdb01eec0> (a > org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream) > at > org.apache.hadoop.fs.BufferedFSInputStream.seek(BufferedFSInputStream.java:98) > at > org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:63) > - locked <0xdb01ee70> (a org.apache.hadoop.fs.FSDataInputStream) > at > org.apache.hadoop.hive.ql.io.orc.ReaderImpl.extractMetaInfoFromFooter(ReaderImpl.java:329) > at > org.apache.hadoop.hive.ql.io.orc.ReaderImpl.&lt;init&gt;(ReaderImpl.java:292) > at > org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:197) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.populateAndCacheStripeDetails(OrcInputFormat.java:857) > at >
[jira] [Commented] (HADOOP-15561) Property Azure Account Key Provider Key requires blob.core.windows.net
[ https://issues.apache.org/jira/browse/HADOOP-15561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523785#comment-16523785 ] Steve Loughran commented on HADOOP-15561: - happy to take a .patch for this, especially one which applies to hadoop branch-2+. For trunk HADOOP-14507 is going to bring some more dramatic changes, but as that's not merged in yet things should be OK > Property Azure Account Key Provider Key requires blob.core.windows.net > -- > > Key: HADOOP-15561 > URL: https://issues.apache.org/jira/browse/HADOOP-15561 > Project: Hadoop Common > Issue Type: Improvement > Components: fs/azure >Affects Versions: 3.1.0 >Reporter: Santiago Velasco >Priority: Trivial > Labels: documentation, newbie > Original Estimate: 5m > Remaining Estimate: 5m > > The documentation for Hadoop Azure Support: Azure Blob Storage > [http://hadoop.apache.org/docs/r3.1.0/hadoop-azure/index.html] > The properties under Protecting the Azure Credentials for WASB within an > Encrypted File > The property fs.azure.account.keyprovider.youraccount requires > _.blob.core.windows.net_ > Documentation reads: > > {code:java} > <property> > <name>fs.azure.account.keyprovider.youraccount</name> > <value>org.apache.hadoop.fs.azure.ShellDecryptionKeyProvider</value> > </property> > {code} > Should read: > > {code:java} > <property> > <name>fs.azure.account.keyprovider.youraccount.blob.core.windows.net</name> > <value>org.apache.hadoop.fs.azure.ShellDecryptionKeyProvider</value> > </property> > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15560) ABFS: removed dependency injection and unnecessary dependencies
[ https://issues.apache.org/jira/browse/HADOOP-15560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523782#comment-16523782 ] Steve Loughran commented on HADOOP-15560: - * remember to hit the "submit patch" button for jenkins to kick off its test run. * I'm waiting for a review of HADOOP-15546; I'd like to get that in before other major changes take place, as they are going to obsolete that work, I'll have to redo it, and we'll end up in a losing battle about stabilising things > ABFS: removed dependency injection and unnecessary dependencies > --- > > Key: HADOOP-15560 > URL: https://issues.apache.org/jira/browse/HADOOP-15560 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Da Zhou >Assignee: Da Zhou >Priority: Major > Attachments: HADOOP-15407-HADOOP-15407-009.patch > > > # Removed dependency injection and unnecessary dependencies. > # Added tool to clean up test containers. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15557) CryptoInputStream can't handle concurrent access; inconsistent with HDFS
[ https://issues.apache.org/jira/browse/HADOOP-15557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523774#comment-16523774 ] Steve Loughran commented on HADOOP-15557: - yes, it's the forgiveness of HDFS which has led people to use it this way, even though the java.io docs very much say "do not use concurrently". > CryptoInputStream can't handle concurrent access; inconsistent with HDFS > > > Key: HADOOP-15557 > URL: https://issues.apache.org/jira/browse/HADOOP-15557 > Project: Hadoop Common > Issue Type: Improvement > Components: fs >Affects Versions: 3.2.0 >Reporter: Todd Lipcon >Priority: Major > > In general, the non-positional read APIs for streams in Hadoop Common are > meant to be used by only a single thread at a time. It would not make much > sense to have concurrent multi-threaded access to seek+read because they > modify the stream's file position. Multi-threaded access on input streams can > be done using positional read APIs. Multi-threaded access on output streams > probably never makes sense. > In the case of DFSInputStream, the positional read APIs are marked > synchronized, so that even when misused, no strange exceptions are thrown. > The results are just somewhat undefined in that it's hard for a thread to > know which position was read from. However, when running on an encrypted file > system, the results are much worse: since CryptoInputStream's read methods > are not marked synchronized, the caller can get strange ByteBuffer exceptions > or even a JVM crash due to concurrent use and free of underlying OpenSSL > Cipher buffers. > The crypto stream wrappers should be made more resilient to such misuse, for > example by: > (a) making the read methods safer by making them synchronized (so they have > the same behavior as DFSInputStream) > or > (b) trying to detect concurrent access to these methods and throwing > ConcurrentModificationException so that the user is alerted to their probable > misuse.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
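A minimal sketch of option (b) above, using a hypothetical wrapper class that is not part of Hadoop: a compare-and-set flag detects a second thread entering the single-threaded read path and throws `ConcurrentModificationException` instead of risking corrupted cipher buffers.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ConcurrentModificationException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration of option (b); GuardedInputStream is not a
// real Hadoop class. A CAS flag marks the stream "busy" for the duration
// of each read, so an overlapping call from a second thread fails loudly.
public class GuardedInputStream extends InputStream {
    private final InputStream in;
    private final AtomicBoolean busy = new AtomicBoolean(false);

    public GuardedInputStream(InputStream in) {
        this.in = in;
    }

    @Override
    public int read() throws IOException {
        if (!busy.compareAndSet(false, true)) {
            // A second thread is already inside read(): probable misuse.
            throw new ConcurrentModificationException(
                "concurrent access to a single-threaded stream");
        }
        try {
            return in.read();
        } finally {
            busy.set(false);
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream s = new GuardedInputStream(
            new ByteArrayInputStream(new byte[] { 42 }));
        System.out.println(s.read());  // sequential use works as normal
    }
}
```

Option (a) would instead mark the read methods synchronized, serializing the calls as DFSInputStream does; the CAS approach trades that tolerance for surfacing the caller's bug.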
[jira] [Comment Edited] (HADOOP-15559) Clarity on Spark compatibility with hadoop-aws
[ https://issues.apache.org/jira/browse/HADOOP-15559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523741#comment-16523741 ] Steve Loughran edited comment on HADOOP-15559 at 6/26/18 1:51 PM: -- # We feel your pain. Getting everything synced up is hard, especially when Spark itself bumps up some dependencies incompatibly (SPARK-22919) # the latest docs on this topic [are here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/troubleshooting_s3a.md]; as they say {quote}Critical: Do not attempt to "drop in" a newer version of the AWS SDK than that which the Hadoop version was built with Whatever problem you have, changing the AWS SDK version will not fix things, only change the stack traces you see. {quote} {quote}Similarly, don't try and mix a hadoop-aws JAR from one Hadoop release with that of any other. The JAR must be in sync with hadoop-common and some other Hadoop JARs. {quote} {quote}Randomly changing hadoop- and aws- JARs in the hope of making a problem "go away" or to gain access to a feature you want, will not lead to the outcome you desire. {quote} We also point people at [mvnrepo|http://mvnrepository.com/artifact/org.apache.hadoop/hadoop-aws] for the normative list of hadoop-aws to AWS SDK JAR mappings. Getting an AWS SDK and Hadoop AWS binding together is not easy, and if you do try it, you are on your own, with nothing but JIRAs related to "upgrade AWS SDK" to act as a cue. Where life is hard is that unless you build spark with the -Phadoop-cloud profile, you don't get things all lined up. It is what it is for. Regarding your specific issue, unless you are using a release of spark with that cloud profile, you have to do it by hand. * Get the exact matching hadoop-aws JAR as hadoop-common. Same for hadoop-auth, hadoop-aws. You cannot mix them. * Get the matching AWS SDK JAR(s), using mvnrepo as your guide. * And Jackson, obviously. 
FWIW Hadoop 2.9+ has moved to the shaded AWS SDK JAR to avoid a lot of this pain. * If you want to use Hadoop 2.8, unless your spark distribution has reverted SPARK-22919, downgrade the httpclient libraries (see the PR there for what changed) Returning to your complaint: anything else you can do for the docs is welcome, though really, the best strategy would be to get spark releases built with that hadoop-cloud profile, which is intended to give you all the dependencies you need, and none of the ones you don't. was (Author: ste...@apache.org): # We feel your pain. Getting everything synced up is hard, especially when Spark itself bumps up some dependencies incompatibly (SPARK-22919) # the latest docs on this topic [are here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/troubleshooting_s3a.md]; as they say {quote}Critical: Do not attempt to "drop in" a newer version of the AWS SDK than that which the Hadoop version was built with Whatever problem you have, changing the AWS SDK version will not fix things, only change the stack traces you see. {quote} {quote}Similarly, don't try and mix a hadoop-aws JAR from one Hadoop release with that of any other. The JAR must be in sync with hadoop-common and some other Hadoop JARs. {quote} {quote}Randomly changing hadoop- and aws- JARs in the hope of making a problem "go away" or to gain access to a feature you want, will not lead to the outcome you desire. {quote} We also point people at [mvnepo|http://mvnrepository.com/artifact/org.apache.hadoop/hadoop-aws] for the normative list of hadoop-aws - aws JAR mapping. Getting an AWS SDK and Hadoop AWS binding together is not easy, and if you do try it, you are on your own, with nothing but JIRAs related to "upgrade AWS SDK" to act as a cue. Where life is hard is that unless you build spark with the -Phadoop-cloud profile, you don't get things all lined up. It is what it is for. 
Regarding your specific issue, unless you are using a release of spark with that cloud profile, you have to do it by hand. * Get the exact matching hadoop-aws JAR as hadoop-common. Same for hadoop-auth, hadoop-aws. You cannot mix them. * get the matching aws SDK JAR(s), using mvnrepo as your guide. * And jackson, obviously. FWIW Hadoop 2.9+ has moved to the shaded AWS SDK JAR to avoid a lot of this pain. * If you want to use Hadoop 2.8, unless your spark distribution has reverted SPARK-22919, downgrade the httpclient libraries (see the PR there for what changed) Returning you complaint. anything else you can do the docs are welcome, though really, the best strategy would be to get spark releases built with that hadoop-cloud profile, which is intended to give you all the dependencies you need, and none of the ones you don't. > Clarity on Spark compatibility with hadoop-aws > -- > > Key:
[jira] [Comment Edited] (HADOOP-15559) Clarity on Spark compatibility with hadoop-aws
[ https://issues.apache.org/jira/browse/HADOOP-15559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523741#comment-16523741 ] Steve Loughran edited comment on HADOOP-15559 at 6/26/18 1:51 PM: -- # We feel your pain. Getting everything synced up is hard, especially when Spark itself bumps up some dependencies incompatibly (SPARK-22919) # the latest docs on this topic [are here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/troubleshooting_s3a.md]; as they say {quote}Critical: Do not attempt to "drop in" a newer version of the AWS SDK than that which the Hadoop version was built with Whatever problem you have, changing the AWS SDK version will not fix things, only change the stack traces you see. {quote} {quote}Similarly, don't try and mix a hadoop-aws JAR from one Hadoop release with that of any other. The JAR must be in sync with hadoop-common and some other Hadoop JARs. {quote} {quote}Randomly changing hadoop- and aws- JARs in the hope of making a problem "go away" or to gain access to a feature you want, will not lead to the outcome you desire. {quote} We also point people at [mvnrepo|http://mvnrepository.com/artifact/org.apache.hadoop/hadoop-aws] for the normative list of hadoop-aws to AWS SDK JAR mappings. Getting an AWS SDK and Hadoop AWS binding together is not easy, and if you do try it, you are on your own, with nothing but JIRAs related to "upgrade AWS SDK" to act as a cue. Where life is hard is that unless you build spark with the -Phadoop-cloud profile, you don't get things all lined up. It is what it is for. Regarding your specific issue, unless you are using a release of spark with that cloud profile, you have to do it by hand. * Get the exact matching hadoop-aws JAR as hadoop-common. Same for hadoop-auth, hadoop-aws. You cannot mix them. * Get the matching AWS SDK JAR(s), using mvnrepo as your guide. * And Jackson, obviously. 
FWIW Hadoop 2.9+ has moved to the shaded AWS SDK JAR to avoid a lot of this pain. * If you want to use Hadoop 2.8, unless your spark distribution has reverted SPARK-22919, downgrade the httpclient libraries (see the PR there for what changed) Returning to your complaint: anything else you can do for the docs is welcome, though really, the best strategy would be to get spark releases built with that hadoop-cloud profile, which is intended to give you all the dependencies you need, and none of the ones you don't. was (Author: ste...@apache.org): # We feel your pain. Getting everything synced up is hard, especially when Spark itself bumps up some dependencies incompatibly (SPARK-22919) # the latest docs on this topic [are here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/troubleshooting_s3a.md]; as they say {quote}Critical: Do not attempt to "drop in" a newer version of the AWS SDK than that which the Hadoop version was built with Whatever problem you have, changing the AWS SDK version will not fix things, only change the stack traces you see. {quote} {quote}Similarly, don't try and mix a hadoop-aws JAR from one Hadoop release with that of any other. The JAR must be in sync with hadoop-common and some other Hadoop JARs. {quote} {quote}Randomly changing hadoop- and aws- JARs in the hope of making a problem "go away" or to gain access to a feature you want, will not lead to the outcome you desire. {quote} We also point people at [mvnepo|http://mvnrepository.com/artifact/org.apache.hadoop/hadoop-aws] for the normative list of hadoop-aws - aws JAR mapping. Getting an AWS SDK and Hadoop AWS binding together is not easy, and if you do try it, you are on your own, with nothing but JIRAs related to "upgrade AWS SDK" to act as a cue. Where life is hard is that unless you build spark with the -Phadoop-cloud profile, you don't get things all lined up. It is what it is for. 
Regarding your specific issue, unless you are using a release of spark with that cloud profile, you have to do it by hand. * Get the exact matching hadoop-aws JAR as hadoop-common. Same for hadoop-auth, hadoop-aws. You cannot mix them. * get the matching aws SDK JAR(s), using mvnrepo as your guide. * And jackson, obviously. FWIW Hadoop 2.9+ has moved to the shaded AWS SDK JAR to avoid a lot of this pain. * If you want to use Hadoop 2.8, unless your spark distribution has reverted SPARK-22919, downgrade the httpclient libraries (see the PR there for what changed. We just reverted that patch on the basis that more people want S3A than stocator) Returning you complaint. anything else you can do the docs are welcome, though really, the best strategy would be to get spark releases built with that hadoop-cloud profile, which is intended to give you all the dependencies you need, and none of the ones you don't. > Clarity on Spark compatibility with
[jira] [Commented] (HADOOP-15559) Clarity on Spark compatibility with hadoop-aws
[ https://issues.apache.org/jira/browse/HADOOP-15559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523741#comment-16523741 ] Steve Loughran commented on HADOOP-15559: - # We feel your pain. Getting everything synced up is hard, especially when Spark itself bumps up some dependencies incompatibly (SPARK-22919) # the latest docs on this topic [are here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/troubleshooting_s3a.md]; as they say {quote}Critical: Do not attempt to "drop in" a newer version of the AWS SDK than that which the Hadoop version was built with Whatever problem you have, changing the AWS SDK version will not fix things, only change the stack traces you see. {quote} {quote}Similarly, don't try and mix a hadoop-aws JAR from one Hadoop release with that of any other. The JAR must be in sync with hadoop-common and some other Hadoop JARs. {quote} {quote}Randomly changing hadoop- and aws- JARs in the hope of making a problem "go away" or to gain access to a feature you want, will not lead to the outcome you desire. {quote} We also point people at [mvnrepo|http://mvnrepository.com/artifact/org.apache.hadoop/hadoop-aws] for the normative list of hadoop-aws to AWS SDK JAR mappings. Getting an AWS SDK and Hadoop AWS binding together is not easy, and if you do try it, you are on your own, with nothing but JIRAs related to "upgrade AWS SDK" to act as a cue. Where life is hard is that unless you build spark with the -Phadoop-cloud profile, you don't get things all lined up. It is what it is for. Regarding your specific issue, unless you are using a release of spark with that cloud profile, you have to do it by hand. * Get the exact matching hadoop-aws JAR as hadoop-common. Same for hadoop-auth, hadoop-aws. You cannot mix them. * Get the matching AWS SDK JAR(s), using mvnrepo as your guide. * And Jackson, obviously. 
FWIW Hadoop 2.9+ has moved to the shaded AWS SDK JAR to avoid a lot of this pain. * If you want to use Hadoop 2.8, unless your spark distribution has reverted SPARK-22919, downgrade the httpclient libraries (see the PR there for what changed. We just reverted that patch on the basis that more people want S3A than stocator) Returning to your complaint: anything else you can do for the docs is welcome, though really, the best strategy would be to get spark releases built with that hadoop-cloud profile, which is intended to give you all the dependencies you need, and none of the ones you don't. > Clarity on Spark compatibility with hadoop-aws > -- > > Key: HADOOP-15559 > URL: https://issues.apache.org/jira/browse/HADOOP-15559 > Project: Hadoop Common > Issue Type: Improvement > Components: documentation, fs/s3 >Reporter: Nicholas Chammas >Priority: Minor > > I'm the maintainer of [Flintrock|https://github.com/nchammas/flintrock], a > command-line tool for launching Apache Spark clusters on AWS. One of the > things I try to do for my users is make it straightforward to use Spark with > {{s3a://}}. I do this by recommending that users start Spark with the > {{hadoop-aws}} package. > For example: > {code:java} > pyspark --packages "org.apache.hadoop:hadoop-aws:2.8.4" > {code} > I'm struggling, however, to understand what versions of {{hadoop-aws}} should > work with what versions of Spark. > Spark releases are [built against Hadoop > 2.7|http://archive.apache.org/dist/spark/spark-2.3.1/]. At the same time, > I've been told that I should be able to use newer versions of Hadoop and > Hadoop libraries with Spark, so for example, running Spark built against > Hadoop 2.7 alongside HDFS 2.8 should work, and there is [no need to build > Spark explicitly against Hadoop > 2.8|http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Spark-2-3-1-RC4-tp24087p24092.html]. 
> I'm having trouble translating this mental model into recommendations for how > to pair Spark with {{hadoop-aws}}. > For example, Spark 2.3.1 built against Hadoop 2.7 works with > {{hadoop-aws:2.7.6}} but not with {{hadoop-aws:2.8.4}}. Trying the latter > yields the following error when I try to access files via {{s3a://}}. > {code:java} > py4j.protocol.Py4JJavaError: An error occurred while calling o35.text. > : java.lang.IllegalAccessError: tried to access method > org.apache.hadoop.metrics2.lib.MutableCounterLong.(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V > from class org.apache.hadoop.fs.s3a.S3AInstrumentation > at > org.apache.hadoop.fs.s3a.S3AInstrumentation.streamCounter(S3AInstrumentation.java:194) > at > org.apache.hadoop.fs.s3a.S3AInstrumentation.streamCounter(S3AInstrumentation.java:216) > at > org.apache.hadoop.fs.s3a.S3AInstrumentation.(S3AInstrumentation.java:139) > at
[jira] [Assigned] (HADOOP-15349) S3Guard DDB retryBackoff to be more informative on limits exceeded
[ https://issues.apache.org/jira/browse/HADOOP-15349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Bota reassigned HADOOP-15349: --- Assignee: Gabor Bota > S3Guard DDB retryBackoff to be more informative on limits exceeded > -- > > Key: HADOOP-15349 > URL: https://issues.apache.org/jira/browse/HADOOP-15349 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.1.0 >Reporter: Steve Loughran >Assignee: Gabor Bota >Priority: Major > Attachments: failure.log > > > When S3Guard can't update the DB and so throws an IOE after the retry limit > is exceeded, it's not at all informative. Improve logging & exception -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
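One possible shape for the improvement, sketched with hypothetical names (this is not the actual HADOOP-15349 patch): wrap the terminal failure in an IOException that carries the table name, the attempt count, and a hint that provisioned throughput is the likely cause.

```java
import java.io.IOException;

// Hypothetical sketch only: class, method, and message format are
// illustrative, not the real S3Guard code. The point is that the terminal
// IOException should say which table failed, how many attempts were made,
// and why DynamoDB throttling is the probable cause.
public class RetryDiagnostics {
    static IOException retriesExhausted(String table, int attempts,
                                        Exception lastCause) {
        return new IOException(String.format(
            "S3Guard: %d retries exhausted updating DynamoDB table '%s'; "
                + "provisioned throughput may be exceeded "
                + "(consider raising the table's capacity): %s",
            attempts, table, lastCause.getMessage()), lastCause);
    }

    public static void main(String[] args) {
        IOException e = retriesExhausted("s3guard-metadata", 9,
            new Exception("ProvisionedThroughputExceededException"));
        System.out.println(e.getMessage());
    }
}
```

Keeping the original exception as the cause preserves the stack trace while the wrapper message gives an operator something actionable.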
[jira] [Assigned] (HADOOP-14592) ITestS3ATemporaryCredentials to cover all ddb metastore ops with session credentials
[ https://issues.apache.org/jira/browse/HADOOP-14592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Bota reassigned HADOOP-14592: --- Assignee: Gabor Bota > ITestS3ATemporaryCredentials to cover all ddb metastore ops with session > credentials > > > Key: HADOOP-14592 > URL: https://issues.apache.org/jira/browse/HADOOP-14592 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: 3.0.0-beta1 >Reporter: Steve Loughran >Assignee: Gabor Bota >Priority: Minor > > {{ITestS3ATemporaryCredentials}} tests a couple of operations with temp > credentials, but for completeness it should perform all operations which will > access dynamo DB, so verifying that temporary credentials work everywhere. > I know this is implicit from anyone running the tests in an EC2 container and > picking up the IAM details, so I'm not expecting the tests to fail —but they > should be there -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-15215) s3guard set-capacity command to fail on read/write of 0
[ https://issues.apache.org/jira/browse/HADOOP-15215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Bota reassigned HADOOP-15215: --- Assignee: Gabor Bota > s3guard set-capacity command to fail on read/write of 0 > --- > > Key: HADOOP-15215 > URL: https://issues.apache.org/jira/browse/HADOOP-15215 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.0.0 >Reporter: Steve Loughran >Assignee: Gabor Bota >Priority: Minor > > the command {{hadoop s3guard set-capacity -read 0 s3a://bucket}} will get > all the way to the AWS SDK before it's rejected; if you pass in a value of -1 > we fail fast. > The CLI check should really be failing on <= 0, not < 0. > You still get a stack trace, so it's not that important. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
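The tightened check described above can be sketched as follows (hypothetical helper, not the actual S3GuardTool code): comparing with `<= 0` rejects both -1 and 0 before any AWS SDK call is made.

```java
// Hypothetical sketch of the stricter CLI validation; not the real
// S3GuardTool implementation. The old check (value < 0) let 0 through
// to the AWS SDK; checking value <= 0 fails fast on both -1 and 0.
public class CapacityCheck {
    static int validateCapacity(String name, int value) {
        if (value <= 0) {
            throw new IllegalArgumentException(
                name + " capacity must be greater than 0, got " + value);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(validateCapacity("read", 10));
    }
}
```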
[jira] [Commented] (HADOOP-15558) Implementation of Clay Codes plugin (Coupled Layer MSR codes)
[ https://issues.apache.org/jira/browse/HADOOP-15558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523573#comment-16523573 ] Shreya Gupta commented on HADOOP-15558: --- Hi. Meanwhile one can have a look at these [slides|https://www.usenix.org/sites/default/files/conference/protected-files/fast18_slides_vajha.pdf] to better understand how Clay Codes work. > Implementation of Clay Codes plugin (Coupled Layer MSR codes) > -- > > Key: HADOOP-15558 > URL: https://issues.apache.org/jira/browse/HADOOP-15558 > Project: Hadoop Common > Issue Type: New Feature >Reporter: Chaitanya Mukka >Assignee: Chaitanya Mukka >Priority: Major > > [Clay Codes|https://www.usenix.org/conference/fast18/presentation/vajha] are > new erasure codes developed as a research project at Codes and Signal Design > Lab, IISc Bangalore. A particular Clay code, with storage overhead 1.25x, has > been shown to reduce repair network traffic, disk read and repair times by > factors of 2.9, 3.4 and 3 respectively compared to the RS codes with the same > parameters. > This Jira aims to introduce Clay Codes to HDFS-EC as one of the pluggable > erasure codecs. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Created] (HADOOP-15561) Property Azure Account Key Provider Key requires blob.core.windows.net
Santiago Velasco created HADOOP-15561: - Summary: Property Azure Account Key Provider Key requires blob.core.windows.net Key: HADOOP-15561 URL: https://issues.apache.org/jira/browse/HADOOP-15561 Project: Hadoop Common Issue Type: Improvement Components: fs/azure Affects Versions: 3.1.0 Reporter: Santiago Velasco The documentation for Hadoop Azure Support: Azure Blob Storage [http://hadoop.apache.org/docs/r3.1.0/hadoop-azure/index.html] The properties under Protecting the Azure Credentials for WASB within an Encrypted File The property fs.azure.account.keyprovider.youraccount requires the _.blob.core.windows.net_ suffix. Documentation reads: {code:java} <property> <name>fs.azure.account.keyprovider.youraccount</name> <value>org.apache.hadoop.fs.azure.ShellDecryptionKeyProvider</value> </property> {code} Should read: {code:java} <property> <name>fs.azure.account.keyprovider.youraccount.blob.core.windows.net</name> <value>org.apache.hadoop.fs.azure.ShellDecryptionKeyProvider</value> </property> {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15560) ABFS: removed dependency injection and unnecessary dependencies
[ https://issues.apache.org/jira/browse/HADOOP-15560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523288#comment-16523288 ] Da Zhou commented on HADOOP-15560: -- all azure blob filesystem tests passed against my azure test account in east US: {noformat} [INFO] --- maven-surefire-plugin:2.21.0:test (default-test) @ hadoop-azure --- [INFO] [INFO] --- [INFO] T E S T S [INFO] --- [INFO] Running org.apache.hadoop.fs.azure.TestWasbFsck [INFO] Running org.apache.hadoop.fs.azure.TestBlobMetadata [INFO] Running org.apache.hadoop.fs.azure.TestOutOfBandAzureBlobOperations [INFO] Running org.apache.hadoop.fs.azure.TestNativeAzureFileSystemOperationsMocked [INFO] Running org.apache.hadoop.fs.azure.TestShellDecryptionKeyProvider [INFO] Running org.apache.hadoop.fs.azure.TestNativeAzureFileSystemMocked [INFO] Running org.apache.hadoop.fs.azure.TestNativeAzureFileSystemUploadLogic [INFO] Running org.apache.hadoop.fs.azure.TestClientThrottlingAnalyzer [WARNING] Tests run: 3, Failures: 0, Errors: 0, Skipped: 3, Time elapsed: 0.947 s - in org.apache.hadoop.fs.azure.TestNativeAzureFileSystemUploadLogic [WARNING] Tests run: 2, Failures: 0, Errors: 0, Skipped: 2, Time elapsed: 1.471 s - in org.apache.hadoop.fs.azure.TestShellDecryptionKeyProvider [INFO] Running org.apache.hadoop.fs.azure.TestNativeAzureFileSystemConcurrency [INFO] Running org.apache.hadoop.fs.azure.TestNativeAzureFileSystemContractMocked [WARNING] Tests run: 2, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 7.255 s - in org.apache.hadoop.fs.azure.TestWasbFsck [INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 8.674 s - in org.apache.hadoop.fs.azure.TestOutOfBandAzureBlobOperations [INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 8.719 s - in org.apache.hadoop.fs.azure.TestBlobMetadata [INFO] Running org.apache.hadoop.fs.azure.metrics.TestNativeAzureFileSystemMetricsSystem [INFO] Running org.apache.hadoop.fs.azure.metrics.TestBandwidthGaugeUpdater 
[INFO] Running org.apache.hadoop.fs.azure.TestNativeAzureFileSystemAuthorization [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.555 s - in org.apache.hadoop.fs.azure.metrics.TestBandwidthGaugeUpdater [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 12.393 s - in org.apache.hadoop.fs.azure.TestNativeAzureFileSystemConcurrency [INFO] Running org.apache.hadoop.fs.azure.TestBlobOperationDescriptor [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 8.55 s - in org.apache.hadoop.fs.azure.metrics.TestNativeAzureFileSystemMetricsSystem [INFO] Running org.apache.hadoop.fs.azure.TestNativeAzureFileSystemFileNameCheck [INFO] Tests run: 50, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 18.742 s - in org.apache.hadoop.fs.azure.TestNativeAzureFileSystemOperationsMocked [WARNING] Tests run: 43, Failures: 0, Errors: 0, Skipped: 5, Time elapsed: 15.459 s - in org.apache.hadoop.fs.azure.TestNativeAzureFileSystemContractMocked [WARNING] Tests run: 59, Failures: 0, Errors: 0, Skipped: 59, Time elapsed: 8.214 s - in org.apache.hadoop.fs.azure.TestNativeAzureFileSystemAuthorization [INFO] Running org.apache.hadoop.fs.azure.TestNativeAzureFileSystemBlockCompaction [WARNING] Tests run: 4, Failures: 0, Errors: 0, Skipped: 4, Time elapsed: 2.756 s - in org.apache.hadoop.fs.azure.TestBlobOperationDescriptor [INFO] Running org.apache.hadoop.fs.azurebfs.diagnostics.TestConfigurationValidators [INFO] Running org.apache.hadoop.fs.azurebfs.utils.TestUriUtils [INFO] Running org.apache.hadoop.fs.azurebfs.services.TestAbfsConfigurationFieldsValidation [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.389 s - in org.apache.hadoop.fs.azurebfs.utils.TestUriUtils [WARNING] Tests run: 2, Failures: 0, Errors: 0, Skipped: 2, Time elapsed: 2.438 s - in org.apache.hadoop.fs.azure.TestNativeAzureFileSystemBlockCompaction [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.226 s - in 
org.apache.hadoop.fs.azurebfs.services.TestAbfsConfigurationFieldsValidation [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.09 s - in org.apache.hadoop.fs.azure.TestNativeAzureFileSystemFileNameCheck [INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 8.614 s - in org.apache.hadoop.fs.azurebfs.diagnostics.TestConfigurationValidators [INFO] Tests run: 46, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 33.034 s - in org.apache.hadoop.fs.azure.TestNativeAzureFileSystemMocked [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 37.334 s - in org.apache.hadoop.fs.azure.TestClientThrottlingAnalyzer [INFO] [INFO] Results: [INFO] [WARNING] Tests run: 257, Failures: 0, Errors: 0, Skipped: 76 [INFO] [INFO] [INFO] --- maven-surefire-plugin:2.21.0:test
[jira] [Updated] (HADOOP-15560) ABFS: removed dependency injection and unnecessary dependencies
[ https://issues.apache.org/jira/browse/HADOOP-15560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Da Zhou updated HADOOP-15560: - Attachment: HADOOP-15407-HADOOP-15407-009.patch > ABFS: removed dependency injection and unnecessary dependencies > --- > > Key: HADOOP-15560 > URL: https://issues.apache.org/jira/browse/HADOOP-15560 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Da Zhou >Assignee: Da Zhou >Priority: Major > Attachments: HADOOP-15407-HADOOP-15407-009.patch > > > # Removed dependency injection and unnecessary dependencies. > # Added tool to clean up test containers. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Created] (HADOOP-15560) ABFS: removed dependency injection and unnecessary dependencies
Da Zhou created HADOOP-15560: Summary: ABFS: removed dependency injection and unnecessary dependencies Key: HADOOP-15560 URL: https://issues.apache.org/jira/browse/HADOOP-15560 Project: Hadoop Common Issue Type: Sub-task Reporter: Da Zhou Assignee: Da Zhou # Removed dependency injection and unnecessary dependencies. # Added tool to clean up test containers. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org