[jira] [Updated] (HADOOP-13868) New defaults for S3A multi-part configuration
[ https://issues.apache.org/jira/browse/HADOOP-13868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Mackrory updated HADOOP-13868:
-----------------------------------
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

> New defaults for S3A multi-part configuration
> ---------------------------------------------
>
>                 Key: HADOOP-13868
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13868
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 2.7.0, 3.0.0-alpha1
>            Reporter: Sean Mackrory
>            Assignee: Sean Mackrory
>            Priority: Major
>         Attachments: HADOOP-13868.001.patch, HADOOP-13868.002.patch,
> optimizing-multipart-s3a.sh
>
>
> I've been looking at a big performance regression when writing to S3 from
> Spark that appears to have been introduced with HADOOP-12891.
> In the Amazon SDK, the default threshold for multi-part copies is 320x the
> threshold for multi-part uploads (and the block size is 20x bigger), so I
> don't think it's necessarily wise for us to have them be the same.
> I did some quick tests and it seems to me the sweet spot when multi-part
> copies start being faster is around 512MB. It wasn't as significant, but
> using 104857600 (Amazon's default) for the blocksize was also slightly better.
> I propose we do the following, although they're independent decisions:
> (1) Split the configuration. Ideally, I'd like to have
> fs.s3a.multipart.copy.threshold and fs.s3a.multipart.upload.threshold (and
> corresponding properties for the block size). But then there's the question
> of what to do with the existing fs.s3a.multipart.* properties. Deprecation?
> Leave it as a short-hand for configuring both (that's overridden by the more
> specific properties?).
> (2) Consider increasing the default values. In my tests, 256 MB seemed to be
> where multipart uploads came into their own, and 512 MB was where multipart
> copies started outperforming the alternative. Would be interested to hear
> what other people have seen.
-- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
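Proposal (1) in the issue above, where the existing fs.s3a.multipart.* properties stay as a shorthand that the more specific properties override, could be sketched roughly as follows. This is only an illustration under stated assumptions: fs.s3a.multipart.upload.threshold and fs.s3a.multipart.copy.threshold are the names proposed in the issue and may not match what was committed, the class and method names are hypothetical, a plain Map stands in for Hadoop's Configuration, and the 256 MB / 512 MB defaults are simply the figures from the quick tests quoted above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: specific multipart properties override the shared shorthand. */
class MultipartThresholds {
    static final long MB = 1024L * 1024L;

    // Existing shorthand property, plus the two specific names proposed in the issue.
    static final String SHARED = "fs.s3a.multipart.threshold";
    static final String UPLOAD = "fs.s3a.multipart.upload.threshold";
    static final String COPY = "fs.s3a.multipart.copy.threshold";

    // Illustrative defaults taken from the test figures quoted in the issue.
    static final long DEFAULT_UPLOAD = 256 * MB;
    static final long DEFAULT_COPY = 512 * MB;

    /** Look up the specific key first, then the shorthand, then fall back to the default. */
    static long resolve(Map<String, String> conf, String specificKey, long defaultValue) {
        String value = conf.get(specificKey);
        if (value == null) {
            value = conf.get(SHARED);
        }
        return value == null ? defaultValue : Long.parseLong(value.trim());
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(SHARED, String.valueOf(128 * MB)); // shorthand applies to both operations
        conf.put(COPY, String.valueOf(512 * MB));   // specific key wins for copies only
        System.out.println("upload threshold = " + resolve(conf, UPLOAD, DEFAULT_UPLOAD));
        System.out.println("copy threshold   = " + resolve(conf, COPY, DEFAULT_COPY));
    }
}
```

Keeping the shorthand as a fallback means existing deployments that only set fs.s3a.multipart.threshold would see no change, which is one way of answering the deprecation question the issue raises.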
[jira] [Commented] (HADOOP-13868) New defaults for S3A multi-part configuration
[ https://issues.apache.org/jira/browse/HADOOP-13868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16888395#comment-16888395 ]

Sean Mackrory commented on HADOOP-13868:
----------------------------------------

Bit-rot after only 2 1/2 years? Imagine that! Actually the only part that doesn't apply cleanly is the documentation, and that's just because it's looking 100 lines away from where it should. Resubmitted as a pull request to verify a clean Yetus run, but as the patch is virtually identical I'll assume your +1 still applies unless I hear otherwise.
[jira] [Commented] (HADOOP-15729) [s3a] stop treat fs.s3a.max.threads as the long-term minimum
[ https://issues.apache.org/jira/browse/HADOOP-15729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882398#comment-16882398 ]

Sean Mackrory commented on HADOOP-15729:
----------------------------------------

Oh it was an obsolete import. Pull request created: https://github.com/apache/hadoop/pull/1075

> [s3a] stop treat fs.s3a.max.threads as the long-term minimum
> -------------------------------------------------------------
>
>                 Key: HADOOP-15729
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15729
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Sean Mackrory
>            Assignee: Sean Mackrory
>            Priority: Major
>         Attachments: HADOOP-15729.001.patch, HADOOP-15729.002.patch
>
>
> A while ago the s3a connector started experiencing deadlocks because the AWS
> SDK requires an unbounded threadpool. It places monitoring tasks on the work
> queue before the tasks they wait on, so it's possible (has even happened with
> larger-than-default threadpools) for the executor to become permanently
> saturated and deadlock.
> So we started giving an unbounded threadpool executor to the SDK, and using a
> bounded, blocking threadpool service for everything else S3A needs (although
> currently that's only in the S3ABlockOutputStream). fs.s3a.max.threads then
> only limits this threadpool; however, we also specified fs.s3a.max.threads as
> the number of core threads in the unbounded threadpool, which in hindsight is
> pretty terrible.
> Currently those core threads do not time out, so this is actually setting a
> sort of minimum. Once that many tasks have been submitted, the threadpool
> will be locked at that number until it bursts beyond that, but it will only
> spin down that far. If fs.s3a.max.threads is set reasonably high and someone
> uses a bunch of S3 buckets, they could easily have thousands of idle threads
> constantly.
> We should either not use fs.s3a.max.threads for the corepool size and
> introduce a new configuration, or we should simply allow core threads to
> time out. I'm reading the OpenJDK source now to see what subtle differences
> there are between core threads and other threads if core threads can time out.
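The second option in the issue above, simply allowing core threads to time out, corresponds to ThreadPoolExecutor.allowCoreThreadTimeOut(true) in java.util.concurrent. The following is a minimal sketch of the difference, not the actual S3A code: the class and method names, the pool size, the keep-alive time and the choice of LinkedBlockingQueue are all assumptions made for the demonstration.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo: idle core threads with and without core-thread timeout. */
class CoreTimeoutDemo {
    /** Build a pool whose core threads are either pinned or reclaimable when idle. */
    static ThreadPoolExecutor newPool(int coreThreads, long keepAliveMillis, boolean coreTimeout) {
        // With an unbounded queue the pool never grows past coreThreads,
        // so the maximum size is irrelevant in this demo.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                coreThreads, Integer.MAX_VALUE,
                keepAliveMillis, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
        // Requires a non-zero keep-alive time when set to true.
        pool.allowCoreThreadTimeOut(coreTimeout);
        return pool;
    }

    /** Run no-op tasks one at a time, let the pool sit idle, then report its size. */
    static int idlePoolSize(ThreadPoolExecutor pool, int tasks, long idleMillis) {
        try {
            for (int i = 0; i < tasks; i++) {
                // Each submission starts a new thread while fewer than
                // corePoolSize threads exist, even if the others are idle.
                pool.submit(() -> { }).get();
            }
            Thread.sleep(idleMillis);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
        return pool.getPoolSize();
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pinned = newPool(8, 100, false);
        System.out.println("core threads pinned:       " + idlePoolSize(pinned, 8, 1000));
        pinned.shutdown();

        ThreadPoolExecutor reclaimed = newPool(8, 100, true);
        System.out.println("core threads may time out: " + idlePoolSize(reclaimed, 8, 1000));
        reclaimed.shutdown();
    }
}
```

In the pinned case the idle threads stay around indefinitely, which is the "sort of minimum" the issue describes; with the timeout enabled the pool should shrink back to zero once the keep-alive period has passed.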
[jira] [Updated] (HADOOP-15729) [s3a] stop treat fs.s3a.max.threads as the long-term minimum
[ https://issues.apache.org/jira/browse/HADOOP-15729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Mackrory updated HADOOP-15729:
-----------------------------------
    Attachment:     (was: HADOOP-15729.002.patch)
[jira] [Updated] (HADOOP-15729) [s3a] stop treat fs.s3a.max.threads as the long-term minimum
[ https://issues.apache.org/jira/browse/HADOOP-15729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Mackrory updated HADOOP-15729:
-----------------------------------
    Attachment: HADOOP-15729.002.patch
[jira] [Updated] (HADOOP-15729) [s3a] stop treat fs.s3a.max.threads as the long-term minimum
[ https://issues.apache.org/jira/browse/HADOOP-15729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Mackrory updated HADOOP-15729:
-----------------------------------
    Status: Patch Available  (was: Open)

Resubmitting as checkstyle output has expired.
[jira] [Updated] (HADOOP-15729) [s3a] stop treat fs.s3a.max.threads as the long-term minimum
[ https://issues.apache.org/jira/browse/HADOOP-15729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Mackrory updated HADOOP-15729:
-----------------------------------
    Status: Open  (was: Patch Available)
[jira] [Commented] (HADOOP-13980) S3Guard CLI: Add fsck check command
[ https://issues.apache.org/jira/browse/HADOOP-13980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16880424#comment-16880424 ]

Sean Mackrory commented on HADOOP-13980:
----------------------------------------

{quote}Export metadatastore and S3 bucket hierarchy{quote}
Is this different in some way from "export scan results in human readable format"? I thought about maybe having a machine-readable export that we could import, if that might help with supportability. I've personally never seen a support issue that it would've helped with, but it's just something to think about...

{quote}Implement the fixing mechanism{quote}
We can probably break this into more subtasks. It would be best if the implementation had a sequence of specific "fixers" to address specific discrepancies: "fixMissingParents", "fixOutOfDateEntries", etc.

> S3Guard CLI: Add fsck check command
> -----------------------------------
>
>                 Key: HADOOP-13980
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13980
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.0.0-beta1
>            Reporter: Aaron Fabbri
>            Assignee: Gabor Bota
>            Priority: Major
>
> As discussed in HADOOP-13650, we want to add an S3Guard CLI command which
> compares S3 with MetadataStore, and returns a failure status if any
> invariants are violated.
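The "sequence of specific fixers" idea from the comment above might look something like the sketch below. Everything in it is hypothetical: the Violation kinds, the Finding and Fixer types and the plan method are invented for illustration, and only the fixer names "fixMissingParents" and "fixOutOfDateEntries" come from the comment. The sketch only plans which fixer would handle each finding; it does not touch S3 or a metadata store.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: route each fsck finding to the first fixer that handles it. */
class FsckFixers {
    /** Kinds of discrepancy a check might report; illustrative only. */
    enum Violation { MISSING_PARENT, OUT_OF_DATE_ENTRY }

    /** One discrepancy found at one path. */
    static final class Finding {
        final Violation violation;
        final String path;

        Finding(Violation violation, String path) {
            this.violation = violation;
            this.path = path;
        }
    }

    /** A fixer addresses exactly one kind of discrepancy. */
    interface Fixer {
        String name();
        boolean canFix(Finding finding);
    }

    /** Build a fixer that claims findings of a single violation kind. */
    static Fixer fixer(final String name, final Violation target) {
        return new Fixer() {
            @Override public String name() { return name; }
            @Override public boolean canFix(Finding finding) { return finding.violation == target; }
        };
    }

    /** Walk the fixers in order for every finding; the first match claims it. */
    static List<String> plan(List<Fixer> fixers, List<Finding> findings) {
        List<String> actions = new ArrayList<>();
        for (Finding finding : findings) {
            for (Fixer fixer : fixers) {
                if (fixer.canFix(finding)) {
                    actions.add(fixer.name() + " " + finding.path);
                    break;
                }
            }
        }
        return actions;
    }

    public static void main(String[] args) {
        List<Fixer> fixers = Arrays.asList(
                fixer("fixMissingParents", Violation.MISSING_PARENT),
                fixer("fixOutOfDateEntries", Violation.OUT_OF_DATE_ENTRY));
        List<Finding> findings = Arrays.asList(
                new Finding(Violation.OUT_OF_DATE_ENTRY, "/a/b"),
                new Finding(Violation.MISSING_PARENT, "/c/d"));
        for (String action : plan(fixers, findings)) {
            System.out.println(action);
        }
    }
}
```

Keeping each fixer narrow would also make it straightforward to split the work into the separate subtasks the comment suggests.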
[jira] [Updated] (HADOOP-16396) Allow authoritative mode on a subdirectory
[ https://issues.apache.org/jira/browse/HADOOP-16396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Mackrory updated HADOOP-16396:
-----------------------------------
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Thanks, [~gabor.bota]. I neglected to fix the whitespace nits you pointed out before merging. I filed HADOOP-16409 for that and another follow-up tweak.

> Allow authoritative mode on a subdirectory
> ------------------------------------------
>
>                 Key: HADOOP-16396
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16396
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs/s3
>            Reporter: Sean Mackrory
>            Assignee: Sean Mackrory
>            Priority: Major
>         Attachments: HADOOP-16396.001.patch, HADOOP-16396.002.patch,
> HADOOP-16396.003.patch
>
>
> Let's allow authoritative mode to be applied only to a subset of a bucket.
> This is coming primarily from a Hive warehousing use-case where Hive-managed
> tables can benefit from query planning, but we can't speak for the rest of the
> bucket. This should be limited in scope and is not a general attempt to allow
> configuration on a per-path basis, as configuration is currently done on a
> per-process or a per-bucket basis.
> I propose a new property (we could overload
> fs.s3a.metadatastore.authoritative, but that seems likely to cause confusion
> somewhere). A string would be allowed that would then be qualified in the
> context of the FileSystem, and used to check if it is a prefix for a given
> path. If it is, we act as though authoritative mode is enabled. If not, we
> revert to the existing behavior of fs.s3a.metadatastore.authoritative (which
> in practice will probably be false, the default, if the new property is in
> use).
> Let's be clear about a few things:
> * Currently authoritative mode only short-cuts the process to avoid a
> round-trip to S3 if we know it is safe to do so. This means that even when
> authoritative mode is enabled for a bucket, if the metadata store does not
> have a complete (or "authoritative") current listing cached, authoritative
> mode still has no effect. This will still apply.
> * This will only apply to getFileStatus and listStatus, and internal calls to
> their internal counterparts. No other API is currently using authoritative
> mode to change behavior.
> * This will only apply to getFileStatus and listStatus calls INSIDE the
> configured prefix. If there is a recursive listing on the parent of the
> configured prefix, no change in behavior will be observed.
[jira] [Created] (HADOOP-16409) Allow authoritative mode on non-qualified paths
Sean Mackrory created HADOOP-16409:
--------------------------------------

             Summary: Allow authoritative mode on non-qualified paths
                 Key: HADOOP-16409
                 URL: https://issues.apache.org/jira/browse/HADOOP-16409
             Project: Hadoop Common
          Issue Type: Improvement
            Reporter: Sean Mackrory
            Assignee: Sean Mackrory


fs.s3a.authoritative.path currently requires a qualified URI (e.g. s3a://bucket/path), which is how I see this being used most immediately, but it also makes sense for someone to just be able to configure /path, if all of their buckets follow that pattern, or if they're providing configuration already in a bucket-specific context (e.g. job-level configs). We just need to qualify whatever is passed in to allowAuthoritative to make that work.

Also, in HADOOP-16396 Gabor pointed out a few whitespace nits that I neglected to fix before merging.
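Qualifying a bare /path against the bucket it is being used with could be sketched as below. This is an assumption-laden illustration using only java.net.URI: the class and method names are hypothetical, and real S3A code would presumably rely on Hadoop's own Path qualification instead. The sketch only accepts absolute paths and assumes the configured value contains no characters that need URI escaping.

```java
import java.net.URI;

/** Hypothetical sketch: qualify a configured authoritative path against a filesystem URI. */
class QualifyAuthPath {
    /**
     * Return the configured value unchanged if it already carries a scheme,
     * otherwise attach the scheme and bucket of the given filesystem URI.
     */
    static URI qualify(URI fsUri, String configured) {
        URI uri = URI.create(configured.trim());
        if (uri.getScheme() != null) {
            return uri; // already qualified, e.g. s3a://bucket/path
        }
        String path = uri.getPath();
        if (path == null || !path.startsWith("/")) {
            // Relative paths are out of scope for this sketch.
            throw new IllegalArgumentException("Expected an absolute path: " + configured);
        }
        // Scheme and authority come from fsUri; the path comes from the configured value.
        return fsUri.resolve(uri);
    }

    public static void main(String[] args) {
        URI fs = URI.create("s3a://bucket");
        System.out.println(qualify(fs, "/path"));
        System.out.println(qualify(fs, "s3a://other-bucket/path"));
    }
}
```

With this shape, a single /path setting shared across buckets would resolve to a different qualified URI in each FileSystem instance, which is the use case the issue describes.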
[jira] [Commented] (HADOOP-16396) Allow authoritative mode on a subdirectory
[ https://issues.apache.org/jira/browse/HADOOP-16396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16876283#comment-16876283 ]

Sean Mackrory commented on HADOOP-16396:
----------------------------------------

Submitted an iteration based on your feedback via PR.

I moved allowAuthoritative into S3Guard, as I agree simplifying S3AFS.java is worthwhile. It's a bit messy because it needs 4 long parameter names or adding getters to S3AFS anyway. I deduplicated a couple of the calls. I'd feel better if we would rename "allowAuthoritativeMetadataStore" and "allowAuthoritativePaths" to "authMode" and "authModePaths". So far, "auth mode" is just a short-hand being thrown around - any objection to formalizing that in code?

Renamed unguardedFS to rawFS. I'd like to keep "fullyAuthFS" to distinguish between other "guarded" FS that have other varying settings for authoritative paths, etc. I have a rather unique need for the other get*FS functions that take a directory as an argument, I think, so I'd rather not factor those out. I removed createNonAuthFS since I was indeed no longer using it.
[jira] [Commented] (HADOOP-16396) Allow authoritative mode on a subdirectory
[ https://issues.apache.org/jira/browse/HADOOP-16396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875275#comment-16875275 ]

Sean Mackrory commented on HADOOP-16396:
----------------------------------------

Added a couple of other things:
* Ability to specify multiple authoritative paths, comma-delimited (this kind of assumes you don't have commas in your paths, which I wish was a safe assumption, but I've seen weirder punctuation in paths before - not sure if it's worth doing more about that).
* Ensures specified paths are treated as directories and not prefixes (e.g. if you make /auth authoritative, /auth-plus-a-suffix is not included in that).

One downside is that ITestS3GuardOutOfBandOperations had some functions for checking that listings did and didn't contain specific entries, which I refactored into S3ATestUtils. Part of this includes more generic messages in failures. I hope [~gabor.bota] is okay with that.

This also changes .toString(), so IMO it warrants a Release Note when it goes in.
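The two additions in the comment above (comma-delimited paths, and directory rather than raw-prefix matching), combined with the fallback to fs.s3a.metadatastore.authoritative from the issue description, could be sketched as follows. This is not the committed S3Guard code: the class and helper names are hypothetical, plain strings stand in for Hadoop Path objects, and the paths are assumed to be already qualified.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the authoritative-path check described in the comment. */
class AuthPaths {
    /** Split a comma-delimited setting, skipping blanks and forcing a trailing slash. */
    static List<String> parse(String setting) {
        List<String> dirs = new ArrayList<>();
        for (String raw : setting.split(",")) {
            String dir = raw.trim();
            if (dir.isEmpty()) {
                continue;
            }
            dirs.add(dir.endsWith("/") ? dir : dir + "/");
        }
        return dirs;
    }

    /** True if path is the directory itself or lies beneath it. */
    static boolean isUnder(String dirWithSlash, String path) {
        // Comparing with trailing slashes keeps /auth from matching /auth-plus-a-suffix.
        String normalized = path.endsWith("/") ? path : path + "/";
        return normalized.startsWith(dirWithSlash);
    }

    /** Authoritative if under any configured directory, else fall back to the global flag. */
    static boolean allowAuthoritative(String path, List<String> authDirs, boolean globalAuth) {
        for (String dir : authDirs) {
            if (isUnder(dir, path)) {
                return true;
            }
        }
        return globalAuth;
    }

    public static void main(String[] args) {
        List<String> dirs = parse("s3a://bucket/auth, s3a://bucket/warehouse/");
        System.out.println(allowAuthoritative("s3a://bucket/auth/table", dirs, false));
        System.out.println(allowAuthoritative("s3a://bucket/auth-plus-a-suffix", dirs, false));
    }
}
```

Splitting on commas carries the limitation the comment already acknowledges: a path that itself contains a comma cannot be expressed in this form.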
[jira] [Updated] (HADOOP-16396) Allow authoritative mode on a subdirectory
[ https://issues.apache.org/jira/browse/HADOOP-16396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Mackrory updated HADOOP-16396:
-----------------------------------
    Attachment: HADOOP-16396.003.patch
[jira] [Updated] (HADOOP-16396) Allow authoritative mode on a subdirectory
[ https://issues.apache.org/jira/browse/HADOOP-16396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HADOOP-16396: --- Status: Patch Available (was: Open) > Allow authoritative mode on a subdirectory > -- > > Key: HADOOP-16396 > URL: https://issues.apache.org/jira/browse/HADOOP-16396 > Project: Hadoop Common > Issue Type: Improvement > Components: fs/s3 >Reporter: Sean Mackrory >Assignee: Sean Mackrory >Priority: Major > Attachments: HADOOP-16396.001.patch, HADOOP-16396.002.patch, > HADOOP-16396.003.patch > > > Let's allow authoritative mode to be applied only to a subset of a bucket. > This is coming primarily from a Hive warehousing use-case where Hive-managed > tables can benefit from query planning, but can't speak for the rest of the > bucket. This should be limited in scope and is not a general attempt to allow > configuration on a per-path basis, as configuration is currently done on a > per-process of a per-bucket basis. > I propose a new property (we could overload > fs.s3a.metadatastore.authoritative, but that seems likely to cause confusion > somewhere). A string would be allowed that would then be qualified in the > context of the FileSystem, and used to check if it is a prefix for a given > path. If it is, we act as though authoritative mode is enabled. If not, we > revert to the existing behavior of fs.s3a.metadatastore.authoritative (which > in practice will probably be false, the default, if the new property is in > use). > Let's be clear about a few things: > * Currently authoritative mode only short-cuts the process to avoid a > round-trip to S3 if we know it is safe to do so. This means that even when > authoritative mode is enabled for a bucket, if the metadata store does not > have a complete (or "authoritative") current listing cached, authoritative > mode still has no effect. This will still apply. 
> * This will only apply to getFileStatus and listStatus, and internal calls to > their internal counterparts. No other API is currently using authoritative > mode to change behavior. > * This will only apply to getFileStatus and listStatus calls INSIDE the > configured prefix. If there is a recursive listing on the parent of the > configured prefix, no change in behavior will be observed.
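The prefix check proposed above can be sketched in plain Java without any Hadoop dependencies. This is only an illustration under stated assumptions: the class name, the method names, and the s3a://bucket/ URI are hypothetical and are not taken from the attached patches, and real code would presumably qualify paths through the FileSystem rather than java.net.URI.

```java
import java.net.URI;

public class AuthoritativePrefixSketch {

    /** Hypothetical helper: qualify a path against the filesystem URI, dropping trailing slashes. */
    public static String qualify(URI fsUri, String path) {
        String qualified = fsUri.resolve(path).normalize().toString();
        while (qualified.endsWith("/")) {
            qualified = qualified.substring(0, qualified.length() - 1);
        }
        return qualified;
    }

    /**
     * Returns true if the path lies inside the configured prefix; otherwise
     * falls back to the bucket-wide authoritative setting.
     */
    public static boolean isAuthoritative(URI fsUri, String configuredPrefix,
                                          boolean bucketWide, String path) {
        if (configuredPrefix != null && !configuredPrefix.isEmpty()) {
            String prefix = qualify(fsUri, configuredPrefix);
            String target = qualify(fsUri, path);
            // Compare whole path components so /warehouse does not match /warehouse2.
            if (target.equals(prefix) || target.startsWith(prefix + "/")) {
                return true;
            }
        }
        return bucketWide;
    }

    public static void main(String[] args) {
        URI fs = URI.create("s3a://bucket/");
        System.out.println(isAuthoritative(fs, "/warehouse", false, "/warehouse/db/table"));
        System.out.println(isAuthoritative(fs, "/warehouse", false, "/warehouse2/db"));
        System.out.println(isAuthoritative(fs, "/warehouse", false, "/"));
    }
}
```

Matching on whole path components is a design choice of this sketch: a bare string prefix test would wrongly treat a sibling directory such as /warehouse2 as being inside /warehouse. A listing of the parent of the prefix falls through to the bucket-wide setting, in line with the last bullet above.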
[jira] [Commented] (HADOOP-16396) Allow authoritative mode on a subdirectory
[ https://issues.apache.org/jira/browse/HADOOP-16396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875113#comment-16875113 ] Sean Mackrory commented on HADOOP-16396: I see what I'm missing: listStatus does write-backs. listFiles does not. This makes sense because listFiles doesn't otherwise require us to pull enough data to populate complete rows in the metadata store. And it's okay because I'm fairly certain the Hive warehousing use-case mentioned above is primarily going to be using listStatus for query planning anyway. As a side note, maybe listFiles should somehow be able to populate partial rows that can fill in for future listFiles-like calls, even though they would be insufficient for future listStatus-like calls. Just a thought. In any case - after switching from listFiles to listStatus in my tests while trying to trigger a write-back, my tests now pass.
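The write-back behavior described in this comment can be illustrated with a toy model. Everything here is hypothetical: the class, its in-memory maps standing in for S3 and for the metadata store, and the call counter are inventions for illustration and do not mirror the S3A or S3Guard classes. The sketch only shows why a second listing skips the round-trip after a listStatus write-back but not after listFiles.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AuthoritativeListingSketch {

    /** A cached listing plus a flag saying whether it is known to be complete. */
    static final class CachedListing {
        final List<String> children;
        final boolean authoritative;

        CachedListing(List<String> children, boolean authoritative) {
            this.children = children;
            this.authoritative = authoritative;
        }
    }

    private final Map<String, List<String>> s3;   // stand-in for the object store
    private final Map<String, CachedListing> metadataStore = new HashMap<>();
    private final boolean authoritativeMode;
    public int s3Calls = 0;                       // round-trips to the stand-in store

    public AuthoritativeListingSketch(Map<String, List<String>> s3, boolean authoritativeMode) {
        this.s3 = s3;
        this.authoritativeMode = authoritativeMode;
    }

    /** Lists a directory, short-cutting only when a complete listing is cached. */
    public List<String> listStatus(String dir) {
        CachedListing cached = metadataStore.get(dir);
        if (authoritativeMode && cached != null && cached.authoritative) {
            return cached.children;               // safe to skip the round-trip
        }
        s3Calls++;
        List<String> fresh = s3.getOrDefault(dir, Collections.<String>emptyList());
        metadataStore.put(dir, new CachedListing(fresh, true));  // write-back
        return fresh;
    }

    /** In this sketch listFiles always queries the store and never writes back. */
    public List<String> listFiles(String dir) {
        s3Calls++;
        return s3.getOrDefault(dir, Collections.<String>emptyList());
    }

    public static void main(String[] args) {
        Map<String, List<String>> s3 = new HashMap<>();
        s3.put("/table", Arrays.asList("part-0", "part-1"));
        AuthoritativeListingSketch fs = new AuthoritativeListingSketch(s3, true);
        fs.listFiles("/table");
        fs.listFiles("/table");
        fs.listStatus("/table");
        fs.listStatus("/table");
        System.out.println("round-trips: " + fs.s3Calls);
    }
}
```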
[jira] [Updated] (HADOOP-16396) Allow authoritative mode on a subdirectory
[ https://issues.apache.org/jira/browse/HADOOP-16396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HADOOP-16396: --- Attachment: HADOOP-16396.002.patch
[jira] [Commented] (HADOOP-16396) Allow authoritative mode on a subdirectory
[ https://issues.apache.org/jira/browse/HADOOP-16396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16873692#comment-16873692 ] Sean Mackrory commented on HADOOP-16396: Attaching my current state, although I'm not done. My tests are failing because when I list a directory that we just listed a minute ago, it's still querying S3, even when authoritative mode should, in my understanding, be kicking in. The problem would seem to be that the first listing doesn't perform a write-back, and sure enough the metadata store never considers that directory listing authoritative. I ran all the (non-scale) tests and traced to confirm that at no point do any of them write back with authoritative=true. I thought we used to have logic in listings that would conditionally do a write-back, and I assumed that recent work would have included flipping the authoritative bit. Am I missing something? [~gabor.bota] [~ste...@apache.org]
[jira] [Updated] (HADOOP-16396) Allow authoritative mode on a subdirectory
[ https://issues.apache.org/jira/browse/HADOOP-16396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HADOOP-16396: --- Attachment: HADOOP-16396.001.patch
[jira] [Created] (HADOOP-16396) Allow authoritative mode on a subdirectory
Sean Mackrory created HADOOP-16396: -- Summary: Allow authoritative mode on a subdirectory Key: HADOOP-16396 URL: https://issues.apache.org/jira/browse/HADOOP-16396 Project: Hadoop Common Issue Type: Improvement Components: fs/s3 Reporter: Sean Mackrory Assignee: Sean Mackrory Let's allow authoritative mode to be applied only to a subset of a bucket. This is coming primarily from a Hive warehousing use-case where Hive-managed tables can benefit from query planning, but can't speak for the rest of the bucket. This should be limited in scope and is not a general attempt to allow configuration on a per-path basis, as configuration is currently done on a per-process or a per-bucket basis. I propose a new property (we could overload fs.s3a.metadatastore.authoritative, but that seems likely to cause confusion somewhere). A string would be allowed that would then be qualified in the context of the FileSystem, and used to check if it is a prefix for a given path. If it is, we act as though authoritative mode is enabled. If not, we revert to the existing behavior of fs.s3a.metadatastore.authoritative (which in practice will probably be false, the default, if the new property is in use). Let's be clear about a few things: * Currently authoritative mode only short-cuts the process to avoid a round-trip to S3 if we know it is safe to do so. This means that even when authoritative mode is enabled for a bucket, if the metadata store does not have a complete (or "authoritative") current listing cached, authoritative mode still has no effect. This will still apply. * This will only apply to getFileStatus and listStatus, and internal calls to their internal counterparts. No other API is currently using authoritative mode to change behavior. * This will only apply to getFileStatus and listStatus calls INSIDE the configured prefix. If there is a recursive listing on the parent of the configured prefix, no change in behavior will be observed. 
[jira] [Commented] (HADOOP-13551) hook up AwsSdkMetrics to hadoop metrics
[ https://issues.apache.org/jira/browse/HADOOP-13551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16873409#comment-16873409 ] Sean Mackrory commented on HADOOP-13551: As an update, every time someone has asked about this, I've pointed out the JDK property that allows the SDK to just publish metrics to CloudWatch, and so far that's exceeded everyone's expectations. I'm not feeling any continued demand for this... But knowing about throttling would be nice - I don't know if CloudWatch shows that off the top of my head. > hook up AwsSdkMetrics to hadoop metrics > --- > > Key: HADOOP-13551 > URL: https://issues.apache.org/jira/browse/HADOOP-13551 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.0.0-beta1 >Reporter: Steve Loughran >Assignee: Sean Mackrory >Priority: Major > > There's an API in {{com.amazonaws.metrics.AwsSdkMetrics}} to give access to > the internal metrics of the AWS libraries. We might want to get at those
[jira] [Updated] (HADOOP-15729) [s3a] stop treat fs.s3a.max.threads as the long-term minimum
[ https://issues.apache.org/jira/browse/HADOOP-15729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HADOOP-15729: --- Status: Patch Available (was: Open) > [s3a] stop treat fs.s3a.max.threads as the long-term minimum > > > Key: HADOOP-15729 > URL: https://issues.apache.org/jira/browse/HADOOP-15729 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Reporter: Sean Mackrory >Assignee: Sean Mackrory >Priority: Major > Attachments: HADOOP-15729.001.patch, HADOOP-15729.002.patch > > > A while ago the s3a connector started experiencing deadlocks because the AWS > SDK requires an unbounded threadpool. It places monitoring tasks on the work > queue before the tasks they wait on, so it's possible (has even happened with > larger-than-default threadpools) for the executor to become permanently > saturated and deadlock. > So we started giving an unbounded threadpool executor to the SDK, and using a > bounded, blocking threadpool service for everything else S3A needs (although > currently that's only in the S3ABlockOutputStream). fs.s3a.max.threads then > only limits this threadpool, however we also specified fs.s3a.max.threads as > the number of core threads in the unbounded threadpool, which in hindsight is > pretty terrible. > Currently those core threads do not timeout, so this is actually setting a > sort of minimum. Once that many tasks have been submitted, the threadpool > will be locked at that number until it bursts beyond that, but it will only > spin down that far. If fs.s3a.max.threads is set reasonably high and someone > uses a bunch of S3 buckets, they could easily have thousands of idle threads > constantly. > We should either not use fs.s3a.max.threads for the corepool size and > introduce a new configuration, or we should simply allow core threads to > timeout. 
I'm reading the OpenJDK source now to see what subtle differences > there are between core threads and other threads if core threads can timeout.
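The difference between the two behaviors discussed in this description can be reproduced with the JDK's ThreadPoolExecutor alone. The sketch below is not the S3A code: the class name, the pool size of 4, and the 100 ms keep-alive are arbitrary values chosen for illustration. It builds a pool whose core size plays the role of fs.s3a.max.threads, runs a few tasks, and reports how many threads remain once the pool has been idle for longer than the keep-alive time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CoreThreadTimeoutSketch {

    /** Runs four trivial tasks, idles past the keep-alive time, and returns the pool size. */
    public static int poolSizeAfterIdle(boolean allowCoreTimeout) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                4,                      // core threads (illustrative value)
                Integer.MAX_VALUE,      // effectively unbounded maximum
                100L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
        pool.allowCoreThreadTimeOut(allowCoreTimeout);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < 4; i++) {
                futures.add(pool.submit(() -> { }));  // each submit starts a core thread
            }
            for (Future<?> f : futures) {
                f.get();
            }
            Thread.sleep(1000L);        // idle for well over the keep-alive time
            return pool.getPoolSize();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("core timeout off: " + poolSizeAfterIdle(false));
        System.out.println("core timeout on:  " + poolSizeAfterIdle(true));
    }
}
```

With core timeouts off, the idle pool should stay at its core size, which is the "long-term minimum" the issue title refers to; with allowCoreThreadTimeOut(true) the idle pool should shrink to zero threads.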
[jira] [Commented] (HADOOP-15729) [s3a] stop treat fs.s3a.max.threads as the long-term minimum
[ https://issues.apache.org/jira/browse/HADOOP-15729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16872630#comment-16872630 ] Sean Mackrory commented on HADOOP-15729: So I tried out some of the same 1 GB uploads in the same region. Could vary that more, but as it is I think I'm just seeing a lot of noise in the underlying performance. I *tended* to see that separating core thread and max thread counts resulted in worse performance - I mainly tested with 50 max threads, and any combination of 0 core threads, 10 core threads, prestarting or not, all *tended* to have very similar decreases in performance. I'm not sure how to explain that - except for the "tended" - once in a while with an identical configuration I'd see the same drop in performance (in the best case, upload takes ~25 seconds, when there was a drop in performance it was in the neighborhood of 35 - 45 seconds). So it'd be quite the task to do a statistically rigorous experiment on this... So here's what I propose: what really shouldn't, and in my testing *tended* not to, have any impact on the short-term performance characteristics but would also completely solve the problem long-running processes have, is simply allowing core threads anywhere to time out. We already have a timeout configured, we just only use it for the BlockingThreadPoolExecutorService, and not for the unbounded threadpool we give to Amazon. I think this is the safe and right choice. Patch attached. Side note: I'm seeing that I can't mv or cp on S3 because it says the destination exists when it doesn't, and I'm getting some Maven errors packaging stuff in S3AFileSystem because of JavaDoc errors (an errant <, and a missing param). I'll follow up on those in separate JIRAs. 
[jira] [Updated] (HADOOP-15729) [s3a] stop treat fs.s3a.max.threads as the long-term minimum
[ https://issues.apache.org/jira/browse/HADOOP-15729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HADOOP-15729: --- Attachment: HADOOP-15729.002.patch
[jira] [Comment Edited] (HADOOP-15729) [s3a] stop treat fs.s3a.max.threads as the long-term minimum
[ https://issues.apache.org/jira/browse/HADOOP-15729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16872396#comment-16872396 ] Sean Mackrory edited comment on HADOOP-15729 at 6/25/19 2:55 PM: - Having slept on it, I think we should go ahead and undeprecate fs.s3a.threads.core (just saw in the code we had it once) but have it default to 0. A non-zero default would either be dangerously high (I know that some have set it to low values to work around the problem I described above) and we can't have core > max, or it'll be uselessly low with a value of 1 or 2. But the more I think about this the less averse I am to making it tunable right away, especially since the property has existed before but has been unused for some time. edit: it's actually only deprecated in branch-2 for compatibility reasons. In trunk, it's gone entirely. This happened at the time of the BlockingThreadPoolExecutor implementation. was (Author: mackrorysd): Having slept on it, I think we should go ahead and undeprecate fs.s3a.threads.core (just saw in the code we had it once) but have it default to 0. A non-zero default is dangerous because if anyone has set max threads to 1 (and I know some people have set it to low values - albeit not THAT low - to work around the problem I described above) and suddenly core threads is, say, 5, you'll suddenly start getting exceptions on upgrade (or we'll have to handle that exception and rather quietly fail over to 0 core threads, which doesn't seem like expected behavior). But the more I think about this the less averse I am to making it tunable right away, especially since the property has existed before but has been unused for some time. edit: it's actually only deprecated in branch-2 for compatibility reasons. In trunk, it's gone entirely. This happened at the time of the BlockingThreadPoolExecutor implementation. 
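The "we can't have core > max" constraint mentioned in this comment comes from the JDK itself: the ThreadPoolExecutor constructor rejects a maximum pool size smaller than the core pool size. A minimal sketch follows; the class and method names are hypothetical, and the values mirror the examples in the comment (max threads set to 1, a would-be default of 5 core threads).

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CoreVersusMaxSketch {

    /** Returns true if a pool with the given core and max sizes can be constructed. */
    public static boolean canBuildPool(int core, int max) {
        try {
            ThreadPoolExecutor pool = new ThreadPoolExecutor(
                    core, max, 60L, TimeUnit.SECONDS,
                    new LinkedBlockingQueue<Runnable>());
            pool.shutdown();
            return true;
        } catch (IllegalArgumentException e) {
            // Thrown by the constructor when max < core (among other invalid arguments).
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("core=5, max=1: " + canBuildPool(5, 1));
        System.out.println("core=0, max=1: " + canBuildPool(0, 1));
    }
}
```

A default of 0 core threads is valid for any positive maximum, which is why it avoids the upgrade-time exception the comment worries about.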
[jira] [Comment Edited] (HADOOP-15729) [s3a] stop treat fs.s3a.max.threads as the long-term minimum
[ https://issues.apache.org/jira/browse/HADOOP-15729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16872396#comment-16872396 ] Sean Mackrory edited comment on HADOOP-15729 at 6/25/19 2:32 PM: - Having slept on it, I think we should go ahead and undeprecate fs.s3a.threads.core (just saw in the code we had it once) but have it default to 0. A non-zero default is dangerous because if anyone has set max threads to 1 (and I know some people have set it to low values - albeit not THAT low - to work around the problem I described above) and suddenly core threads is, say, 5, you'll suddenly start getting exceptions on upgrade (or we'll have to handle that exception and rather quietly fail over to 0 core threads, which doesn't seem like expected behavior). But the more I think about this the less averse I am to making it tunable right away, especially since the property has existed before but has been unused for some time. edit: it's actually only deprecated in branch-2 for compatibility reasons. In trunk, it's gone entirely. This happened at the time of the BlockingThreadPoolExecutor implementation. was (Author: mackrorysd): Having slept on it, I think we should go ahead and undeprecate fs.s3a.threads.core (just saw in the code we had it once) but have it default to 0. A non-zero default is dangerous because if anyone has set max threads to 1 (and I know some people have set it to low values - albeit not THAT low - to work around the problem I described above) and suddenly core threads is, say, 5, you'll suddenly start getting exceptions on upgrade (or we'll have to handle that exception and rather quietly fail over to 0 core threads, which doesn't seem like expected behavior). But the more I think about this the less averse I am to making it tunable right away, especially since the property has existed before but has been unused for some time. 
[jira] [Commented] (HADOOP-15729) [s3a] stop treat fs.s3a.max.threads as the long-term minimum
[ https://issues.apache.org/jira/browse/HADOOP-15729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16872396#comment-16872396 ] Sean Mackrory commented on HADOOP-15729: Having slept on it, I think we should go ahead and undeprecate fs.s3a.threads.core (just saw in the code we had it once) but have it default to 0. A non-zero default is dangerous because if anyone has set max threads to 1 (and I know some people have set it to low values - albeit not THAT low - to work around the problem I described above) and suddenly core threads is, say, 5, you'll suddenly start getting exceptions on upgrade (or we'll have to handle that exception and rather quietly fail over to 0 core threads, which doesn't seem like expected behavior). But the more I think about this the less averse I am to making it tunable right away, especially since the property has existed before but has been unused for some time. > [s3a] stop treat fs.s3a.max.threads as the long-term minimum > > > Key: HADOOP-15729 > URL: https://issues.apache.org/jira/browse/HADOOP-15729 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Reporter: Sean Mackrory >Assignee: Sean Mackrory >Priority: Major > Attachments: HADOOP-15729.001.patch > > > A while ago the s3a connector started experiencing deadlocks because the AWS > SDK requires an unbounded threadpool. It places monitoring tasks on the work > queue before the tasks they wait on, so it's possible (has even happened with > larger-than-default threadpools) for the executor to become permanently > saturated and deadlock. > So we started giving an unbounded threadpool executor to the SDK, and using a > bounded, blocking threadpool service for everything else S3A needs (although > currently that's only in the S3ABlockOutputStream). 
fs.s3a.max.threads then > only limits this threadpool, however we also specified fs.s3a.max.threads as > the number of core threads in the unbounded threadpool, which in hindsight is > pretty terrible. > Currently those core threads do not timeout, so this is actually setting a > sort of minimum. Once that many tasks have been submitted, the threadpool > will be locked at that number until it bursts beyond that, but it will only > spin down that far. If fs.s3a.max.threads is set reasonably high and someone > uses a bunch of S3 buckets, they could easily have thousands of idle threads > constantly. > We should either not use fs.s3a.max.threads for the corepool size and > introduce a new configuration, or we should simply allow core threads to > timeout. I'm reading the OpenJDK source now to see what subtle differences > there are between core threads and other threads if core threads can timeout. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15729) [s3a] stop treat fs.s3a.max.threads as the long-term minimum
[ https://issues.apache.org/jira/browse/HADOOP-15729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16871830#comment-16871830 ] Sean Mackrory commented on HADOOP-15729: Attached a patch with my proposed change. Unfortunately, it has a bigger impact on simple things like a 1 GB upload than I thought, although it's hard to be sure it's not noise. See below for numbers to upload a 1 GB file from a machine in us-west-2 to a bucket / pre-created DynamoDB table in the same region. Maybe this is worth adding fs.s3a.core.threads with a moderate default value. Long-running processes (like Hive Server2) might access many buckets and fs.s3a.max.threads grows absolutely out of control - core threads could still do the same unless it's *much* lower, in which case you'd easily hit this performance regression anyway. I would suggest we just proceed and consider fs.s3a.core.threads if further performance testing reveals an issue. Thoughts?
Without change:
{code}
real 0m27.415s  user 0m25.128s  sys 0m6.377s
real 0m25.360s  user 0m25.081s  sys 0m6.368s
real 0m27.615s  user 0m25.296s  sys 0m6.015s
real 0m25.001s  user 0m25.408s  sys 0m6.717s
real 0m28.083s  user 0m24.764s  sys 0m5.774s
real 0m26.117s  user 0m25.192s  sys 0m5.867s
{code}
With change:
{code}
real 0m28.928s  user 0m24.182s  sys 0m5.699s
real 0m33.359s  user 0m25.508s  sys 0m6.407s
real 0m44.412s  user 0m24.565s  sys 0m6.226s
real 0m27.469s  user 0m25.326s  sys 0m6.142s
real 0m35.660s  user 0m25.206s  sys 0m6.154s
real 0m31.811s  user 0m25.042s  sys 0m6.057s
{code}
> [s3a] stop treat fs.s3a.max.threads as the long-term minimum > > > Key: HADOOP-15729 > URL: https://issues.apache.org/jira/browse/HADOOP-15729 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Reporter: Sean Mackrory >Assignee: Sean Mackrory >Priority: Major > Attachments: HADOOP-15729.001.patch > > > A while ago the s3a connector started experiencing deadlocks because the AWS > SDK requires an unbounded threadpool. 
It places monitoring tasks on the work > queue before the tasks they wait on, so it's possible (has even happened with > larger-than-default threadpools) for the executor to become permanently > saturated and deadlock. > So we started giving an unbounded threadpool executor to the SDK, and using a > bounded, blocking threadpool service for everything else S3A needs (although > currently that's only in the S3ABlockOutputStream). fs.s3a.max.threads then > only limits this threadpool, however we also specified fs.s3a.max.threads as > the number of core threads in the unbounded threadpool, which in hindsight is > pretty terrible. > Currently those core threads do not timeout, so this is actually setting a > sort of minimum. Once that many tasks have been submitted, the threadpool > will be locked at that number until it bursts beyond that, but it will only > spin down that far. If fs.s3a.max.threads is set reasonably high and someone > uses a bunch of S3 buckets, they could easily have thousands of idle threads > constantly. > We should either not use fs.s3a.max.threads for the corepool size and > introduce a new configuration, or we should simply allow core threads to > timeout. I'm reading the OpenJDK source now to see what subtle differences > there are between core threads and other threads if core threads can timeout. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15729) [s3a] stop treat fs.s3a.max.threads as the long-term minimum
[ https://issues.apache.org/jira/browse/HADOOP-15729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HADOOP-15729: --- Attachment: HADOOP-15729.001.patch > [s3a] stop treat fs.s3a.max.threads as the long-term minimum > > > Key: HADOOP-15729 > URL: https://issues.apache.org/jira/browse/HADOOP-15729 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Reporter: Sean Mackrory >Assignee: Sean Mackrory >Priority: Major > Attachments: HADOOP-15729.001.patch > > > A while ago the s3a connector started experiencing deadlocks because the AWS > SDK requires an unbounded threadpool. It places monitoring tasks on the work > queue before the tasks they wait on, so it's possible (has even happened with > larger-than-default threadpools) for the executor to become permanently > saturated and deadlock. > So we started giving an unbounded threadpool executor to the SDK, and using a > bounded, blocking threadpool service for everything else S3A needs (although > currently that's only in the S3ABlockOutputStream). fs.s3a.max.threads then > only limits this threadpool, however we also specified fs.s3a.max.threads as > the number of core threads in the unbounded threadpool, which in hindsight is > pretty terrible. > Currently those core threads do not timeout, so this is actually setting a > sort of minimum. Once that many tasks have been submitted, the threadpool > will be locked at that number until it bursts beyond that, but it will only > spin down that far. If fs.s3a.max.threads is set reasonably high and someone > uses a bunch of S3 buckets, they could easily have thousands of idle threads > constantly. > We should either not use fs.s3a.max.threads for the corepool size and > introduce a new configuration, or we should simply allow core threads to > timeout. I'm reading the OpenJDK source now to see what subtle differences > there are between core threads and other threads if core threads can timeout. 
[jira] [Updated] (HADOOP-16211) Update guava to 27.0-jre in hadoop-project branch-3.2
[ https://issues.apache.org/jira/browse/HADOOP-16211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HADOOP-16211: --- Resolution: Fixed Status: Resolved (was: Patch Available) > Update guava to 27.0-jre in hadoop-project branch-3.2 > - > > Key: HADOOP-16211 > URL: https://issues.apache.org/jira/browse/HADOOP-16211 > Project: Hadoop Common > Issue Type: Sub-task >Affects Versions: 3.2.0 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Major > Attachments: HADOOP-16211-branch-3.2.001.patch, > HADOOP-16211-branch-3.2.002.patch, HADOOP-16211-branch-3.2.003.patch, > HADOOP-16211-branch-3.2.004.patch, HADOOP-16211-branch-3.2.005.patch, > HADOOP-16211-branch-3.2.006.patch > > > com.google.guava:guava should be upgraded to 27.0-jre due to new CVE's found > CVE-2018-10237. > This is a sub-task for branch-3.2 from HADOOP-15960 to track issues on that > particular branch. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-16211) Update guava to 27.0-jre in hadoop-project branch-3.2
[ https://issues.apache.org/jira/browse/HADOOP-16211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16863098#comment-16863098 ] Sean Mackrory commented on HADOOP-16211: Thanks for getting to the bottom of that. Will commit, pending verifying a good build locally still... > Update guava to 27.0-jre in hadoop-project branch-3.2 > - > > Key: HADOOP-16211 > URL: https://issues.apache.org/jira/browse/HADOOP-16211 > Project: Hadoop Common > Issue Type: Sub-task >Affects Versions: 3.2.0 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Major > Attachments: HADOOP-16211-branch-3.2.001.patch, > HADOOP-16211-branch-3.2.002.patch, HADOOP-16211-branch-3.2.003.patch, > HADOOP-16211-branch-3.2.004.patch, HADOOP-16211-branch-3.2.005.patch, > HADOOP-16211-branch-3.2.006.patch > > > com.google.guava:guava should be upgraded to 27.0-jre due to new CVE's found > CVE-2018-10237. > This is a sub-task for branch-3.2 from HADOOP-15960 to track issues on that > particular branch. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-16213) Update guava to 27.0-jre in hadoop-project branch-3.1
[ https://issues.apache.org/jira/browse/HADOOP-16213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HADOOP-16213: --- Resolution: Fixed Status: Resolved (was: Patch Available) > Update guava to 27.0-jre in hadoop-project branch-3.1 > - > > Key: HADOOP-16213 > URL: https://issues.apache.org/jira/browse/HADOOP-16213 > Project: Hadoop Common > Issue Type: Sub-task >Affects Versions: 3.1.0, 3.1.1, 3.1.2 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Critical > Attachments: HADOOP-16213-branch-3.1.001.patch, > HADOOP-16213-branch-3.1.002.patch, HADOOP-16213-branch-3.1.003.patch, > HADOOP-16213-branch-3.1.004.patch, HADOOP-16213-branch-3.1.005.patch, > HADOOP-16213-branch-3.1.006.patch > > > com.google.guava:guava should be upgraded to 27.0-jre due to new CVE's found > CVE-2018-10237. > This is a sub-task for branch-3.1 from HADOOP-15960 to track issues on that > particular branch. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-16213) Update guava to 27.0-jre in hadoop-project branch-3.1
[ https://issues.apache.org/jira/browse/HADOOP-16213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16863097#comment-16863097 ] Sean Mackrory commented on HADOOP-16213: Thanks for getting to the bottom of that. Build looks good to me otherwise. Committed! > Update guava to 27.0-jre in hadoop-project branch-3.1 > - > > Key: HADOOP-16213 > URL: https://issues.apache.org/jira/browse/HADOOP-16213 > Project: Hadoop Common > Issue Type: Sub-task >Affects Versions: 3.1.0, 3.1.1, 3.1.2 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Critical > Attachments: HADOOP-16213-branch-3.1.001.patch, > HADOOP-16213-branch-3.1.002.patch, HADOOP-16213-branch-3.1.003.patch, > HADOOP-16213-branch-3.1.004.patch, HADOOP-16213-branch-3.1.005.patch, > HADOOP-16213-branch-3.1.006.patch > > > com.google.guava:guava should be upgraded to 27.0-jre due to new CVE's found > CVE-2018-10237. > This is a sub-task for branch-3.1 from HADOOP-15960 to track issues on that > particular branch. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-16213) Update guava to 27.0-jre in hadoop-project branch-3.1
[ https://issues.apache.org/jira/browse/HADOOP-16213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858775#comment-16858775 ] Sean Mackrory commented on HADOOP-16213: Same as HADOOP-16211 - +1 on the code, but there's a different set of hadoop-yarn failures we need to check are actually flaky. > Update guava to 27.0-jre in hadoop-project branch-3.1 > - > > Key: HADOOP-16213 > URL: https://issues.apache.org/jira/browse/HADOOP-16213 > Project: Hadoop Common > Issue Type: Sub-task >Affects Versions: 3.1.0, 3.1.1, 3.1.2 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Critical > Attachments: HADOOP-16213-branch-3.1.001.patch, > HADOOP-16213-branch-3.1.002.patch, HADOOP-16213-branch-3.1.003.patch, > HADOOP-16213-branch-3.1.004.patch, HADOOP-16213-branch-3.1.005.patch, > HADOOP-16213-branch-3.1.006.patch > > > com.google.guava:guava should be upgraded to 27.0-jre due to new CVE's found > CVE-2018-10237. > This is a sub-task for branch-3.1 from HADOOP-15960 to track issues on that > particular branch. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-16211) Update guava to 27.0-jre in hadoop-project branch-3.2
[ https://issues.apache.org/jira/browse/HADOOP-16211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858767#comment-16858767 ] Sean Mackrory commented on HADOOP-16211: +1 on the patch, but I don't immediately recognize the hadoop-yarn test failures as being likely flakes. Have you looked into that already? Need to confirm they're not caused by some change in Guava. > Update guava to 27.0-jre in hadoop-project branch-3.2 > - > > Key: HADOOP-16211 > URL: https://issues.apache.org/jira/browse/HADOOP-16211 > Project: Hadoop Common > Issue Type: Sub-task >Affects Versions: 3.2.0 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Major > Attachments: HADOOP-16211-branch-3.2.001.patch, > HADOOP-16211-branch-3.2.002.patch, HADOOP-16211-branch-3.2.003.patch, > HADOOP-16211-branch-3.2.004.patch, HADOOP-16211-branch-3.2.005.patch, > HADOOP-16211-branch-3.2.006.patch > > > com.google.guava:guava should be upgraded to 27.0-jre due to new CVE's found > CVE-2018-10237. > This is a sub-task for branch-3.2 from HADOOP-15960 to track issues on that > particular branch. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-16212) Update guava to 27.0-jre in hadoop-project branch-3.0
[ https://issues.apache.org/jira/browse/HADOOP-16212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HADOOP-16212: --- Resolution: Fixed Status: Resolved (was: Patch Available) Committed. As discussed offline: we need to remember to re-email the other communities' mailing lists when you're done with the 3.x branches to remind them. > Update guava to 27.0-jre in hadoop-project branch-3.0 > - > > Key: HADOOP-16212 > URL: https://issues.apache.org/jira/browse/HADOOP-16212 > Project: Hadoop Common > Issue Type: Sub-task >Affects Versions: 3.0.0, 3.0.1, 3.0.2, 3.0.3 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Major > Attachments: HADOOP-16212-branch-3.0.001.patch, > HADOOP-16212-branch-3.0.002.patch, HADOOP-16212-branch-3.0.003.patch, > HADOOP-16212-branch-3.0.004.patch, HADOOP-16212-branch-3.0.005.patch > > > com.google.guava:guava should be upgraded to 27.0-jre due to new CVE's found > CVE-2018-10237. > This is a sub-task for branch-3.0 from HADOOP-15960 to track issues on that > particular branch. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-16212) Update guava to 27.0-jre in hadoop-project branch-3.0
[ https://issues.apache.org/jira/browse/HADOOP-16212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16854601#comment-16854601 ] Sean Mackrory commented on HADOOP-16212: +1. Will commit... > Update guava to 27.0-jre in hadoop-project branch-3.0 > - > > Key: HADOOP-16212 > URL: https://issues.apache.org/jira/browse/HADOOP-16212 > Project: Hadoop Common > Issue Type: Sub-task >Affects Versions: 3.0.0, 3.0.1, 3.0.2, 3.0.3 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Major > Attachments: HADOOP-16212-branch-3.0.001.patch, > HADOOP-16212-branch-3.0.002.patch, HADOOP-16212-branch-3.0.003.patch, > HADOOP-16212-branch-3.0.004.patch, HADOOP-16212-branch-3.0.005.patch > > > com.google.guava:guava should be upgraded to 27.0-jre due to new CVE's found > CVE-2018-10237. > This is a sub-task for branch-3.0 from HADOOP-15960 to track issues on that > particular branch. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Resolved] (HADOOP-16321) ITestS3ASSL+TestOpenSSLSocketFactory failing with java.lang.UnsatisfiedLinkErrors
[ https://issues.apache.org/jira/browse/HADOOP-16321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory resolved HADOOP-16321. Resolution: Fixed > ITestS3ASSL+TestOpenSSLSocketFactory failing with > java.lang.UnsatisfiedLinkErrors > - > > Key: HADOOP-16321 > URL: https://issues.apache.org/jira/browse/HADOOP-16321 > Project: Hadoop Common > Issue Type: Bug > Components: fs/s3, test >Affects Versions: 3.3.0 > Environment: macos >Reporter: Steve Loughran >Assignee: Sahil Takiar >Priority: Major > > the new test of HADOOP-16050 {{ITestS3ASSL}} is failing with > {{java.lang.UnsatisfiedLinkError}} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15999) S3Guard: Better support for out-of-band operations
[ https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840846#comment-16840846 ] Sean Mackrory commented on HADOOP-15999: {quote} Seems like a common case{quote} Yeah that's actually the exact case that motivated this ticket, IIRC... > S3Guard: Better support for out-of-band operations > -- > > Key: HADOOP-15999 > URL: https://issues.apache.org/jira/browse/HADOOP-15999 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.1.0 >Reporter: Sean Mackrory >Assignee: Gabor Bota >Priority: Major > Fix For: 3.3.0 > > Attachments: HADOOP-15999-007.patch, HADOOP-15999.001.patch, > HADOOP-15999.002.patch, HADOOP-15999.003.patch, HADOOP-15999.004.patch, > HADOOP-15999.005.patch, HADOOP-15999.006.patch, HADOOP-15999.008.patch, > HADOOP-15999.009.patch, out-of-band-operations.patch > > > S3Guard was initially done on the premise that a new MetadataStore would be > the source of truth, and that it wouldn't provide guarantees if updates were > done without using S3Guard. > I've been seeing increased demand for better support for scenarios where > operations are done on the data that can't reasonably be done with S3Guard > involved. For example: > * A file is deleted using S3Guard, and replaced by some other tool. S3Guard > can't tell the difference between the new file and delete / list > inconsistency and continues to treat the file as deleted. > * An S3Guard-ed file is overwritten by a longer file by some other tool. When > reading the file, only the length of the original file is read. > We could possibly have smarter behavior here by querying both S3 and the > MetadataStore (even in cases where we may currently only query the > MetadataStore in getFileStatus) and use whichever one has the higher modified > time. 
> This kills the performance boost we currently get in some workloads with the > short-circuited getFileStatus, but we could keep it with authoritative mode > which should give a larger performance boost. At least we'd get more > correctness without authoritative mode and a clear declaration of when we can > make the assumptions required to short-circuit the process. If we can't > consider S3Guard the source of truth, we need to defer to S3 more. > We'd need to be extra sure of any locality / time zone issues if we start > relying on mod_time more directly, but currently we're tracking the > modification time as returned by S3 anyway. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
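The "use whichever one has the higher modified time" idea from the description above can be sketched roughly as follows. This is only an illustration under stated assumptions: Status and newest are hypothetical names rather than the S3Guard MetadataStore API, a null argument stands for "this source has no entry", ties are resolved in favour of S3, and tombstones (deleted-file markers) are deliberately left out.

```java
/** Hypothetical illustration only; not the S3Guard implementation. */
public class NewestStatusSketch {

  /** Minimal stand-in for a file status: path, length and modification time. */
  public static final class Status {
    public final String path;
    public final long length;
    public final long modTime;

    public Status(String path, long length, long modTime) {
      this.path = path;
      this.length = length;
      this.modTime = modTime;
    }
  }

  /**
   * Picks the entry with the higher modification time. If only one source
   * has an entry, that entry wins; on a tie, S3 is preferred (an assumption).
   */
  public static Status newest(Status fromMetadataStore, Status fromS3) {
    if (fromMetadataStore == null) {
      return fromS3;
    }
    if (fromS3 == null) {
      return fromMetadataStore;
    }
    return fromS3.modTime >= fromMetadataStore.modTime
        ? fromS3 : fromMetadataStore;
  }

  public static void main(String[] args) {
    // The out-of-band overwrite case: S3 holds a longer, newer file.
    Status cached = new Status("s3a://bucket/data", 10L, 1000L);
    Status actual = new Status("s3a://bucket/data", 25L, 2000L);
    System.out.println("length used for read: " + newest(cached, actual).length);
  }
}
```

In the overwrite example from the description, this comparison would pick the S3 entry, so the reader sees the full length of the new file. The cost is the extra S3 call on every lookup, which is the performance trade-off the description mentions.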
[jira] [Comment Edited] (HADOOP-15999) S3Guard: Better support for out-of-band operations
[ https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840846#comment-16840846 ] Sean Mackrory edited comment on HADOOP-15999 at 5/15/19 10:49 PM: -- {quote} Seems like a common case{quote} Yeah that's actually the exact case that motivated this ticket in the first place, IIRC... was (Author: mackrorysd): {quote} Seems like a common case{quote} Yeah that's actually the exact case that motivated this ticket, IIRC... > S3Guard: Better support for out-of-band operations > -- > > Key: HADOOP-15999 > URL: https://issues.apache.org/jira/browse/HADOOP-15999 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.1.0 >Reporter: Sean Mackrory >Assignee: Gabor Bota >Priority: Major > Fix For: 3.3.0 > > Attachments: HADOOP-15999-007.patch, HADOOP-15999.001.patch, > HADOOP-15999.002.patch, HADOOP-15999.003.patch, HADOOP-15999.004.patch, > HADOOP-15999.005.patch, HADOOP-15999.006.patch, HADOOP-15999.008.patch, > HADOOP-15999.009.patch, out-of-band-operations.patch > > > S3Guard was initially done on the premise that a new MetadataStore would be > the source of truth, and that it wouldn't provide guarantees if updates were > done without using S3Guard. > I've been seeing increased demand for better support for scenarios where > operations are done on the data that can't reasonably be done with S3Guard > involved. For example: > * A file is deleted using S3Guard, and replaced by some other tool. S3Guard > can't tell the difference between the new file and delete / list > inconsistency and continues to treat the file as deleted. > * An S3Guard-ed file is overwritten by a longer file by some other tool. When > reading the file, only the length of the original file is read. 
> We could possibly have smarter behavior here by querying both S3 and the > MetadataStore (even in cases where we may currently only query the > MetadataStore in getFileStatus) and use whichever one has the higher modified > time. > This kills the performance boost we currently get in some workloads with the > short-circuited getFileStatus, but we could keep it with authoritative mode > which should give a larger performance boost. At least we'd get more > correctness without authoritative mode and a clear declaration of when we can > make the assumptions required to short-circuit the process. If we can't > consider S3Guard the source of truth, we need to defer to S3 more. > We'd need to be extra sure of any locality / time zone issues if we start > relying on mod_time more directly, but currently we're tracking the > modification time as returned by S3 anyway. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-16281) ABFS: Rename operation, GetFileStatus before rename operation and throw exception on the driver side
[ https://issues.apache.org/jira/browse/HADOOP-16281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16832524#comment-16832524 ] Sean Mackrory commented on HADOOP-16281: [~DanielZhou] I actually hadn't thought of it being applicable to Azure because the WASB connector already had similar mechanisms, and ADLS Gen1 and Gen2 already offer the required file-system semantics. But then I remembered you can turn off Hierarchical Namespace :) If people want to run HBase without that feature for some reason, then yes, I'd love to make sure it supports ABFS well. > ABFS: Rename operation, GetFileStatus before rename operation and throw > exception on the driver side > - > > Key: HADOOP-16281 > URL: https://issues.apache.org/jira/browse/HADOOP-16281 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/azure >Affects Versions: 3.2.0 >Reporter: Da Zhou >Assignee: Da Zhou >Priority: Major > > ABFS should add the rename with options: > [https://github.com/apache/hadoop/pull/743]
[jira] [Updated] (HADOOP-16222) Fix new deprecations after guava 27.0 update in trunk
[ https://issues.apache.org/jira/browse/HADOOP-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HADOOP-16222: --- Resolution: Fixed Status: Resolved (was: Patch Available) > Fix new deprecations after guava 27.0 update in trunk > - > > Key: HADOOP-16222 > URL: https://issues.apache.org/jira/browse/HADOOP-16222 > Project: Hadoop Common > Issue Type: Sub-task >Affects Versions: 3.3.0 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Major > Attachments: HADOOP-16222.001.patch, HADOOP-16222.002.patch, > HADOOP-16222.003.patch, HADOOP-16222.004.patch > > > *Note*: this can be done after the guava update. > There are a bunch of new deprecations after the guava update. We need to fix > those, because these will be removed after the next guava version (after 27). > I created a separate jira for this from HADOOP-16210 because jenkins > pre-commit test job (yetus) will time-out after 5 hours after running this > together. > {noformat} > [WARNING] > /testptch/hadoop/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/SemaphoredDelegatingExecutor.java:[110,20] > [deprecation] immediateFailedCheckedFuture(X) in Futures has been > deprecated > [WARNING] > /testptch/hadoop/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ZKUtil.java:[175,16] > [deprecation] toString(File,Charset) in Files has been deprecated > [WARNING] > /testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/TestTableMapping.java:[44,9] > [deprecation] write(CharSequence,File,Charset) in Files has been deprecated > [WARNING] > /testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/TestTableMapping.java:[67,9] > [deprecation] write(CharSequence,File,Charset) in Files has been deprecated > [WARNING] > /testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/TestTableMapping.java:[131,9] > [deprecation] 
write(CharSequence,File,Charset) in Files has been deprecated > [WARNING] > /testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/TestTableMapping.java:[150,9] > [deprecation] write(CharSequence,File,Charset) in Files has been deprecated > [WARNING] > /testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/TestTableMapping.java:[169,9] > [deprecation] write(CharSequence,File,Charset) in Files has been deprecated > [WARNING] > /testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestZKUtil.java:[134,9] > [deprecation] write(CharSequence,File,Charset) in Files has been deprecated > [WARNING] > /testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestSecurityUtil.java:[437,9] > [deprecation] write(CharSequence,File,Charset) in Files has been deprecated > [WARNING] > /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/TestDFSHAAdminMiniCluster.java:[211,26] > [deprecation] toString(File,Charset) in Files has been deprecated > [WARNING] > /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/TestDFSHAAdminMiniCluster.java:[219,36] > [deprecation] toString(File,Charset) in Files has been deprecated > [WARNING] > /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/fpga/TestFpgaResourceHandler.java:[130,9] > [deprecation] append(CharSequence,File,Charset) in Files has been deprecated > [WARNING] > /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/fpga/TestFpgaResourceHandler.java:[352,9] > [deprecation] append(CharSequence,File,Charset) in Files has been deprecated > [WARNING] > 
/testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java:[1161,18] > [deprecation] propagate(Throwable) in Throwables has been deprecated > [WARNING] > /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/test/java/org/apache/hadoop/yarn/service/ServiceTestUtils.java:[413,18] > [deprecation] propagate(Throwable) in Throwables has been deprecated > {noformat} > Maybe fix these module by module instead of in a single patch?
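As an aside on the Files warnings quoted above: one way to stop calling the deprecated Guava helpers toString(File,Charset), write(CharSequence,File,Charset) and append(CharSequence,File,Charset) is to swap in plain java.nio.file calls, sketched below. This is a stdlib substitute, not necessarily what the attached patches do; Guava itself points to Files.asCharSource and Files.asCharSink as the replacements. The class and method names are hypothetical, and UTF-8 is assumed throughout.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.StandardOpenOption;

/** Hypothetical helpers; a JDK-only stand-in for the deprecated Guava calls. */
public class FileTextSketch {

  /** Replaces the file's contents with the given text (UTF-8 assumed). */
  public static void write(CharSequence text, File file) throws IOException {
    Files.write(file.toPath(), text.toString().getBytes(StandardCharsets.UTF_8));
  }

  /** Appends the text to the file, creating the file if it does not exist. */
  public static void append(CharSequence text, File file) throws IOException {
    Files.write(file.toPath(), text.toString().getBytes(StandardCharsets.UTF_8),
        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
  }

  /** Reads the whole file into a String (UTF-8 assumed). */
  public static String read(File file) throws IOException {
    return new String(Files.readAllBytes(file.toPath()), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) throws IOException {
    File tmp = File.createTempFile("file-text-sketch", ".txt");
    try {
      write("rack1 ", tmp);
      append("host1", tmp);
      System.out.println(read(tmp));
    } finally {
      tmp.delete();
    }
  }
}
```

Everything here is available on Java 8, so it would not raise the minimum JDK; whether to prefer it over the Guava CharSource/CharSink API is a judgment call for each module.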
[jira] [Commented] (HADOOP-16222) Fix new deprecations after guava 27.0 update in trunk
[ https://issues.apache.org/jira/browse/HADOOP-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16825327#comment-16825327 ] Sean Mackrory commented on HADOOP-16222: +1. Note that it's no longer clean - YARN-9495 already committed the findbugs-exclude.xml changes, so only committing the rest of the patch. > Fix new deprecations after guava 27.0 update in trunk > - > > Key: HADOOP-16222 > URL: https://issues.apache.org/jira/browse/HADOOP-16222 > Project: Hadoop Common > Issue Type: Sub-task >Affects Versions: 3.3.0 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Major > Attachments: HADOOP-16222.001.patch, HADOOP-16222.002.patch, > HADOOP-16222.003.patch, HADOOP-16222.004.patch > > > *Note*: this can be done after the guava update. > There are a bunch of new deprecations after the guava update. We need to fix > those, because these will be removed after the next guava version (after 27). > I created a separate jira for this from HADOOP-16210 because jenkins > pre-commit test job (yetus) will time-out after 5 hours after running this > together. 
> {noformat} > [WARNING] > /testptch/hadoop/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/SemaphoredDelegatingExecutor.java:[110,20] > [deprecation] immediateFailedCheckedFuture(X) in Futures has been > deprecated > [WARNING] > /testptch/hadoop/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ZKUtil.java:[175,16] > [deprecation] toString(File,Charset) in Files has been deprecated > [WARNING] > /testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/TestTableMapping.java:[44,9] > [deprecation] write(CharSequence,File,Charset) in Files has been deprecated > [WARNING] > /testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/TestTableMapping.java:[67,9] > [deprecation] write(CharSequence,File,Charset) in Files has been deprecated > [WARNING] > /testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/TestTableMapping.java:[131,9] > [deprecation] write(CharSequence,File,Charset) in Files has been deprecated > [WARNING] > /testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/TestTableMapping.java:[150,9] > [deprecation] write(CharSequence,File,Charset) in Files has been deprecated > [WARNING] > /testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/TestTableMapping.java:[169,9] > [deprecation] write(CharSequence,File,Charset) in Files has been deprecated > [WARNING] > /testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestZKUtil.java:[134,9] > [deprecation] write(CharSequence,File,Charset) in Files has been deprecated > [WARNING] > /testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestSecurityUtil.java:[437,9] > [deprecation] write(CharSequence,File,Charset) in Files has been deprecated > [WARNING] > 
/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/TestDFSHAAdminMiniCluster.java:[211,26] > [deprecation] toString(File,Charset) in Files has been deprecated > [WARNING] > /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/TestDFSHAAdminMiniCluster.java:[219,36] > [deprecation] toString(File,Charset) in Files has been deprecated > [WARNING] > /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/fpga/TestFpgaResourceHandler.java:[130,9] > [deprecation] append(CharSequence,File,Charset) in Files has been deprecated > [WARNING] > /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/fpga/TestFpgaResourceHandler.java:[352,9] > [deprecation] append(CharSequence,File,Charset) in Files has been deprecated > [WARNING] > /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java:[1161,18] > [deprecation] propagate(Throwable) in Throwables has been deprecated > [WARNING] > /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/test/java/org/apache/hadoop/yarn/service/ServiceTestUtils.java:[413,18] > [deprecation] propagate(Throwable) in Throwables has been deprecated > {noformat} > Maybe fix these module by module instead of in a single patch? -- This message was
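For readers wondering what fixing these warnings involves: Guava's Javadoc points callers of the deprecated helpers to Files.asCharSink(file, charset).write(text), Files.asCharSink(file, charset, FileWriteMode.APPEND).write(text), Files.asCharSource(file, charset).read(), and, for Throwables.propagate, Throwables.throwIfUnchecked(t) followed by throw new RuntimeException(t). The sketch below is not the committed patch. It shows JDK-only equivalents of the same four operations, which is one possible way to sidestep the deprecations; the class name FileTextUtil and its method names are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper: JDK-only stand-ins for the deprecated Guava calls. */
final class FileTextUtil {
  private FileTextUtil() {}

  /** Stand-in for Guava Files.write(CharSequence, File, Charset): create or truncate. */
  static void write(CharSequence text, File to, Charset charset) {
    try {
      Files.write(to.toPath(), text.toString().getBytes(charset));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Stand-in for Guava Files.append(CharSequence, File, Charset). */
  static void append(CharSequence text, File to, Charset charset) {
    try {
      Files.write(to.toPath(), text.toString().getBytes(charset),
          StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Stand-in for Guava Files.toString(File, Charset). */
  static String read(File from, Charset charset) {
    try {
      return new String(Files.readAllBytes(from.toPath()), charset);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /**
   * Stand-in for Throwables.propagate(Throwable): rethrows unchecked
   * throwables as-is and wraps checked ones in a RuntimeException.
   * Never returns normally; the return type only allows "throw rethrow(t);".
   */
  static RuntimeException rethrow(Throwable t) {
    if (t instanceof RuntimeException) {
      throw (RuntimeException) t;
    }
    if (t instanceof Error) {
      throw (Error) t;
    }
    throw new RuntimeException(t);
  }
}
```

Unlike the Guava originals, these helpers surface I/O failures as UncheckedIOException, so call sites in tests would not need a throws clause; that is a design choice of this sketch, not something the issue asks for.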
[jira] [Comment Edited] (HADOOP-16085) S3Guard: use object version or etags to protect against inconsistent read after replace/overwrite
[ https://issues.apache.org/jira/browse/HADOOP-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820236#comment-16820236 ] Sean Mackrory edited comment on HADOOP-16085 at 4/17/19 3:58 PM: - Left some feedback in-line on the pull-request (and for HADOOP-16221 too). Some more general thoughts: * Have been discussing with [~ste...@apache.org] whether or not the FileStatus -> S3AFileStatus and schema changes should be separated out from the enforcement. I think the best argument for that is that it's a smaller change to get older clients to notify newer clients of changes whereas only the newer ones will enforce. The other factor mentioned is the desire for keeping S3Guard relatively storage-agnostic, but I honestly just don't see how we can do that and still have a robust solution. S3 is popular enough to warrant a custom solution that really does fix all the holes. Personally, I think we should just keep this change together. * I don't suppose there's an interface we can rely on to provide getETag() and getVersionId(), is there? This is where Go's duck-typing would be nice so we could eliminate 2 (or more) of the args to every constructor call. Not a big deal. I have a small to-do list of other little things to look into but as you'll see on the PR, the overwhelming majority of my feedback is pretty mechanical. I think overall this is looking like a good solid patch. I am also getting some unit test failures running this in CDH. Will do some more test runs on the upstream base and with various parameters to see if I can narrow it down. I assume you've been running the tests with no problems? was (Author: mackrorysd): Left some feedback in-line on the pull-request (and for HADOOP-16221 too). Some more general thoughts: * Have been discussing with [~ste...@apache.org] whether or not the FileStatus -> S3AFileStatus and schema changes should be separated out from the enforcement. 
I think the best argument for that is that it's a smaller change to get older clients to notify newer clients of changes whereas only the newer ones will enforce. The other factor mentioned is the desire for keeping S3Guard relatively storage-agnostic, but I honestly just don't see how we can do that and still have a robust solution. S3 is popular enough to warrant a custom solution that really does fix all the holes. Personally, I think we should just keep this change together. * I don't suppose there's an interface we can rely on to provide getETag() and getVersionId(), is there? This is where Go's duck-typing would be nice so we could eliminate 2 (or more) of the args to every constructor call. Not a big deal. I have a small to-do list of other little things to look into but as you'll see on the PR, the overwhelming majority of my feedback is pretty mechanical. I think overall this is looking like a good solid patch. > S3Guard: use object version or etags to protect against inconsistent read > after replace/overwrite > - > > Key: HADOOP-16085 > URL: https://issues.apache.org/jira/browse/HADOOP-16085 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.2.0 >Reporter: Ben Roling >Assignee: Ben Roling >Priority: Major > Attachments: HADOOP-16085-003.patch, HADOOP-16085_002.patch, > HADOOP-16085_3.2.0_001.patch > > > Currently S3Guard doesn't track S3 object versions. If a file is written in > S3A with S3Guard and then subsequently overwritten, there is no protection > against the next reader seeing the old version of the file instead of the new > one. > It seems like the S3Guard metadata could track the S3 object version. When a > file is created or updated, the object version could be written to the > S3Guard metadata. When a file is read, the read out of S3 could be performed > by object version, ensuring the correct version is retrieved. 
> I don't have a lot of direct experience with this yet, but this is my > impression from looking through the code. My organization is looking to > shift some datasets stored in HDFS over to S3 and is concerned about this > potential issue as there are some cases in our codebase that would do an > overwrite. > I imagine this idea may have been considered before but I couldn't quite > track down any JIRAs discussing it. If there is one, feel free to close this > with a reference to it. > Am I understanding things correctly? Is this idea feasible? Any feedback > that could be provided would be appreciated. We may consider crafting a > patch. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail:
[jira] [Commented] (HADOOP-16085) S3Guard: use object version or etags to protect against inconsistent read after replace/overwrite
[ https://issues.apache.org/jira/browse/HADOOP-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820236#comment-16820236 ] Sean Mackrory commented on HADOOP-16085: Left some feedback in-line on the pull-request (and for HADOOP-16221 too). Some more general thoughts: * Have been discussing with [~ste...@apache.org] whether or not the FileStatus -> S3AFileStatus and schema changes should be separated out from the enforcement. I think the best argument for that is that it's a smaller change to get older clients to notify newer clients of changes whereas only the newer ones will enforce. The other factor mentioned is the desire for keeping S3Guard relatively storage-agnostic, but I honestly just don't see how we can do that and still have a robust solution. S3 is popular enough to warrant a custom solution that really does fix all the holes. Personally, I think we should just keep this change together. * I don't suppose there's an interface we can rely on to provide getETag() and getVersionId(), is there? This is where Go's duck-typing would be nice so we could eliminate 2 (or more) of the args to every constructor call. Not a big deal. I have a small to-do list of other little things to look into but as you'll see on the PR, the overwhelming majority of my feedback is pretty mechanical. I think overall this is looking like a good solid patch. > S3Guard: use object version or etags to protect against inconsistent read > after replace/overwrite > - > > Key: HADOOP-16085 > URL: https://issues.apache.org/jira/browse/HADOOP-16085 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.2.0 >Reporter: Ben Roling >Assignee: Ben Roling >Priority: Major > Attachments: HADOOP-16085-003.patch, HADOOP-16085_002.patch, > HADOOP-16085_3.2.0_001.patch > > > Currently S3Guard doesn't track S3 object versions. 
If a file is written in > S3A with S3Guard and then subsequently overwritten, there is no protection > against the next reader seeing the old version of the file instead of the new > one. > It seems like the S3Guard metadata could track the S3 object version. When a > file is created or updated, the object version could be written to the > S3Guard metadata. When a file is read, the read out of S3 could be performed > by object version, ensuring the correct version is retrieved. > I don't have a lot of direct experience with this yet, but this is my > impression from looking through the code. My organization is looking to > shift some datasets stored in HDFS over to S3 and is concerned about this > potential issue as there are some cases in our codebase that would do an > overwrite. > I imagine this idea may have been considered before but I couldn't quite > track down any JIRAs discussing it. If there is one, feel free to close this > with a reference to it. > Am I understanding things correctly? Is this idea feasible? Any feedback > that could be provided would be appreciated. We may consider crafting a > patch. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-16246) Unbounded thread pool maximum pool size in S3AFileSystem TransferManager
[ https://issues.apache.org/jira/browse/HADOOP-16246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16815024#comment-16815024 ] Sean Mackrory commented on HADOOP-16246: Yeah it was actually almost the reverse, IIRC. A couple of layers of monitoring tasks would get pushed first, and then the tasks that they were monitoring would get pushed after that. It did seem terrible, so I'm happy to hear it's been rethought. > Unbounded thread pool maximum pool size in S3AFileSystem TransferManager > > > Key: HADOOP-16246 > URL: https://issues.apache.org/jira/browse/HADOOP-16246 > Project: Hadoop Common > Issue Type: Bug > Components: fs/s3 >Affects Versions: 2.8.0 >Reporter: Greg Kinman >Priority: Major > Original Estimate: 24h > Remaining Estimate: 24h > > I have something running in production that is running up on {{ulimit}} > trying to create {{s3a-transfer-unbounded}} threads. > Relevant background: https://issues.apache.org/jira/browse/HADOOP-13826. > Before that change, the thread pool used in the {{TransferManager}} had both > a reasonably small maximum pool size and work queue capacity. > After that change, the thread pool has both a maximum pool size and work > queue capacity of {{Integer.MAX_VALUE}}. > This seems like a pretty bad idea, because now we have, practically speaking, > no bound on the number of threads that might get created. I understand the > change was made in response to experiencing deadlocks and at the warning of > the documentation, which I will repeat here: > {quote}It is not recommended to use a single threaded executor or a thread > pool with a bounded work queue as control tasks may submit subtasks that > can't complete until all sub tasks complete. Using an incorrectly configured > thread pool may cause a deadlock (I.E. the work queue is filled with control > tasks that can't finish until subtasks complete but subtasks can't execute > because the queue is filled). 
> {quote} > The documentation only warns against having a bounded _work queue_, not > against having a bounded _maximum pool size_. And this seems fine, as having > an unbounded work queue sounds ok. Having an unbounded maximum pool size, > however, does not. > I will also note that this constructor is now deprecated and suggests using > {{TransferManagerBuilder}} instead, which by default creates a fixed thread > pool of size 10: > [https://github.com/aws/aws-sdk-java/blob/1.11.534/aws-java-sdk-s3/src/main/java/com/amazonaws/services/s3/transfer/internal/TransferManagerUtils.java#L59]. > I suggest we make a small change here and keep the maximum pool size at > {{maxThreads}}, which defaults to 10, while keeping the work queue as is > (unbounded). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-16246) Unbounded thread pool maximum pool size in S3AFileSystem TransferManager
[ https://issues.apache.org/jira/browse/HADOOP-16246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16814954#comment-16814954 ] Sean Mackrory commented on HADOOP-16246: {quote}The documentation only warns against having a bounded work queue, not against having a bounded maximum pool size. {quote} The deadlocks we experienced were actually because the pool itself was completely saturated with tasks that depended on tasks that were in the queue. Unbounded queues wouldn't fix that. I haven't looked at TransferManagerBuilder, though - will take a look. > Unbounded thread pool maximum pool size in S3AFileSystem TransferManager > > > Key: HADOOP-16246 > URL: https://issues.apache.org/jira/browse/HADOOP-16246 > Project: Hadoop Common > Issue Type: Bug > Components: fs/s3 >Affects Versions: 2.8.0 >Reporter: Greg Kinman >Priority: Major > Original Estimate: 24h > Remaining Estimate: 24h > > I have something running in production that is running up on {{ulimit}} > trying to create {{s3a-transfer-unbounded}} threads. > Relevant background: https://issues.apache.org/jira/browse/HADOOP-13826. > Before that change, the thread pool used in the {{TransferManager}} had both > a reasonably small maximum pool size and work queue capacity. > After that change, the thread pool has both a maximum pool size and work > queue capacity of {{Integer.MAX_VALUE}}. > This seems like a pretty bad idea, because now we have, practically speaking, > no bound on the number of threads that might get created. I understand the > change was made in response to experiencing deadlocks and at the warning of > the documentation, which I will repeat here: > {quote}It is not recommended to use a single threaded executor or a thread > pool with a bounded work queue as control tasks may submit subtasks that > can't complete until all sub tasks complete. Using an incorrectly configured > thread pool may cause a deadlock (I.E. 
the work queue is filled with control > tasks that can't finish until subtasks complete but subtasks can't execute > because the queue is filled). > {quote} > The documentation only warns against having a bounded _work queue_, not > against having a bounded _maximum pool size_. And this seems fine, as having > an unbounded work queue sounds ok. Having an unbounded maximum pool size, > however, does not. > I will also note that this constructor is now deprecated and suggests using > {{TransferManagerBuilder}} instead, which by default creates a fixed thread > pool of size 10: > [https://github.com/aws/aws-sdk-java/blob/1.11.534/aws-java-sdk-s3/src/main/java/com/amazonaws/services/s3/transfer/internal/TransferManagerUtils.java#L59]. > I suggest we make a small change here and keep the maximum pool size at > {{maxThreads}}, which defaults to 10, while keeping the work queue as is > (unbounded). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
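The proposal in the issue description (cap the pool at maxThreads, leave the work queue unbounded) can be sketched with plain java.util.concurrent. This is a minimal illustration, not the S3A code: the class name BoundedTransferPool and the keepAliveSeconds parameter are hypothetical. One detail worth spelling out is that ThreadPoolExecutor only grows past its core size when the queue rejects a task, so with an unbounded queue the core size has to equal the intended maximum.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical factory for a thread pool with a bounded size and an unbounded queue. */
final class BoundedTransferPool {
  private BoundedTransferPool() {}

  /**
   * Builds a pool that never runs more than maxThreads threads.
   * keepAliveSeconds must be greater than zero, because
   * allowCoreThreadTimeOut(true) rejects a zero keep-alive time.
   */
  static ThreadPoolExecutor create(int maxThreads, long keepAliveSeconds) {
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        maxThreads,                 // core size: threads are created up to this count
        maxThreads,                 // maximum size: same value, so the bound is real
        keepAliveSeconds, TimeUnit.SECONDS,
        new LinkedBlockingQueue<Runnable>()); // unbounded work queue
    // Let idle threads exit so an idle filesystem instance holds no threads.
    pool.allowCoreThreadTimeOut(true);
    return pool;
  }
}
```

Note that a bound like this does nothing for the failure mode described in the reply on this issue: if every pool thread is occupied by a task that waits on subtasks still sitting in the queue, the pool deadlocks however large the queue is.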
[jira] [Comment Edited] (HADOOP-16210) Update guava to 27.0-jre in hadoop-project trunk
[ https://issues.apache.org/jira/browse/HADOOP-16210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16809130#comment-16809130 ] Sean Mackrory edited comment on HADOOP-16210 at 4/3/19 7:21 PM: {quote}-1 javac 19m 7s root generated 15 new + 1481 unchanged - 1 fixed = 1496 total (was 1482) {quote} I didn't notice this one before I pushed. We should eliminate those if we can while the issue is still fresh. Care to follow-up on that, [~gabor.bota]? (edit: actually Gabor already filed HADOOP-16222 for that, and it makes the issue of Yetus timing out before running all the tests worse) I should also note that there was a [DISCUSS] thread on the mailing list about this, but there's been no opposition, only support. I committed to trunk only - since you're still testing downstream, we should keep this open to backport to those branches as they seem to be confirmed ready. was (Author: mackrorysd): {quote}-1 javac 19m 7s root generated 15 new + 1481 unchanged - 1 fixed = 1496 total (was 1482) {quote} I didn't notice this one before I pushed. We should eliminate those if we can while the issue is still fresh. Care to follow-up on that, [~gabor.bota]? I should also note that there was a [DISCUSS] thread on the mailing list about this, but there's been no opposition, only support. I committed to trunk only - since you're still testing downstream, we should keep this open to backport to those branches as they seem to be confirmed ready. > Update guava to 27.0-jre in hadoop-project trunk > > > Key: HADOOP-16210 > URL: https://issues.apache.org/jira/browse/HADOOP-16210 > Project: Hadoop Common > Issue Type: Sub-task >Affects Versions: 3.3.0 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Critical > Attachments: HADOOP-16210.001.patch, > HADOOP-16210.002.findbugsfix.wip.patch, HADOOP-16210.002.patch, > HADOOP-16210.003.patch > > > com.google.guava:guava should be upgraded to 27.0-jre due to new CVE's found > CVE-2018-10237. 
> This is a sub-task for trunk from HADOOP-15960 to track issues with that > particular branch. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-16210) Update guava to 27.0-jre in hadoop-project trunk
[ https://issues.apache.org/jira/browse/HADOOP-16210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16809130#comment-16809130 ] Sean Mackrory commented on HADOOP-16210: {quote}-1 javac 19m 7s root generated 15 new + 1481 unchanged - 1 fixed = 1496 total (was 1482) {quote} I didn't notice this one before I pushed. We should eliminate those if we can while the issue is still fresh. Care to follow-up on that, [~gabor.bota]? I should also note that there was a [DISCUSS] thread on the mailing list about this, but there's been no opposition, only support. I committed to trunk only - since you're still testing downstream, we should keep this open to backport to those branches as they seem to be confirmed ready. > Update guava to 27.0-jre in hadoop-project trunk > > > Key: HADOOP-16210 > URL: https://issues.apache.org/jira/browse/HADOOP-16210 > Project: Hadoop Common > Issue Type: Sub-task >Affects Versions: 3.3.0 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Critical > Attachments: HADOOP-16210.001.patch, > HADOOP-16210.002.findbugsfix.wip.patch, HADOOP-16210.002.patch, > HADOOP-16210.003.patch > > > com.google.guava:guava should be upgraded to 27.0-jre due to new CVE's found > CVE-2018-10237. > This is a sub-task for trunk from HADOOP-15960 to track issues with that > particular branch. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-16210) Update guava to 27.0-jre in hadoop-project trunk
[ https://issues.apache.org/jira/browse/HADOOP-16210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16809125#comment-16809125 ] Sean Mackrory commented on HADOOP-16210: The failure is the libprotoc version mismatch I occasionally see pop up, and that's always flaky when I try to look at it... +1 and pushed, as commented on the PR. I squashed the 2 commits to push, but I don't immediately see how you're supposed to close a PR that was merged out-of-band. Maybe it's the submitter that does that? > Update guava to 27.0-jre in hadoop-project trunk > > > Key: HADOOP-16210 > URL: https://issues.apache.org/jira/browse/HADOOP-16210 > Project: Hadoop Common > Issue Type: Sub-task >Affects Versions: 3.3.0 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Critical > Attachments: HADOOP-16210.001.patch, > HADOOP-16210.002.findbugsfix.wip.patch, HADOOP-16210.002.patch, > HADOOP-16210.003.patch > > > com.google.guava:guava should be upgraded to 27.0-jre due to new CVE's found > CVE-2018-10237. > This is a sub-task for trunk from HADOOP-15960 to track issues with that > particular branch. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-16210) Update guava to 27.0-jre in hadoop-project trunk
[ https://issues.apache.org/jira/browse/HADOOP-16210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807140#comment-16807140 ] Sean Mackrory commented on HADOOP-16210: Beyond the findbugs issues, I'm supportive of this change. I also ran the Azure tests. There's more work to be done coordinating with downstream projects, but I think that can happen after it's submitted to trunk. We've gotta commit at some point to really see what else breaks that we can't foresee. Pretty dangerous to be as far behind on dependencies as we are with this one - even if we're not affected by specific vulnerabilities, IMO. > Update guava to 27.0-jre in hadoop-project trunk > > > Key: HADOOP-16210 > URL: https://issues.apache.org/jira/browse/HADOOP-16210 > Project: Hadoop Common > Issue Type: Sub-task >Affects Versions: 3.3.0 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Critical > Attachments: HADOOP-16210.001.patch, HADOOP-16210.002.patch > > > com.google.guava:guava should be upgraded to 27.0-jre due to new CVE's found > CVE-2018-10237. > This is a sub-task for trunk from HADOOP-15960 to track issues with that > particular branch. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15625) S3A input stream to use etags to detect changed source files
[ https://issues.apache.org/jira/browse/HADOOP-15625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775467#comment-16775467 ] Sean Mackrory commented on HADOOP-15625: Just as a heads up for anyone else following along, the .003. patch looks like it's completely unrelated to the rest of this. > S3A input stream to use etags to detect changed source files > > > Key: HADOOP-15625 > URL: https://issues.apache.org/jira/browse/HADOOP-15625 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.2.0 >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula >Priority: Major > Attachments: HADOOP-15625-001.patch, HADOOP-15625-002.patch, > HADOOP-15625-003.patch > > > S3A input stream doesn't handle changing source files any better than the > other cloud store connectors. Specifically: it doesn't notice it has > changed, caches the length from startup, and whenever a seek triggers a new > GET, you may get one of: old data, new data, and even perhaps go from new > data to old data due to eventual consistency. > We can't do anything to stop this, but we could detect changes by > # caching the etag of the first HEAD/GET (we don't get that HEAD on open with > S3Guard, BTW) > # on future GET requests, verify the etag of the response > # raise an IOE if the remote file changed during the read. > It's a more dramatic failure, but it stops changes silently corrupting things. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
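The three detection steps listed in the issue description (cache the first etag, verify it on later GETs, raise an IOException on mismatch) could look roughly like the following. This is only a sketch under stated assumptions: ETagChangeDetector and its verify method are hypothetical names, not part of S3AInputStream, and the sketch assumes the caller passes in the etag string taken from each HEAD/GET response.

```java
import java.io.IOException;

/** Hypothetical per-stream tracker for the etag of the object being read. */
final class ETagChangeDetector {
  /** Etag seen on the first HEAD/GET; null until one has been recorded. */
  private String expectedETag;

  /**
   * Records the etag of the first response, then rejects any later
   * response whose etag differs from the recorded one.
   */
  void verify(String path, String responseETag) throws IOException {
    if (responseETag == null) {
      // Assumption: a response without an etag cannot be checked, so it is accepted.
      return;
    }
    if (expectedETag == null) {
      expectedETag = responseETag; // step 1: cache the first etag
      return;
    }
    if (!expectedETag.equals(responseETag)) { // step 2: verify on later GETs
      // step 3: fail loudly rather than mix old and new data
      throw new IOException("Remote file changed during read: " + path
          + " (expected etag " + expectedETag + ", got " + responseETag + ")");
    }
  }
}
```

A stream would call verify once per GET it issues, for example after each reopen triggered by a seek, so a change to the object surfaces as a failed read instead of silently mixed data.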
[jira] [Commented] (HADOOP-15999) [s3a] Better support for out-of-band operations
[ https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16774397#comment-16774397 ] Sean Mackrory commented on HADOOP-15999: FWIW I'm not seeing that test failure, but can't dig into it much myself right now. > [s3a] Better support for out-of-band operations > --- > > Key: HADOOP-15999 > URL: https://issues.apache.org/jira/browse/HADOOP-15999 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.1.0 >Reporter: Sean Mackrory >Assignee: Gabor Bota >Priority: Major > Attachments: HADOOP-15999.001.patch, HADOOP-15999.002.patch, > HADOOP-15999.003.patch, HADOOP-15999.004.patch, HADOOP-15999.005.patch, > HADOOP-15999.006.patch, out-of-band-operations.patch > > > S3Guard was initially done on the premise that a new MetadataStore would be > the source of truth, and that it wouldn't provide guarantees if updates were > done without using S3Guard. > I've been seeing increased demand for better support for scenarios where > operations are done on the data that can't reasonably be done with S3Guard > involved. For example: > * A file is deleted using S3Guard, and replaced by some other tool. S3Guard > can't tell the difference between the new file and delete / list > inconsistency and continues to treat the file as deleted. > * An S3Guard-ed file is overwritten by a longer file by some other tool. When > reading the file, only the length of the original file is read. > We could possibly have smarter behavior here by querying both S3 and the > MetadataStore (even in cases where we may currently only query the > MetadataStore in getFileStatus) and use whichever one has the higher modified > time. > This kills the performance boost we currently get in some workloads with the > short-circuited getFileStatus, but we could keep it with authoritative mode > which should give a larger performance boost. 
At least we'd get more > correctness without authoritative mode and a clear declaration of when we can > make the assumptions required to short-circuit the process. If we can't > consider S3Guard the source of truth, we need to defer to S3 more. > We'd need to be extra sure of any locality / time zone issues if we start > relying on mod_time more directly, but currently we're tracking the > modification time as returned by S3 anyway. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
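The idea floated in the description (query both S3 and the MetadataStore, then use whichever entry has the higher modification time) can be sketched as below. This is an illustration only, not the patch attached to the issue: NewerStatusChooser and its nested Status type are hypothetical stand-ins for the real S3A status classes, and the null handling and tie-breaking rules are assumptions of this sketch.

```java
/** Hypothetical chooser between a MetadataStore entry and a fresh S3 lookup. */
final class NewerStatusChooser {
  private NewerStatusChooser() {}

  /** Minimal stand-in for a file status: where it came from, length, mod time. */
  static final class Status {
    final String source;
    final long length;
    final long modTime; // milliseconds since the epoch, as reported by the source

    Status(String source, long length, long modTime) {
      this.source = source;
      this.length = length;
      this.modTime = modTime;
    }
  }

  /**
   * Returns the status with the higher modification time.
   * Assumptions: a missing entry on one side defers to the other side,
   * and on a tie the MetadataStore entry wins.
   */
  static Status choose(Status fromMetadataStore, Status fromS3) {
    if (fromMetadataStore == null) {
      return fromS3;
    }
    if (fromS3 == null) {
      return fromMetadataStore;
    }
    return fromS3.modTime > fromMetadataStore.modTime ? fromS3 : fromMetadataStore;
  }
}
```

For the overwrite scenario in the description, a longer file written out-of-band would carry a later modification time in S3, so the S3 status (and its length) would be the one used for the read.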
[jira] [Commented] (HADOOP-16101) Use lighter-weight alternatives to innerGetFileStatus where possible
[ https://issues.apache.org/jira/browse/HADOOP-16101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16763826#comment-16763826 ] Sean Mackrory commented on HADOOP-16101: The other detail here is that we don't think we need to do a HEAD at all before doing the actual GET work when reading. > Use lighter-weight alternatives to innerGetFileStatus where possible > > > Key: HADOOP-16101 > URL: https://issues.apache.org/jira/browse/HADOOP-16101 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Sean Mackrory >Priority: Major > > Discussion in HADOOP-15999 highlighted the heaviness of a full > innerGetFileStatus call, where many usages of it may need a lighter weight > fileExists, etc. Let's investigate usage of innerGetFileStatus and slim it > down where possible. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15999) [s3a] Better support for out-of-band operations
[ https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16763800#comment-16763800 ] Sean Mackrory commented on HADOOP-15999: +1 on the patch. I filed HADOOP-16101 to make sure we don't lose Steve's suggestion of lighter weight alternatives to innerGetFileStatus where possible. {quote}any time we know we don't care 100% on being consistent{quote} I was talking to someone recently about slimming down some of the checks done on create, etc. (or at least make them optional). Otherwise the idea of knowing we might be less consistent than S3 itself scares me more than a little. Feels like whenever we've done that someone invariably hit the use case we thought was fine. > [s3a] Better support for out-of-band operations > --- > > Key: HADOOP-15999 > URL: https://issues.apache.org/jira/browse/HADOOP-15999 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.1.0 >Reporter: Sean Mackrory >Assignee: Gabor Bota >Priority: Major > Attachments: HADOOP-15999.001.patch, HADOOP-15999.002.patch, > out-of-band-operations.patch > > > S3Guard was initially done on the premise that a new MetadataStore would be > the source of truth, and that it wouldn't provide guarantees if updates were > done without using S3Guard. > I've been seeing increased demand for better support for scenarios where > operations are done on the data that can't reasonably be done with S3Guard > involved. For example: > * A file is deleted using S3Guard, and replaced by some other tool. S3Guard > can't tell the difference between the new file and delete / list > inconsistency and continues to treat the file as deleted. > * An S3Guard-ed file is overwritten by a longer file by some other tool. When > reading the file, only the length of the original file is read. 
> We could possibly have smarter behavior here by querying both S3 and the > MetadataStore (even in cases where we may currently only query the > MetadataStore in getFileStatus) and use whichever one has the higher modified > time. > This kills the performance boost we currently get in some workloads with the > short-circuited getFileStatus, but we could keep it with authoritative mode > which should give a larger performance boost. At least we'd get more > correctness without authoritative mode and a clear declaration of when we can > make the assumptions required to short-circuit the process. If we can't > consider S3Guard the source of truth, we need to defer to S3 more. > We'd need to be extra sure of any locality / time zone issues if we start > relying on mod_time more directly, but currently we're tracking the > modification time as returned by S3 anyway. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
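The "use whichever one has the higher modified time" idea in the description above could be sketched roughly as follows. This is only an illustration under stated assumptions: StatusSketch and OutOfBandResolver are hypothetical names, not classes from the S3A code base, and the sketch ignores tombstones and delete handling entirely. The S3 lookup is passed as a supplier so that authoritative mode can skip it, which is the short-circuit the description wants to keep.

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical stand-in for a file status; not the real S3A status class. */
final class StatusSketch {
    final String source;  // "metadatastore" or "s3", for illustration only
    final long length;    // file length in bytes
    final long modTime;   // modification time in epoch milliseconds

    StatusSketch(String source, long length, long modTime) {
        this.source = source;
        this.length = length;
        this.modTime = modTime;
    }
}

/** Hypothetical helper that decides which status entry to trust. */
final class OutOfBandResolver {
    static Optional<StatusSketch> resolve(Optional<StatusSketch> fromStore,
                                          Supplier<Optional<StatusSketch>> s3Lookup,
                                          boolean authoritative) {
        // Authoritative mode: trust the MetadataStore and skip the S3 round trip.
        if (authoritative && fromStore.isPresent()) {
            return fromStore;
        }
        Optional<StatusSketch> fromS3 = s3Lookup.get();
        if (!fromStore.isPresent()) {
            return fromS3;
        }
        if (!fromS3.isPresent()) {
            // Assumption: deletes are out of scope here, so keep the store entry.
            return fromStore;
        }
        // Both exist: prefer the more recently modified entry; ties go to the store.
        return fromS3.get().modTime > fromStore.get().modTime ? fromS3 : fromStore;
    }
}
```

With a store entry at modification time 1000 and an S3 entry at 2000, the non-authoritative path returns the S3 entry, which also carries the longer length from the overwrite scenario described above.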
[jira] [Created] (HADOOP-16101) Use lighter-weight alternatives to innerGetFileStatus where possible
Sean Mackrory created HADOOP-16101: -- Summary: Use lighter-weight alternatives to innerGetFileStatus where possible Key: HADOOP-16101 URL: https://issues.apache.org/jira/browse/HADOOP-16101 Project: Hadoop Common Issue Type: Sub-task Reporter: Sean Mackrory Discussion in HADOOP-15999 highlighted the heaviness of a full innerGetFileStatus call, where many usages of it may need a lighter weight fileExists, etc. Let's investigate usage of innerGetFileStatus and slim it down where possible.
[jira] [Commented] (HADOOP-15582) Document ABFS
[ https://issues.apache.org/jira/browse/HADOOP-15582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16761081#comment-16761081 ] Sean Mackrory commented on HADOOP-15582: I can probably take care of this next week. > Document ABFS > - > > Key: HADOOP-15582 > URL: https://issues.apache.org/jira/browse/HADOOP-15582 > Project: Hadoop Common > Issue Type: Sub-task > Components: documentation, fs/azure >Affects Versions: 3.2.0 >Reporter: Steve Loughran >Assignee: Thomas Marquardt >Priority: Major > > Add documentation for abfs under > {{hadoop-tools/hadoop-azure/src/site/markdown}} > Possible topics include > * intro to scheme > * why abfs (link to MSDN, etc) > * config options > * switching from wasb/interop > * troubleshooting > testing.md should add a section on testing this stuff too.
[jira] [Commented] (HADOOP-16085) S3Guard: use object version to protect against inconsistent read after replace/overwrite
[ https://issues.apache.org/jira/browse/HADOOP-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16758726#comment-16758726 ] Sean Mackrory commented on HADOOP-16085: {quote}object version can be up to 1024 characters{quote} I'm less concerned about the space taken up in the metadata store - the problems I'm trying to remember were due to the amount of space in the S3 bucket itself. It was with repeated test runs, so there were similar filenames used many, many times (which is not that realistic, but the more you care about read-after-update consistency, the more this would impact you), so many versions had been kept. > S3Guard: use object version to protect against inconsistent read after > replace/overwrite > > > Key: HADOOP-16085 > URL: https://issues.apache.org/jira/browse/HADOOP-16085 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.2.0 >Reporter: Ben Roling >Priority: Major > Attachments: HADOOP-16085_3.2.0_001.patch > > > Currently S3Guard doesn't track S3 object versions. If a file is written in > S3A with S3Guard and then subsequently overwritten, there is no protection > against the next reader seeing the old version of the file instead of the new > one. > It seems like the S3Guard metadata could track the S3 object version. When a > file is created or updated, the object version could be written to the > S3Guard metadata. When a file is read, the read out of S3 could be performed > by object version, ensuring the correct version is retrieved. > I don't have a lot of direct experience with this yet, but this is my > impression from looking through the code. My organization is looking to > shift some datasets stored in HDFS over to S3 and is concerned about this > potential issue as there are some cases in our codebase that would do an > overwrite. > I imagine this idea may have been considered before but I couldn't quite > track down any JIRAs discussing it. 
If there is one, feel free to close this > with a reference to it. > Am I understanding things correctly? Is this idea feasible? Any feedback > that could be provided would be appreciated. We may consider crafting a > patch.
[jira] [Commented] (HADOOP-16085) S3Guard: use object version to protect against inconsistent read after replace/overwrite
[ https://issues.apache.org/jira/browse/HADOOP-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16758672#comment-16758672 ] Sean Mackrory commented on HADOOP-16085: Thanks for submitting a patch [~ben.roling]. Haven't had a chance to do a full review yet, but one of [~fabbri]'s comments was also high on my list of things to watch out for: {quote}Backward / forward compatible with existing S3Guarded buckets and Dynamo tables.{quote} Specifically, we need to gracefully deal with any row missing an object version. The other direction is easy - if this simply adds a new field, old code will ignore it and we'll continue to get the current behavior. My other concern is that this requires enabling object versioning. I know [~fabbri] has done some testing with that and I think eventually hit issues. Was it just a matter of the space all the versions were taking up, or was it actually a performance problem once there was enough overhead? > S3Guard: use object version to protect against inconsistent read after > replace/overwrite > > > Key: HADOOP-16085 > URL: https://issues.apache.org/jira/browse/HADOOP-16085 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.2.0 >Reporter: Ben Roling >Priority: Major > Attachments: HADOOP-16085_3.2.0_001.patch > > > Currently S3Guard doesn't track S3 object versions. If a file is written in > S3A with S3Guard and then subsequently overwritten, there is no protection > against the next reader seeing the old version of the file instead of the new > one. > It seems like the S3Guard metadata could track the S3 object version. When a > file is created or updated, the object version could be written to the > S3Guard metadata. When a file is read, the read out of S3 could be performed > by object version, ensuring the correct version is retrieved. > I don't have a lot of direct experience with this yet, but this is my > impression from looking through the code. 
My organization is looking to > shift some datasets stored in HDFS over to S3 and is concerned about this > potential issue as there are some cases in our codebase that would do an > overwrite. > I imagine this idea may have been considered before but I couldn't quite > track down any JIRAs discussing it. If there is one, feel free to close this > with a reference to it. > Am I understanding things correctly? Is this idea feasible? Any feedback > that could be provided would be appreciated. We may consider crafting a > patch. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
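The compatibility concern in the comment above (rows written before this change have no object version) might be handled along these lines. This is a minimal sketch under assumptions: VersionedReadSketch and its methods are hypothetical and only model the decision, they do not talk to S3 or DynamoDB. The one external fact relied on is that an S3 GET accepts a versionId request parameter.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of metadata rows that may or may not carry a version ID. */
final class VersionedReadSketch {
    // path -> version ID; a null value models a row written by older code
    private final Map<String, String> versionByPath = new HashMap<>();

    /** Records the version ID returned by the store when a file is written. */
    void recordWrite(String path, String versionId) {
        versionByPath.put(path, versionId);
    }

    /** Records a row that predates version tracking. */
    void recordLegacyRow(String path) {
        versionByPath.put(path, null);
    }

    /** Describes the GET that a reader would issue for the given path. */
    String planRead(String path) {
        String versionId = versionByPath.get(path);
        if (versionId == null || versionId.isEmpty()) {
            // No version recorded: fall back to today's behavior, an unversioned GET.
            return "GET " + path;
        }
        // Version recorded: pin the read to exactly that object version.
        return "GET " + path + "?versionId=" + versionId;
    }
}
```

Old rows and unknown paths keep the current behavior, so only files written after the change get pinned reads.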
[jira] [Commented] (HADOOP-16041) UserAgent string for ABFS
[ https://issues.apache.org/jira/browse/HADOOP-16041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755525#comment-16755525 ] Sean Mackrory commented on HADOOP-16041: Committed. Confirmed, for everyone's reference, that the User-Agent now looks like this for me: {code}Azure Blob FS/3.3.0-SNAPSHOT (JavaJRE 1.8.0_191; Linux 4.15.0-43-generic){code} All tests pass, except for this one that's always being weird (although I thought it was just WASB compat tests that were being weird and this is different): {code}ITestGetNameSpaceEnabled.testNonXNSAccount:57->Assert.assertFalse:64->Assert.assertTrue:41->Assert.fail:88 Expecting getIsNamespaceEnabled() return false{code} > UserAgent string for ABFS > - > > Key: HADOOP-16041 > URL: https://issues.apache.org/jira/browse/HADOOP-16041 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/azure >Affects Versions: 3.2.0 >Reporter: Shweta >Assignee: Shweta >Priority: Major > Fix For: 3.3.0 > > Attachments: HADOOP-16041.001.patch, HADOOP-16041.002.patch, > HADOOP-16041.003.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
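For reference, a string of the shape quoted above could be assembled like this. It is a sketch rather than the committed ABFS code: AbfsUserAgentSketch is a hypothetical class, and the Hadoop version is passed in as a plain parameter (in the driver it would presumably come from VersionInfo, as discussed on this issue). The Java and OS parts are read from the standard system properties java.version, os.name and os.version.

```java
/** Hypothetical helper that formats a User-Agent in the shape quoted above. */
final class AbfsUserAgentSketch {
    /** Pure formatting step, kept separate so it is easy to check. */
    static String format(String product, String hadoopVersion,
                         String javaVersion, String osName, String osVersion) {
        // No space after the slash: product/version (JavaJRE x; OS y)
        return product + "/" + hadoopVersion
            + " (JavaJRE " + javaVersion + "; " + osName + " " + osVersion + ")";
    }

    /** Fills in the runtime details from standard JVM system properties. */
    static String forCurrentJvm(String product, String hadoopVersion) {
        return format(product, hadoopVersion,
            System.getProperty("java.version"),
            System.getProperty("os.name"),
            System.getProperty("os.version"));
    }

    public static void main(String[] args) {
        // The version string here is just the example from the comment above.
        System.out.println(forCurrentJvm("Azure Blob FS", "3.3.0-SNAPSHOT"));
    }
}
```

Because vendor builds embed their identity in the version (for example "3.0.0-cdh6.0.0"), passing the version through unchanged keeps that information in the header.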
[jira] [Comment Edited] (HADOOP-16041) UserAgent string for ABFS
[ https://issues.apache.org/jira/browse/HADOOP-16041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755439#comment-16755439 ] Sean Mackrory edited comment on HADOOP-16041 at 1/29/19 10:39 PM: -- +1. Will commit. (edit: as briefly discussed offline - there is now a space after the slash that appears to be unintentional) was (Author: mackrorysd): +1. Will commit. > UserAgent string for ABFS > - > > Key: HADOOP-16041 > URL: https://issues.apache.org/jira/browse/HADOOP-16041 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/azure >Affects Versions: 3.2.0 >Reporter: Shweta >Assignee: Shweta >Priority: Major > Fix For: 3.3.0 > > Attachments: HADOOP-16041.001.patch, HADOOP-16041.002.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-16041) UserAgent string for ABFS
[ https://issues.apache.org/jira/browse/HADOOP-16041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755439#comment-16755439 ] Sean Mackrory commented on HADOOP-16041: +1. Will commit. > UserAgent string for ABFS > - > > Key: HADOOP-16041 > URL: https://issues.apache.org/jira/browse/HADOOP-16041 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/azure >Affects Versions: 3.2.0 >Reporter: Shweta >Assignee: Shweta >Priority: Major > Fix For: 3.3.0 > > Attachments: HADOOP-16041.001.patch, HADOOP-16041.002.patch
[jira] [Commented] (HADOOP-16041) UserAgent string for ABFS
[ https://issues.apache.org/jira/browse/HADOOP-16041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16752888#comment-16752888 ] Sean Mackrory commented on HADOOP-16041: Looks pretty good to me. Test failure is unrelated and probably just flaky given it's fuzzy logic to begin with. We probably want it to be "ABFS/" + VersionInfo..., though, as [~tmarquardt] suggested. I'm fairly certain that the term ABFS uniquely refers to the Hadoop driver, right? So that would be a nice clean way to distinguish Hadoop / ABFS from other ADLS Gen2 clients since the specific formatting of the Hadoop version can vary. I'd also be fine with "Azure Blob FS/" so that we simply replace the 1.0 with the actual Hadoop version (unless 1.0 referred to an API version or something?) > UserAgent string for ABFS > - > > Key: HADOOP-16041 > URL: https://issues.apache.org/jira/browse/HADOOP-16041 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/azure >Affects Versions: 3.2.0 >Reporter: Shweta >Assignee: Shweta >Priority: Major > Fix For: 3.3.0 > > Attachments: HADOOP-16041.001.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15999) [s3a] Better support for out-of-band operations
[ https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16742638#comment-16742638 ] Sean Mackrory commented on HADOOP-15999: After a preliminary review I think this looks pretty good. A couple of requests though: * Let's fix the checkstyle issues if you're not already doing that * Can we apply similar logic when getting directory listings? I think it'll be quick enough to just include as part of this JIRA. I'll do another review when I'm less rushed & tired, but I don't see any other issues with the code right now. Good stuff - thank you. > [s3a] Better support for out-of-band operations > --- > > Key: HADOOP-15999 > URL: https://issues.apache.org/jira/browse/HADOOP-15999 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.1.0 >Reporter: Sean Mackrory >Assignee: Gabor Bota >Priority: Major > Attachments: HADOOP-15999.001.patch, out-of-band-operations.patch > > > S3Guard was initially done on the premise that a new MetadataStore would be > the source of truth, and that it wouldn't provide guarantees if updates were > done without using S3Guard. > I've been seeing increased demand for better support for scenarios where > operations are done on the data that can't reasonably be done with S3Guard > involved. For example: > * A file is deleted using S3Guard, and replaced by some other tool. S3Guard > can't tell the difference between the new file and delete / list > inconsistency and continues to treat the file as deleted. > * An S3Guard-ed file is overwritten by a longer file by some other tool. When > reading the file, only the length of the original file is read. > We could possibly have smarter behavior here by querying both S3 and the > MetadataStore (even in cases where we may currently only query the > MetadataStore in getFileStatus) and use whichever one has the higher modified > time. 
> This kills the performance boost we currently get in some workloads with the > short-circuited getFileStatus, but we could keep it with authoritative mode > which should give a larger performance boost. At least we'd get more > correctness without authoritative mode and a clear declaration of when we can > make the assumptions required to short-circuit the process. If we can't > consider S3Guard the source of truth, we need to defer to S3 more. > We'd need to be extra sure of any locality / time zone issues if we start > relying on mod_time more directly, but currently we're tracking the > modification time as returned by S3 anyway. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-16041) UserAgent string for ABFS
[ https://issues.apache.org/jira/browse/HADOOP-16041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16740666#comment-16740666 ] Sean Mackrory commented on HADOOP-16041: In the case of CDH (and I believe Hortonworks / HDInsight) the version number includes an identification of the vendor and the vendor release. For example "3.0.0-cdh6.0.0". That's how the vendor has been identified in the other connectors as far as I'm aware - that's definitely all we did for ADLS Gen1 and I know the required information was still collected. In these Hadoop distros the user-configurable prefix is also modifiable by the user, and has been used (not on Azure specifically in the cases I'm thinking of) to identify workloads from a particular large user or something like that. It's not as good a fit for identifying the vendor. With the Databricks model where it's more of a service, then it does make sense to identify themselves using the configured prefix (which is actually a suffix right now) unless their version string embeds enough information. +1 to ABFS/, if that allows you to identify Hadoop vendors sufficiently well. > UserAgent string for ABFS > - > > Key: HADOOP-16041 > URL: https://issues.apache.org/jira/browse/HADOOP-16041 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/azure >Affects Versions: 3.2.0 >Reporter: Shweta >Assignee: Shweta >Priority: Major > Fix For: 3.3.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-16041) UserAgent string for ABFS
[ https://issues.apache.org/jira/browse/HADOOP-16041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16739970#comment-16739970 ] Sean Mackrory commented on HADOOP-16041: {quote}As/when a successor to htrace is added, making it an option to dynamically add trace info would be nice, though as that is per-request it gets a bit more complex, especially with a thread pool to service requests.{quote} Yeah I'd suggest we tackle that as a separate feature when we have a clear idea of what that trace info would look like. Do you have like a compressed stack trace in mind, or more just a unique ID for each request or something? > UserAgent string for ABFS > - > > Key: HADOOP-16041 > URL: https://issues.apache.org/jira/browse/HADOOP-16041 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/azure >Affects Versions: 3.2.0 >Reporter: Shweta >Assignee: Shweta >Priority: Major > Fix For: 3.3.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-16041) UserAgent string for ABFS
[ https://issues.apache.org/jira/browse/HADOOP-16041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16739969#comment-16739969 ] Sean Mackrory commented on HADOOP-16041: {quote}Partner Service{quote} To be clear, that's not part of the user agent string. I suspect you're getting that from verifyUserAgent, where it sets the configurable prefix to "Partner Service", but any Hadoop installation could set that to whatever they wanted. The Hadoop distribution in usage can actually be inferred from VersionInfo, which is how it's done in other cloud connectors already (and is actually what's motivating this change). We should coordinate with [~tmarquardt] to make sure there isn't any intended future use of Azure Blob FS/1.0, if that's supposed to map to some API version or something. Since we don't have a distinct SDK, Hadoop *is* the client, so the Hadoop version makes sense to me. But we can preserve Azure Blob FS/1.0 if that carries any special meaning for Microsoft. > UserAgent string for ABFS > - > > Key: HADOOP-16041 > URL: https://issues.apache.org/jira/browse/HADOOP-16041 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/azure >Affects Versions: 3.2.0 >Reporter: Shweta >Assignee: Shweta >Priority: Major > Fix For: 3.3.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HADOOP-16041) UserAgent string for ABFS
[ https://issues.apache.org/jira/browse/HADOOP-16041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16739969#comment-16739969 ] Sean Mackrory edited comment on HADOOP-16041 at 1/11/19 2:49 AM: - {quote}Partner Service{quote} To be clear, that's not part of the user agent string. I suspect you're getting that from verifyUserAgent, where it sets the configurable prefix to "Partner Service", but any Hadoop installation could set that to whatever they wanted. The Hadoop distribution in usage can actually be inferred from VersionInfo, which is how it's done in other cloud connectors already (and is actually what's motivating this change). We should coordinate with [~tmarquardt] and [~DanielZhou] to make sure there isn't any intended future use of Azure Blob FS/1.0, if that's supposed to map to some API version or something. Since we don't have a distinct SDK, Hadoop *is* the client, so the Hadoop version makes sense to me. But we can preserve Azure Blob FS/1.0 if that carries any special meaning for Microsoft. was (Author: mackrorysd): {quote}Partner Service{quote} To be clear, that's not part of the user agent string. I suspect you're getting that from verifyUserAgent, where it sets the configurable prefix to "Partner Service", but any Hadoop installation could set that to whatever they wanted. The Hadoop distribution in usage can actually be inferred from VersionInfo, which is how it's done in other cloud connectors already (and is actually what's motivating this change). We should coordinate with [~tmarquardt] to make sure there isn't any intended future use of Azure Blob FS/1.0, if that's supposed to map to some API version or something. Since we don't have a distinct SDK, Hadoop *is* the client, so the Hadoop version makes sense to me. But we can preserve Azure Blob FS/1.0 if that carries any special meaning for Microsoft. 
> UserAgent string for ABFS > - > > Key: HADOOP-16041 > URL: https://issues.apache.org/jira/browse/HADOOP-16041 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/azure >Affects Versions: 3.2.0 >Reporter: Shweta >Assignee: Shweta >Priority: Major > Fix For: 3.3.0
[jira] [Updated] (HADOOP-16027) [DOC] Effective use of FS instances during S3A integration tests
[ https://issues.apache.org/jira/browse/HADOOP-16027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HADOOP-16027: --- Resolution: Fixed Fix Version/s: 3.3.0 Status: Resolved (was: Patch Available) > [DOC] Effective use of FS instances during S3A integration tests > > > Key: HADOOP-16027 > URL: https://issues.apache.org/jira/browse/HADOOP-16027 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Major > Fix For: 3.3.0 > > Attachments: HADOOP-16027.001.patch, HADOOP-16027.002.patch > > > While fixing HADOOP-15819 we found that a closed fs got into the static fs > cache during testing, which caused other tests to fail when the tests were > running sequentially. > We should document some best practices in the testing section on the s3 docs > with the following: > {panel} > Tests using FileSystems are fastest if they can recycle the existing FS > instance from the same JVM. If you do that, you MUST NOT close or do unique > configuration on them. If you want a guarantee of 100% isolation or an > instance with unique config, create a new instance > which you MUST close in the teardown to avoid leakage of resources. > Do not add FileSystem instances (with e.g > org.apache.hadoop.fs.FileSystem#addFileSystemForTesting) to the cache that > will be modified or closed during the test runs. This can cause other tests > to fail when using the same modified or closed FS instance. For more details > see HADOOP-15819. > {panel} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
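The failure mode behind the guidance in the panel above can be shown with a toy model. Everything here is hypothetical: ToyFs and ToyFsCache are small stand-ins, not the Hadoop FileSystem class or its cache, and the toy throws IllegalStateException where the real tests saw an IOException. The point is only the contrast between a shared cached instance, which must never be closed by a test, and a fresh instance, which the test owns and must close.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for a filesystem client that can be closed. */
final class ToyFs implements AutoCloseable {
    private boolean closed;

    /** Fails the same way a closed filesystem would on any operation. */
    void checkOpen() {
        if (closed) {
            throw new IllegalStateException("FileSystem is closed!");
        }
    }

    @Override
    public void close() {
        closed = true;
    }
}

/** Toy stand-in for a JVM-wide static cache keyed by URI. */
final class ToyFsCache {
    private static final Map<String, ToyFs> CACHE = new HashMap<>();

    /** Returns the shared instance; callers must not close or reconfigure it. */
    static ToyFs get(String uri) {
        return CACHE.computeIfAbsent(uri, u -> new ToyFs());
    }

    /** Returns a private instance; the caller must close it in teardown. */
    static ToyFs newInstance(String uri) {
        return new ToyFs();
    }
}
```

A test that closes the instance returned by get breaks every later test that fetches the same URI, even when the tests run one after another, which matches the sequential failures described for HADOOP-15819.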
[jira] [Commented] (HADOOP-16027) [DOC] Effective use of FS instances during S3A integration tests
[ https://issues.apache.org/jira/browse/HADOOP-16027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738490#comment-16738490 ] Sean Mackrory commented on HADOOP-16027: +1, committed. Nit: added the empty lines between what appeared to be paragraphs. > [DOC] Effective use of FS instances during S3A integration tests > > > Key: HADOOP-16027 > URL: https://issues.apache.org/jira/browse/HADOOP-16027 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Major > Attachments: HADOOP-16027.001.patch, HADOOP-16027.002.patch > > > While fixing HADOOP-15819 we found that a closed fs got into the static fs > cache during testing, which caused other tests to fail when the tests were > running sequentially. > We should document some best practices in the testing section on the s3 docs > with the following: > {panel} > Tests using FileSystems are fastest if they can recycle the existing FS > instance from the same JVM. If you do that, you MUST NOT close or do unique > configuration on them. If you want a guarantee of 100% isolation or an > instance with unique config, create a new instance > which you MUST close in the teardown to avoid leakage of resources. > Do not add FileSystem instances (with e.g > org.apache.hadoop.fs.FileSystem#addFileSystemForTesting) to the cache that > will be modified or closed during the test runs. This can cause other tests > to fail when using the same modified or closed FS instance. For more details > see HADOOP-15819. > {panel} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15860) ABFS: Throw IllegalArgumentException when Directory/File name ends with a period(.)
[ https://issues.apache.org/jira/browse/HADOOP-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HADOOP-15860: --- Component/s: (was: fs/adl) fs/azure > ABFS: Throw IllegalArgumentException when Directory/File name ends with a > period(.) > --- > > Key: HADOOP-15860 > URL: https://issues.apache.org/jira/browse/HADOOP-15860 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/azure >Affects Versions: 3.2.0 >Reporter: Sean Mackrory >Assignee: Shweta >Priority: Major > Fix For: 3.3.0 > > Attachments: HADOOP-15860.001.patch, HADOOP-15860.002.patch, > HADOOP-15860.003.patch, HADOOP-15860.004.patch, HADOOP-15860.005.patch, > HADOOP-15860.006.patch, trailing-periods.patch > > > If you create a directory with a trailing period (e.g. '/test.') the period > is silently dropped, and will be listed as simply '/test'. '/test.test' > appears to work just fine. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
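The check named in the issue title could look something like the sketch below. It is not the committed patch: TrailingPeriodCheck is a hypothetical class operating on plain strings, and it assumes the path is already absolute and normalized, so "." and ".." components are not given special treatment.

```java
/** Hypothetical validator for path components ending in a period. */
final class TrailingPeriodCheck {
    /**
     * Rejects any path with a component that ends in '.', since such a
     * trailing period would otherwise be dropped silently.
     */
    static void check(String path) {
        for (String component : path.split("/")) {
            if (component.endsWith(".")) {
                throw new IllegalArgumentException(
                    "Path component ends with a period: '" + component + "' in " + path);
            }
        }
    }

    public static void main(String[] args) {
        check("/test.test");  // accepted: the period is not trailing
        try {
            check("/test.");  // rejected
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Checking every component, not just the last one, also catches a parent directory such as '/test./file' before anything is created under it.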
[jira] [Updated] (HADOOP-15860) ABFS: Throw IllegalArgumentException when Directory/File name ends with a period(.)
[ https://issues.apache.org/jira/browse/HADOOP-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HADOOP-15860: --- Resolution: Fixed Fix Version/s: 3.3.0 Status: Resolved (was: Patch Available) > ABFS: Throw IllegalArgumentException when Directory/File name ends with a > period(.) > --- > > Key: HADOOP-15860 > URL: https://issues.apache.org/jira/browse/HADOOP-15860 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/adl >Affects Versions: 3.2.0 >Reporter: Sean Mackrory >Assignee: Shweta >Priority: Major > Fix For: 3.3.0 > > Attachments: HADOOP-15860.001.patch, HADOOP-15860.002.patch, > HADOOP-15860.003.patch, HADOOP-15860.004.patch, HADOOP-15860.005.patch, > HADOOP-15860.006.patch, trailing-periods.patch > > > If you create a directory with a trailing period (e.g. '/test.') the period > is silently dropped, and will be listed as simply '/test'. '/test.test' > appears to work just fine. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HADOOP-15860) ABFS: Throw IllegalArgumentException when Directory/File name ends with a period(.)
[ https://issues.apache.org/jira/browse/HADOOP-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732191#comment-16732191 ] Sean Mackrory edited comment on HADOOP-15860 at 1/2/19 4:19 PM: +1 (although nit: I added whitespace before {'s where it was missing). I continue to see the "java.lang.AssertionError: Expecting getIsNamespaceEnabled() return false" failure, but I believe that is just because of features still rolling out to the service. I'm gonna blame the timeouts I saw on the car dealership I'm working from this morning - I've tested the previous patches enough to be very confident in this one. Thanks for the patch, Shweta. was (Author: mackrorysd): +1 (although nit: I added whitespace before {'s where it was missing). I continue to see the "java.lang.AssertionError: Expecting getIsNamespaceEnabled() return false" failure, but I believe that is just because of features still rolling out to the service. > ABFS: Throw IllegalArgumentException when Directory/File name ends with a > period(.) > --- > > Key: HADOOP-15860 > URL: https://issues.apache.org/jira/browse/HADOOP-15860 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/adl >Affects Versions: 3.2.0 >Reporter: Sean Mackrory >Assignee: Shweta >Priority: Major > Attachments: HADOOP-15860.001.patch, HADOOP-15860.002.patch, > HADOOP-15860.003.patch, HADOOP-15860.004.patch, HADOOP-15860.005.patch, > HADOOP-15860.006.patch, trailing-periods.patch > > > If you create a directory with a trailing period (e.g. '/test.') the period > is silently dropped, and will be listed as simply '/test'. '/test.test' > appears to work just fine. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15860) ABFS: Throw IllegalArgumentException when Directory/File name ends with a period(.)
[ https://issues.apache.org/jira/browse/HADOOP-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732191#comment-16732191 ] Sean Mackrory commented on HADOOP-15860: +1 (although nit: I added whitespace before {'s where it was missing). I continue to see the "java.lang.AssertionError: Expecting getIsNamespaceEnabled() return false" failure, but I believe that is just because of features still rolling out to the service. > ABFS: Throw IllegalArgumentException when Directory/File name ends with a > period(.) > --- > > Key: HADOOP-15860 > URL: https://issues.apache.org/jira/browse/HADOOP-15860 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/adl >Affects Versions: 3.2.0 >Reporter: Sean Mackrory >Assignee: Shweta >Priority: Major > Attachments: HADOOP-15860.001.patch, HADOOP-15860.002.patch, > HADOOP-15860.003.patch, HADOOP-15860.004.patch, HADOOP-15860.005.patch, > HADOOP-15860.006.patch, trailing-periods.patch > > > If you create a directory with a trailing period (e.g. '/test.') the period > is silently dropped, and will be listed as simply '/test'. '/test.test' > appears to work just fine. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15819) S3A integration test failures: FileSystem is closed! - without parallel test run
[ https://issues.apache.org/jira/browse/HADOOP-15819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732172#comment-16732172 ] Sean Mackrory commented on HADOOP-15819: I wouldn't have thought so, as I think you have to go out of your way to do something you don't usually do to hit this, but then I don't fully understand why this was added in the first place. I think the patch originally came from you - do you happen to remember why we started adding the FS instance to the cache here? And if so, is that something that might occur to a future test writer as well? > S3A integration test failures: FileSystem is closed! - without parallel test > run > > > Key: HADOOP-15819 > URL: https://issues.apache.org/jira/browse/HADOOP-15819 > Project: Hadoop Common > Issue Type: Bug > Components: fs/s3 >Affects Versions: 3.1.1 >Reporter: Gabor Bota >Assignee: Adam Antal >Priority: Critical > Attachments: HADOOP-15819.000.patch, HADOOP-15819.001.patch, > HADOOP-15819.002.patch, S3ACloseEnforcedFileSystem.java, > S3ACloseEnforcedFileSystem.java, closed_fs_closers_example_5klines.log.zip > > > Running the integration tests for hadoop-aws {{mvn -Dscale verify}} against > Amazon AWS S3 (eu-west-1, us-west-1, with no s3guard) we see a lot of these > failures: > {noformat} > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.408 > s <<< FAILURE! - in > org.apache.hadoop.fs.s3a.commit.staging.integration.ITDirectoryCommitMRJob > [ERROR] > testMRJob(org.apache.hadoop.fs.s3a.commit.staging.integration.ITDirectoryCommitMRJob) > Time elapsed: 0.027 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 4.345 > s <<< FAILURE!
- in > org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJob > [ERROR] > testStagingDirectory(org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJob) > Time elapsed: 0.021 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] > testMRJob(org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJob) > Time elapsed: 0.022 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.489 > s <<< FAILURE! - in > org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJobBadDest > [ERROR] > testMRJob(org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJobBadDest) > Time elapsed: 0.023 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.695 > s <<< FAILURE! - in org.apache.hadoop.fs.s3a.commit.magic.ITMagicCommitMRJob > [ERROR] testMRJob(org.apache.hadoop.fs.s3a.commit.magic.ITMagicCommitMRJob) > Time elapsed: 0.039 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.015 > s <<< FAILURE! - in org.apache.hadoop.fs.s3a.commit.ITestS3ACommitterFactory > [ERROR] > testEverything(org.apache.hadoop.fs.s3a.commit.ITestS3ACommitterFactory) > Time elapsed: 0.014 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > {noformat} > The big issue is that the tests are running in a serial manner - no test is > running on top of the other - so we should not see that the tests are failing > like this. 
The issue could be in how we handle > org.apache.hadoop.fs.FileSystem#CACHE - the tests should use the same > S3AFileSystem so if A test uses a FileSystem and closes it in teardown then B > test will get the same FileSystem object from the cache and try to use it, > but it is closed. > We see this a lot in our downstream testing too. It's not possible to tell > whether the failed regression test result is an implementation issue in the > runtime code or a test implementation problem. > I've checked when and what closes the S3AFileSystem with a slightly modified > version of S3AFileSystem which logs the closers of the fs if an error should > occur. I'll attach this modified java file for reference. See the next > example of the result when it's running: > {noformat} > 2018-10-04 00:52:25,596 [Thread-4201] ERROR s3a.S3ACloseEnforcedFileSystem > (S3ACloseEnforcedFileSystem.java:checkIfClosed(74)) - Use after close(): > java.lang.RuntimeException: Using closed FS!. > at > org.apache.hadoop.fs.s3a.S3ACloseEnforcedFileSystem.checkIfClosed(S3ACloseEnforcedFileSystem.java:73) > at >
[jira] [Commented] (HADOOP-15860) ABFS: Throw IllegalArgumentException when Directory/File name ends with a period(.)
[ https://issues.apache.org/jira/browse/HADOOP-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16730004#comment-16730004 ] Sean Mackrory commented on HADOOP-15860: So the issue isn't that path is null; the issue is that path.length() == 0. path references a valid String object, but the code is trying to do .charAt(-1), which is invalid. I suspect that what's happening is that you're hitting root, at which point it's an empty string. Maybe you can just replace the original null check with "while (!path.isRoot())". path.getParent() is null at that point, but we won't even get that far if we have bounds checking problems. We also need to make sure you can run the ABFS tests. > ABFS: Throw IllegalArgumentException when Directory/File name ends with a > period(.) > --- > > Key: HADOOP-15860 > URL: https://issues.apache.org/jira/browse/HADOOP-15860 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/adl >Affects Versions: 3.2.0 >Reporter: Sean Mackrory >Assignee: Shweta >Priority: Major > Attachments: HADOOP-15860.001.patch, HADOOP-15860.002.patch, > HADOOP-15860.003.patch, HADOOP-15860.004.patch, HADOOP-15860.005.patch, > trailing-periods.patch > > > If you create a directory with a trailing period (e.g. '/test.') the period > is silently dropped, and will be listed as simply '/test'. '/test.test' > appears to work just fine.
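The loop suggested in this comment can be sketched without Hadoop on the classpath. This is a minimal illustration under stated assumptions, not the code from the patch: paths are plain strings, the string "/" stands in for path.isRoot(), and the class and method names (TrailingPeriodWalk, checkNoTrailingPeriods) are hypothetical.

```java
/** Sketch of checking every path component up to the root; a stand-in for Hadoop's Path. */
class TrailingPeriodWalk {

  /**
   * Throws IllegalArgumentException if any component of the path ends with '.'.
   * The components "." and ".." are not special-cased in this sketch.
   */
  static void checkNoTrailingPeriods(String path) {
    String current = path;
    // Drop trailing slashes so "/dir1/dir2./dir3/" is treated like "/dir1/dir2./dir3".
    while (current.length() > 1 && current.endsWith("/")) {
      current = current.substring(0, current.length() - 1);
    }
    // "/" plays the role of isRoot(): stop there instead of inspecting an empty name.
    while (!current.equals("/") && !current.isEmpty()) {
      int slash = current.lastIndexOf('/');
      String name = current.substring(slash + 1);
      if (name.endsWith(".")) {
        throw new IllegalArgumentException(
            "Path component ends with a period: " + name);
      }
      // Move to the parent; the parent of "/a" (or of a bare name) is taken to be "/".
      current = slash <= 0 ? "/" : current.substring(0, slash);
    }
  }

  public static void main(String[] args) {
    checkNoTrailingPeriods("/");          // root: nothing to check
    checkNoTrailingPeriods("/test.test"); // period in the middle is accepted
    try {
      checkNoTrailingPeriods("/dir1/dir2./dir3/");
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Terminating on the root condition, rather than on a null parent, means the empty root name is never passed to a character lookup.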
[jira] [Resolved] (HADOOP-15819) S3A integration test failures: FileSystem is closed! - without parallel test run
[ https://issues.apache.org/jira/browse/HADOOP-15819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory resolved HADOOP-15819. Resolution: Fixed > S3A integration test failures: FileSystem is closed! - without parallel test > run > > > Key: HADOOP-15819 > URL: https://issues.apache.org/jira/browse/HADOOP-15819 > Project: Hadoop Common > Issue Type: Bug > Components: fs/s3 >Affects Versions: 3.1.1 >Reporter: Gabor Bota >Assignee: Adam Antal >Priority: Critical > Attachments: HADOOP-15819.000.patch, HADOOP-15819.001.patch, > HADOOP-15819.002.patch, S3ACloseEnforcedFileSystem.java, > S3ACloseEnforcedFileSystem.java, closed_fs_closers_example_5klines.log.zip > > > Running the integration tests for hadoop-aws {{mvn -Dscale verify}} against > Amazon AWS S3 (eu-west-1, us-west-1, with no s3guard) we see a lot of these > failures: > {noformat} > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.408 > s <<< FAILURE! - in > org.apache.hadoop.fs.s3a.commit.staging.integration.ITDirectoryCommitMRJob > [ERROR] > testMRJob(org.apache.hadoop.fs.s3a.commit.staging.integration.ITDirectoryCommitMRJob) > Time elapsed: 0.027 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 4.345 > s <<< FAILURE! - in > org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJob > [ERROR] > testStagingDirectory(org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJob) > Time elapsed: 0.021 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] > testMRJob(org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJob) > Time elapsed: 0.022 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.489 > s <<< FAILURE! 
- in > org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJobBadDest > [ERROR] > testMRJob(org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJobBadDest) > Time elapsed: 0.023 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.695 > s <<< FAILURE! - in org.apache.hadoop.fs.s3a.commit.magic.ITMagicCommitMRJob > [ERROR] testMRJob(org.apache.hadoop.fs.s3a.commit.magic.ITMagicCommitMRJob) > Time elapsed: 0.039 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.015 > s <<< FAILURE! - in org.apache.hadoop.fs.s3a.commit.ITestS3ACommitterFactory > [ERROR] > testEverything(org.apache.hadoop.fs.s3a.commit.ITestS3ACommitterFactory) > Time elapsed: 0.014 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > {noformat} > The big issue is that the tests are running in a serial manner - no test is > running on top of the other - so we should not see that the tests are failing > like this. The issue could be in how we handle > org.apache.hadoop.fs.FileSystem#CACHE - the tests should use the same > S3AFileSystem so if A test uses a FileSystem and closes it in teardown then B > test will get the same FileSystem object from the cache and try to use it, > but it is closed. > We see this a lot in our downstream testing too. It's not possible to tell > whether the failed regression test result is an implementation issue in the > runtime code or a test implementation problem. > I've checked when and what closes the S3AFileSystem with a slightly modified > version of S3AFileSystem which logs the closers of the fs if an error should > occur. I'll attach this modified java file for reference.
See the next > example of the result when it's running: > {noformat} > 2018-10-04 00:52:25,596 [Thread-4201] ERROR s3a.S3ACloseEnforcedFileSystem > (S3ACloseEnforcedFileSystem.java:checkIfClosed(74)) - Use after close(): > java.lang.RuntimeException: Using closed FS!. > at > org.apache.hadoop.fs.s3a.S3ACloseEnforcedFileSystem.checkIfClosed(S3ACloseEnforcedFileSystem.java:73) > at > org.apache.hadoop.fs.s3a.S3ACloseEnforcedFileSystem.mkdirs(S3ACloseEnforcedFileSystem.java:474) > at > org.apache.hadoop.fs.contract.AbstractFSContractTestBase.mkdirs(AbstractFSContractTestBase.java:338) > at > org.apache.hadoop.fs.contract.AbstractFSContractTestBase.setup(AbstractFSContractTestBase.java:193) > at > org.apache.hadoop.fs.s3a.ITestS3AClosedFS.setup(ITestS3AClosedFS.java:40) > at
[jira] [Commented] (HADOOP-15819) S3A integration test failures: FileSystem is closed! - without parallel test run
[ https://issues.apache.org/jira/browse/HADOOP-15819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729682#comment-16729682 ] Sean Mackrory commented on HADOOP-15819: Committed. Thank you for the excellent work here [~adam.antal]. > S3A integration test failures: FileSystem is closed! - without parallel test > run > > > Key: HADOOP-15819 > URL: https://issues.apache.org/jira/browse/HADOOP-15819 > Project: Hadoop Common > Issue Type: Bug > Components: fs/s3 >Affects Versions: 3.1.1 >Reporter: Gabor Bota >Assignee: Adam Antal >Priority: Critical > Attachments: HADOOP-15819.000.patch, HADOOP-15819.001.patch, > HADOOP-15819.002.patch, S3ACloseEnforcedFileSystem.java, > S3ACloseEnforcedFileSystem.java, closed_fs_closers_example_5klines.log.zip > > > Running the integration tests for hadoop-aws {{mvn -Dscale verify}} against > Amazon AWS S3 (eu-west-1, us-west-1, with no s3guard) we see a lot of these > failures: > {noformat} > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.408 > s <<< FAILURE! - in > org.apache.hadoop.fs.s3a.commit.staging.integration.ITDirectoryCommitMRJob > [ERROR] > testMRJob(org.apache.hadoop.fs.s3a.commit.staging.integration.ITDirectoryCommitMRJob) > Time elapsed: 0.027 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 4.345 > s <<< FAILURE! - in > org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJob > [ERROR] > testStagingDirectory(org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJob) > Time elapsed: 0.021 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] > testMRJob(org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJob) > Time elapsed: 0.022 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed!
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.489 > s <<< FAILURE! - in > org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJobBadDest > [ERROR] > testMRJob(org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJobBadDest) > Time elapsed: 0.023 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.695 > s <<< FAILURE! - in org.apache.hadoop.fs.s3a.commit.magic.ITMagicCommitMRJob > [ERROR] testMRJob(org.apache.hadoop.fs.s3a.commit.magic.ITMagicCommitMRJob) > Time elapsed: 0.039 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.015 > s <<< FAILURE! - in org.apache.hadoop.fs.s3a.commit.ITestS3ACommitterFactory > [ERROR] > testEverything(org.apache.hadoop.fs.s3a.commit.ITestS3ACommitterFactory) > Time elapsed: 0.014 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > {noformat} > The big issue is that the tests are running in a serial manner - no test is > running on top of the other - so we should not see that the tests are failing > like this. The issue could be in how we handle > org.apache.hadoop.fs.FileSystem#CACHE - the tests should use the same > S3AFileSystem so if A test uses a FileSystem and closes it in teardown then B > test will get the same FileSystem object from the cache and try to use it, > but it is closed. > We see this a lot in our downstream testing too. It's not possible to tell > whether the failed regression test result is an implementation issue in the > runtime code or a test implementation problem. > I've checked when and what closes the S3AFileSystem with a slightly modified > version of S3AFileSystem which logs the closers of the fs if an error should > occur. I'll attach this modified java file for reference.
See the next > example of the result when it's running: > {noformat} > 2018-10-04 00:52:25,596 [Thread-4201] ERROR s3a.S3ACloseEnforcedFileSystem > (S3ACloseEnforcedFileSystem.java:checkIfClosed(74)) - Use after close(): > java.lang.RuntimeException: Using closed FS!. > at > org.apache.hadoop.fs.s3a.S3ACloseEnforcedFileSystem.checkIfClosed(S3ACloseEnforcedFileSystem.java:73) > at > org.apache.hadoop.fs.s3a.S3ACloseEnforcedFileSystem.mkdirs(S3ACloseEnforcedFileSystem.java:474) > at > org.apache.hadoop.fs.contract.AbstractFSContractTestBase.mkdirs(AbstractFSContractTestBase.java:338) > at > org.apache.hadoop.fs.contract.AbstractFSContractTestBase.setup(AbstractFSContractTestBase.java:193) > at >
[jira] [Commented] (HADOOP-15860) ABFS: Throw IllegalArgumentException when Directory/File name ends with a period(.)
[ https://issues.apache.org/jira/browse/HADOOP-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729681#comment-16729681 ] Sean Mackrory commented on HADOOP-15860: Cancel that - many tests fail. Need to add a check that the string isn't empty.
{code}
java.lang.StringIndexOutOfBoundsException: String index out of range: -1
    at java.lang.String.charAt(String.java:658)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.trailingPeriodCheck(AzureBlobFileSystem.java:395)
{code}
> ABFS: Throw IllegalArgumentException when Directory/File name ends with a > period(.) > --- > > Key: HADOOP-15860 > URL: https://issues.apache.org/jira/browse/HADOOP-15860 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/adl >Affects Versions: 3.2.0 >Reporter: Sean Mackrory >Assignee: Shweta >Priority: Major > Attachments: HADOOP-15860.001.patch, HADOOP-15860.002.patch, > HADOOP-15860.003.patch, HADOOP-15860.004.patch, trailing-periods.patch > > > If you create a directory with a trailing period (e.g. '/test.') the period > is silently dropped, and will be listed as simply '/test'. '/test.test' > appears to work just fine.
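The failure in this stack trace is what Java does when charAt is called with length() - 1 on an empty string. The following is a small standalone sketch of that failure and of the empty-string check the comment asks for. The class and method names are illustrative only and are not taken from AzureBlobFileSystem.

```java
/** Minimal sketch of the bounds problem; names are illustrative, not the ABFS code. */
class TrailingPeriodGuard {

  /** Unguarded: for "" this evaluates charAt(-1) and throws StringIndexOutOfBoundsException. */
  static boolean endsWithPeriodUnsafe(String name) {
    return name.charAt(name.length() - 1) == '.';
  }

  /** Guarded: an empty name (for example the root) simply has no trailing period. */
  static boolean endsWithPeriod(String name) {
    return !name.isEmpty() && name.charAt(name.length() - 1) == '.';
  }

  public static void main(String[] args) {
    System.out.println(endsWithPeriod("test."));     // true
    System.out.println(endsWithPeriod("test.test")); // false
    System.out.println(endsWithPeriod(""));          // false
    try {
      endsWithPeriodUnsafe("");
    } catch (StringIndexOutOfBoundsException e) {
      System.out.println("unguarded check failed on the empty string");
    }
  }
}
```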
[jira] [Commented] (HADOOP-15860) ABFS: Throw IllegalArgumentException when Directory/File name ends with a period(.)
[ https://issues.apache.org/jira/browse/HADOOP-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729291#comment-16729291 ] Sean Mackrory commented on HADOOP-15860: +1, will commit. > ABFS: Throw IllegalArgumentException when Directory/File name ends with a > period(.) > --- > > Key: HADOOP-15860 > URL: https://issues.apache.org/jira/browse/HADOOP-15860 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/adl >Affects Versions: 3.2.0 >Reporter: Sean Mackrory >Assignee: Shweta >Priority: Major > Attachments: HADOOP-15860.001.patch, HADOOP-15860.002.patch, > HADOOP-15860.003.patch, HADOOP-15860.004.patch, trailing-periods.patch > > > If you create a directory with a trailing period (e.g. '/test.') the period > is silently dropped, and will be listed as simply '/test'. '/test.test' > appears to work just fine.
[jira] [Commented] (HADOOP-15819) S3A integration test failures: FileSystem is closed! - without parallel test run
[ https://issues.apache.org/jira/browse/HADOOP-15819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725544#comment-16725544 ] Sean Mackrory commented on HADOOP-15819: Ah - FS.key not being set is the piece I was missing. I suppose the truly correct solution here would be to make sure FS.key is set properly, but I don't know that the benefit to that is worth a whole lot of your time. I'll hold off for a day in case [~ste...@apache.org] who wrote this in the first place disagrees. Otherwise +1 and I'll commit soon. As a side note, I noticed StagingTestBase has a call to the same function, but obviously isn't causing the same issue. > S3A integration test failures: FileSystem is closed! - without parallel test > run > > > Key: HADOOP-15819 > URL: https://issues.apache.org/jira/browse/HADOOP-15819 > Project: Hadoop Common > Issue Type: Bug > Components: fs/s3 >Affects Versions: 3.1.1 >Reporter: Gabor Bota >Assignee: Adam Antal >Priority: Critical > Attachments: HADOOP-15819.000.patch, HADOOP-15819.001.patch, > HADOOP-15819.002.patch, S3ACloseEnforcedFileSystem.java, > S3ACloseEnforcedFileSystem.java, closed_fs_closers_example_5klines.log.zip > > > Running the integration tests for hadoop-aws {{mvn -Dscale verify}} against > Amazon AWS S3 (eu-west-1, us-west-1, with no s3guard) we see a lot of these > failures: > {noformat} > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.408 > s <<< FAILURE! - in > org.apache.hadoop.fs.s3a.commit.staging.integration.ITDirectoryCommitMRJob > [ERROR] > testMRJob(org.apache.hadoop.fs.s3a.commit.staging.integration.ITDirectoryCommitMRJob) > Time elapsed: 0.027 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 4.345 > s <<< FAILURE! 
- in > org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJob > [ERROR] > testStagingDirectory(org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJob) > Time elapsed: 0.021 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] > testMRJob(org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJob) > Time elapsed: 0.022 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.489 > s <<< FAILURE! - in > org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJobBadDest > [ERROR] > testMRJob(org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJobBadDest) > Time elapsed: 0.023 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.695 > s <<< FAILURE! - in org.apache.hadoop.fs.s3a.commit.magic.ITMagicCommitMRJob > [ERROR] testMRJob(org.apache.hadoop.fs.s3a.commit.magic.ITMagicCommitMRJob) > Time elapsed: 0.039 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.015 > s <<< FAILURE! - in org.apache.hadoop.fs.s3a.commit.ITestS3ACommitterFactory > [ERROR] > testEverything(org.apache.hadoop.fs.s3a.commit.ITestS3ACommitterFactory) > Time elapsed: 0.014 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > {noformat} > The big issue is that the tests are running in a serial manner - no test is > running on top of the other - so we should not see that the tests are failing > like this. 
The issue could be in how we handle > org.apache.hadoop.fs.FileSystem#CACHE - the tests should use the same > S3AFileSystem so if A test uses a FileSystem and closes it in teardown then B > test will get the same FileSystem object from the cache and try to use it, > but it is closed. > We see this a lot in our downstream testing too. It's not possible to tell > whether the failed regression test result is an implementation issue in the > runtime code or a test implementation problem. > I've checked when and what closes the S3AFileSystem with a slightly modified > version of S3AFileSystem which logs the closers of the fs if an error should > occur. I'll attach this modified java file for reference. See the next > example of the result when it's running: > {noformat} > 2018-10-04 00:52:25,596 [Thread-4201] ERROR s3a.S3ACloseEnforcedFileSystem > (S3ACloseEnforcedFileSystem.java:checkIfClosed(74)) - Use after close(): > java.lang.RuntimeException: Using closed FS!. > at >
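One way to read the "FS.key not being set" remark together with the cache hypothesis in the description is modelled below. This is a toy simulation under loud assumptions, not Hadoop's FileSystem or its cache: ToyFs, ToyCache, and addForTesting are hypothetical names, and the behaviour (close() evicts the cache entry only when the instance knows its own key) is assumed for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a shared file-system cache; hypothetical, not Hadoop's FileSystem code. */
class ToyFsCacheDemo {

  /** Stand-in for a file system client that can be closed. */
  static class ToyFs {
    String key;     // cache key; assumed to be set when the cache creates the instance
    boolean closed;

    void close(ToyCache cache) {
      closed = true;
      // Eviction only works if this instance knows which key it was cached under.
      cache.remove(key, this);
    }
  }

  /** Stand-in for a process-wide cache keyed by URI. */
  static class ToyCache {
    private final Map<String, ToyFs> map = new HashMap<>();

    ToyFs get(String uri) {
      ToyFs fs = map.get(uri);
      if (fs == null) {
        fs = new ToyFs();
        fs.key = uri; // normal path: the instance remembers its key
        map.put(uri, fs);
      }
      return fs;
    }

    /** Hypothetical test hook: injects an instance but never sets fs.key. */
    void addForTesting(String uri, ToyFs fs) {
      map.put(uri, fs);
    }

    void remove(String key, ToyFs fs) {
      if (key != null && map.get(key) == fs) {
        map.remove(key);
      }
    }
  }

  /** Returns true if a second "test" receives an already-closed instance. */
  static boolean secondTestSeesClosedFs(boolean injectWithoutKey) {
    ToyCache cache = new ToyCache();
    String uri = "s3a://example-bucket";
    ToyFs first;
    if (injectWithoutKey) {
      first = new ToyFs();
      cache.addForTesting(uri, first);
    } else {
      first = cache.get(uri);
    }
    first.close(cache);            // teardown of the first test
    ToyFs second = cache.get(uri); // setup of the next test, run serially
    return second.closed;
  }

  public static void main(String[] args) {
    System.out.println(secondTestSeesClosedFs(false)); // false
    System.out.println(secondTestSeesClosedFs(true));  // true
  }
}
```

In this model the serial test run still fails, because the closed instance injected without a key is never evicted and is handed to the next caller.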
[jira] [Comment Edited] (HADOOP-15860) ABFS: Throw IllegalArgumentException when Directory/File name ends with a period(.)
[ https://issues.apache.org/jira/browse/HADOOP-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725538#comment-16725538 ] Sean Mackrory edited comment on HADOOP-15860 at 12/20/18 2:59 AM: -- So there's a checkstyle issue to fix (space before the '(') and a number of other minor whitespace issues like inconsistent spacing before '{'s. Can you also revisit the exception messages? Currently they're not entirely correct. For example, I'd make the following changes: * "Cannot create a Path with names that end with a dot" -> "ABFS does not allow files or directories to end with a dot" * "Path does not have a trailing period." -> "Attempt to create file that ended with a dot should throw IllegalArgumentException" Also, it occurred to me that mkdirs is recursive. Someone could pass "/dir1/dir2./dir3/" when none of them exist and we wouldn't stop them. In mkdirs we should check each .parent() up to root. When playing around with these tests to see what happened in various cases I got a couple of exceptions for files already existing. There are 2 reasons for that: * It might be that one test run was hitting data from the previous run. Let's replace our use of "new Path()" with "path()". It's a helper function in the parent classes that gives you a path inside a randomized subdirectory to help prevent this. All tests should actually do this, but let's just fix the new ones for now. * Let's add a comment about the actual problem here. (a) Microsoft says you shouldn't do this, and (b) if you do it anyway, /this./path. and /this/path are treated as identical in some cases. This is potentially problematic if anyone messes with the tests since the 2 paths in each of your tests would be equal. So let's have a clear comment about this and I'll elaborate in the commit message too. was (Author: mackrorysd): So there's a checkstyle issue to fix (space before the '(') and a number of other minor whitespace issues like inconsistent spacing before '{'s.
Can you also revisit the exception messages? Currently they're not entirely correct. For example, I'd make the following changes: * "Cannot create a Path with names that end with a dot" -> "ABFS does not allow files or directories to end with a dot" * "Path does not have a trailing period." -> "Attempt to create file that ended with a dot should throw IllegalArgumentException" Also, it occurred to me that mkdirs is recursive. Someone could pass "/dir1/dir2./dir3/" when none of them exist and we wouldn't stop them. When playing around with these tests to see what happened in various cases I got a couple of exceptions for files already existing. There are 2 reasons for that: * It might be that one test run was hitting data from the previous run. Let's replace our use of "new Path()" with "path()". It's a helper function in the parent classes that gives you a path inside a randomized subdirectory to help prevent this. All tests should actually do this, but let's just fix the new ones for now. * Let's add a comment about the actual problem here. (a) Microsoft says you shouldn't do this, and (b) if you do it anyway, /this./path. and /this/path are treated as identical in some cases. This is potentially problematic if anyone messes with the tests since the 2 paths in each of your tests would be equal. So let's have a clear comment about this and I'll elaborate in the commit message too. > ABFS: Throw IllegalArgumentException when Directory/File name ends with a > period(.) > --- > > Key: HADOOP-15860 > URL: https://issues.apache.org/jira/browse/HADOOP-15860 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/adl >Affects Versions: 3.2.0 >Reporter: Sean Mackrory >Assignee: Shweta >Priority: Major > Attachments: HADOOP-15860.001.patch, HADOOP-15860.002.patch, > trailing-periods.patch > > > If you create a directory with a trailing period (e.g. '/test.') the period > is silently dropped, and will be listed as simply '/test'.
'/test.test' > appears to work just fine.
[jira] [Commented] (HADOOP-15860) ABFS: Throw IllegalArgumentException when Directory/File name ends with a period(.)
[ https://issues.apache.org/jira/browse/HADOOP-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725538#comment-16725538 ] Sean Mackrory commented on HADOOP-15860: So there's a checkstyle issue to fix (space before the '(') and a number of other minor whitespace issues like inconsistent spacing before '{'s. Can you also revisit the exception messages? Currently they're not entirely correct. For example, I'd make the following changes: * "Cannot create a Path with names that end with a dot" -> "ABFS does not allow files or directories to end with a dot" * "Path does not have a trailing period." -> "Attempt to create file that ended with a dot should throw IllegalArgumentException" Also, it occurred to me that mkdirs is recursive. Someone could pass "/dir1/dir2./dir3/" when none of them exist and we wouldn't stop them. When playing around with these tests to see what happened in various cases I got a couple of exceptions for files already existing. There are 2 reasons for that: * It might be that one test run was hitting data from the previous run. Let's replace our use of "new Path()" with "path()". It's a helper function in the parent classes that gives you a path inside a randomized subdirectory to help prevent this. All tests should actually do this, but let's just fix the new ones for now. * Let's add a comment about the actual problem here. (a) Microsoft says you shouldn't do this, and (b) if you do it anyway, /this./path. and /this/path are treated as identical in some cases. This is potentially problematic if anyone messes with the tests since the 2 paths in each of your tests would be equal. So let's have a clear comment about this and I'll elaborate in the commit message too. > ABFS: Throw IllegalArgumentException when Directory/File name ends with a > period(.)
> --- > > Key: HADOOP-15860 > URL: https://issues.apache.org/jira/browse/HADOOP-15860 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/adl >Affects Versions: 3.2.0 >Reporter: Sean Mackrory >Assignee: Shweta >Priority: Major > Attachments: HADOOP-15860.001.patch, HADOOP-15860.002.patch, > trailing-periods.patch > > > If you create a directory with a trailing period (e.g. '/test.') the period > is silently dropped, and will be listed as simply '/test'. '/test.test' > appears to work just fine.
[jira] [Commented] (HADOOP-15819) S3A integration test failures: FileSystem is closed! - without parallel test run
[ https://issues.apache.org/jira/browse/HADOOP-15819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724301#comment-16724301 ] Sean Mackrory commented on HADOOP-15819: The only failure I see with this patch is the bouncycastle one. I don't have an objection to removing this - it's only for a slight perf improvement on tests. I'm curious to know why this is resulting in a distinct cache entry. Is the URL slightly different at this point than it is in FS.newInstance() or something? > S3A integration test failures: FileSystem is closed! - without parallel test > run > > > Key: HADOOP-15819 > URL: https://issues.apache.org/jira/browse/HADOOP-15819 > Project: Hadoop Common > Issue Type: Bug > Components: fs/s3 >Affects Versions: 3.1.1 >Reporter: Gabor Bota >Assignee: Adam Antal >Priority: Critical > Attachments: HADOOP-15819.000.patch, HADOOP-15819.001.patch, > HADOOP-15819.002.patch, S3ACloseEnforcedFileSystem.java, > S3ACloseEnforcedFileSystem.java, closed_fs_closers_example_5klines.log.zip > > > Running the integration tests for hadoop-aws {{mvn -Dscale verify}} against > Amazon AWS S3 (eu-west-1, us-west-1, with no s3guard) we see a lot of these > failures: > {noformat} > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.408 > s <<< FAILURE! - in > org.apache.hadoop.fs.s3a.commit.staging.integration.ITDirectoryCommitMRJob > [ERROR] > testMRJob(org.apache.hadoop.fs.s3a.commit.staging.integration.ITDirectoryCommitMRJob) > Time elapsed: 0.027 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 4.345 > s <<< FAILURE! - in > org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJob > [ERROR] > testStagingDirectory(org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJob) > Time elapsed: 0.021 s <<< ERROR!
> java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] > testMRJob(org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJob) > Time elapsed: 0.022 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.489 > s <<< FAILURE! - in > org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJobBadDest > [ERROR] > testMRJob(org.apache.hadoop.fs.s3a.commit.staging.integration.ITStagingCommitMRJobBadDest) > Time elapsed: 0.023 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.695 > s <<< FAILURE! - in org.apache.hadoop.fs.s3a.commit.magic.ITMagicCommitMRJob > [ERROR] testMRJob(org.apache.hadoop.fs.s3a.commit.magic.ITMagicCommitMRJob) > Time elapsed: 0.039 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.015 > s <<< FAILURE! - in org.apache.hadoop.fs.s3a.commit.ITestS3ACommitterFactory > [ERROR] > testEverything(org.apache.hadoop.fs.s3a.commit.ITestS3ACommitterFactory) > Time elapsed: 0.014 s <<< ERROR! > java.io.IOException: s3a://cloudera-dev-gabor-ireland: FileSystem is closed! > {noformat} > The big issue is that the tests are running in a serial manner - no test is > running on top of the other - so we should not see that the tests are failing > like this. The issue could be in how we handle > org.apache.hadoop.fs.FileSystem#CACHE - the tests should use the same > S3AFileSystem so if A test uses a FileSystem and closes it in teardown then B > test will get the same FileSystem object from the cache and try to use it, > but it is closed. > We see this a lot in our downstream testing too. 
It's not possible to tell > whether the failed regression test result is an implementation issue in the > runtime code or a test implementation problem. > I've checked when and what closes the S3AFileSystem with a slightly modified > version of S3AFileSystem which logs the closers of the fs if an error should > occur. I'll attach this modified java file for reference. See the next > example of the result when it's running: > {noformat} > 2018-10-04 00:52:25,596 [Thread-4201] ERROR s3a.S3ACloseEnforcedFileSystem > (S3ACloseEnforcedFileSystem.java:checkIfClosed(74)) - Use after close(): > java.lang.RuntimeException: Using closed FS!. > at > org.apache.hadoop.fs.s3a.S3ACloseEnforcedFileSystem.checkIfClosed(S3ACloseEnforcedFileSystem.java:73) > at >
[jira] [Comment Edited] (HADOOP-15860) ABFS: Throw IllegalArgumentException when Directory/File name ends with a period(.)
[ https://issues.apache.org/jira/browse/HADOOP-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16721703#comment-16721703 ] Sean Mackrory edited comment on HADOOP-15860 at 12/14/18 7:30 PM: -- Also, instead of just catching the exceptions, we should ensure the test fails if the expected exception isn't thrown. e.g. the catch block changes a flag from false to true, then we assert that it's true. edit: Just to add to my previous comments, ITestAbfsFileSystemContractMkdir>AbstractContractMkdirTest.testMkdirSlashHandling fails with this patch. Make sure you run all the ABFS tests to check for regressions. I did some playing with these same APIs trying to make a file that ended with a slash. It does actually fail explicitly on the server side (instead of being silently dropped, as happens with trailing periods), and elsewhere a trailing slash is either blocked by other code or valid (as when specifying a destination directory). So I think we can simply ignore trailing slashes. was (Author: mackrorysd): Also, instead of just catching the exceptions, we should ensure the test fails if the expected exception isn't thrown. e.g. the catch block changes a flag from false to true, then we assert that it's true. > ABFS: Throw IllegalArgumentException when Directory/File name ends with a > period(.) > --- > > Key: HADOOP-15860 > URL: https://issues.apache.org/jira/browse/HADOOP-15860 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/adl >Affects Versions: 3.2.0 >Reporter: Sean Mackrory >Assignee: Shweta >Priority: Major > Attachments: HADOOP-15860.001.patch, trailing-periods.patch > > > If you create a directory with a trailing period (e.g. '/test.') the period > is silently dropped, and will be listed as simply '/test'. '/test.test' > appears to work just fine. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15860) ABFS: Throw IllegalArgumentException when Directory/File name ends with a period(.)
[ https://issues.apache.org/jira/browse/HADOOP-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16721703#comment-16721703 ] Sean Mackrory commented on HADOOP-15860: Also, instead of just catching the exceptions, we should ensure the test fails if the expected exception isn't thrown. e.g. the catch block changes a flag from false to true, then we assert that it's true. > ABFS: Throw IllegalArgumentException when Directory/File name ends with a > period(.) > --- > > Key: HADOOP-15860 > URL: https://issues.apache.org/jira/browse/HADOOP-15860 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/adl >Affects Versions: 3.2.0 >Reporter: Sean Mackrory >Assignee: Shweta >Priority: Major > Attachments: HADOOP-15860.001.patch, trailing-periods.patch > > > If you create a directory with a trailing period (e.g. '/test.') the period > is silently dropped, and will be listed as simply '/test'. '/test.test' > appears to work just fine. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
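The flag pattern suggested in this comment could look roughly like the sketch below. It is only an illustration in plain Java: `createPath` is a hypothetical stand-in for whatever ABFS call the real test exercises, and a real test would presumably use JUnit assertions rather than throwing `AssertionError` by hand.

```java
final class ExpectExceptionSketch {
    // Hypothetical stand-in for the operation under test; not an ABFS API.
    static void createPath(String name) {
        if (name.endsWith(".")) {
            throw new IllegalArgumentException("Name ends with a period: " + name);
        }
    }

    // Runs the action and reports whether it threw IllegalArgumentException.
    static boolean throwsIllegalArgument(Runnable action) {
        boolean exceptionThrown = false;
        try {
            action.run();
        } catch (IllegalArgumentException e) {
            // The catch block flips the flag instead of silently swallowing the exception.
            exceptionThrown = true;
        }
        return exceptionThrown;
    }

    public static void main(String[] args) {
        boolean thrown = throwsIllegalArgument(() -> createPath("/test."));
        // Asserting on the flag makes the test fail when no exception was thrown.
        if (!thrown) {
            throw new AssertionError("Expected IllegalArgumentException for '/test.'");
        }
        System.out.println("rejected as expected");
    }
}
```

Without the final check on the flag, a regression that stops throwing the exception would pass unnoticed, which is the concern raised above.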
[jira] [Commented] (HADOOP-15860) ABFS: Throw IllegalArgumentException when Directory/File name ends with a period(.)
[ https://issues.apache.org/jira/browse/HADOOP-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720899#comment-16720899 ] Sean Mackrory commented on HADOOP-15860: Thanks for the patch [~shwetayakkali]. For the most part it looks good, but I think we need to consider the trailing slash a little harder. A destination path absolutely can end in a trailing slash (e.g. mv /some-file /some-existing-directory/). On the other hand, I'm actually not sure how or if one can create a file that ends with a trailing slash. I suspect that the Path class already handles the trailing slash as much as we need it to, and we should just check for trailing periods. But we should dig and confirm that the edge cases here are working correctly. > ABFS: Throw IllegalArgumentException when Directory/File name ends with a > period(.) > --- > > Key: HADOOP-15860 > URL: https://issues.apache.org/jira/browse/HADOOP-15860 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/adl >Affects Versions: 3.2.0 >Reporter: Sean Mackrory >Assignee: Shweta >Priority: Major > Attachments: HADOOP-15860.001.patch, trailing-periods.patch > > > If you create a directory with a trailing period (e.g. '/test.') the period > is silently dropped, and will be listed as simply '/test'. '/test.test' > appears to work just fine. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
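One way to check only for trailing periods while tolerating a trailing slash on the path is sketched below. This is not the patch under review: `checkNoTrailingPeriod` is a hypothetical helper operating on plain strings, and it assumes (as suspected above, but not confirmed) that trailing slashes can simply be skipped before looking at the last character.

```java
final class TrailingPeriodCheckSketch {
    // Hypothetical helper: throws if the last path component ends with '.'.
    // Relative names such as "." and ".." are not special-cased in this sketch.
    static void checkNoTrailingPeriod(String path) {
        int end = path.length();
        // Skip trailing slashes (but keep the root "/") so "/dir/" is treated like "/dir".
        while (end > 1 && path.charAt(end - 1) == '/') {
            end--;
        }
        if (end > 0 && path.charAt(end - 1) == '.') {
            throw new IllegalArgumentException(
                "Directory/File name cannot end with a period: " + path);
        }
    }

    public static void main(String[] args) {
        checkNoTrailingPeriod("/some-existing-directory/"); // accepted: trailing slash only
        checkNoTrailingPeriod("/test.test");                // accepted: period is not trailing
        try {
            checkNoTrailingPeriod("/test./");
            throw new AssertionError("expected rejection");
        } catch (IllegalArgumentException expected) {
            System.out.println("rejected: " + expected.getMessage());
        }
    }
}
```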
[jira] [Comment Edited] (HADOOP-15999) [s3a] Better support for out-of-band operations
[ https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719468#comment-16719468 ] Sean Mackrory edited comment on HADOOP-15999 at 12/12/18 9:35 PM: -- {quote}We might just want to make that configurable (separate config knob probably). If we are in "check both MS and S3" mode, we probably want a configurable or pluggable conflict policy.{quote} Yeah - I also considered addressing the out-of-band deletes problem with a config (or 2) that governs whether we create and / or honor tombstones. But that's adding exposed complexity and isn't very elegant. If we can relatively easily just start comparing modification times, then we can fix all these use cases and offer 2 basic modes: - S3Guard with authoritative mode, in which the MetadataStore is the source of truth and we can assume All The Things. - S3Guard without authoritative mode, in which S3 is the source of truth. We will always be at least as up to date as S3 appears, and will fix list consistency as long as S3 doesn't give us evidence to the contrary (i.e. older modification times or the lack of an update entirely). I feel very uncomfortable with the idea of some middle ground where S3Guard can't be the source of truth, but we're still trying to be in some cases. It either has all the context or it doesn't, and if it doesn't we're trading in correctness for some performance, which I think is the wrong trade-off. was (Author: mackrorysd): {quote}We might just want to make that configurable (separate config knob probably). If we are in "check both MS and S3" mode, we probably want a configurable or pluggable conflict policy.{quote} Yeah - I also considered addressing the out-of-band deletes problem with a config (or 2) that governs whether we create and / or honor tombstones. But that's adding exposed complexity and isn't very elegant. 
If we can relatively easily just start comparing modification times, then we can offer 2 basic modes: - S3Guard with authoritative mode, in which the MetadataStore is the source of truth and we can assume All The Things. - S3Guard without authoritative mode, in which S3 is the source of truth. We will always be at least as up to date as S3 appears, and will fix list consistency as long as S3 doesn't give us evidence to the contrary (i.e. older modification times or the lack of an update entirely). > [s3a] Better support for out-of-band operations > --- > > Key: HADOOP-15999 > URL: https://issues.apache.org/jira/browse/HADOOP-15999 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.1.0 >Reporter: Sean Mackrory >Assignee: Gabor Bota >Priority: Major > Attachments: out-of-band-operations.patch > > > S3Guard was initially done on the premise that a new MetadataStore would be > the source of truth, and that it wouldn't provide guarantees if updates were > done without using S3Guard. > I've been seeing increased demand for better support for scenarios where > operations are done on the data that can't reasonably be done with S3Guard > involved. For example: > * A file is deleted using S3Guard, and replaced by some other tool. S3Guard > can't tell the difference between the new file and delete / list > inconsistency and continues to treat the file as deleted. > * An S3Guard-ed file is overwritten by a longer file by some other tool. When > reading the file, only the length of the original file is read. > We could possibly have smarter behavior here by querying both S3 and the > MetadataStore (even in cases where we may currently only query the > MetadataStore in getFileStatus) and use whichever one has the higher modified > time. > This kills the performance boost we currently get in some workloads with the > short-circuited getFileStatus, but we could keep it with authoritative mode > which should give a larger performance boost. 
At least we'd get more > correctness without authoritative mode and a clear declaration of when we can > make the assumptions required to short-circuit the process. If we can't > consider S3Guard the source of truth, we need to defer to S3 more. > We'd need to be extra sure of any locality / time zone issues if we start > relying on mod_time more directly, but currently we're tracking the > modification time as returned by S3 anyway. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
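The "use whichever one has the higher modified time" idea, combined with the two modes described in the comment, might be sketched as follows. This is an assumption-laden illustration, not S3A code: `Status` and `resolve` are hypothetical names, tombstones are left out, and a tie on modification time arbitrarily keeps the MetadataStore entry.

```java
import java.util.Optional;

final class NewerStatusSketch {
    // Hypothetical minimal file status: just a length and a modification time.
    static final class Status {
        final long length;
        final long modTime;

        Status(long length, long modTime) {
            this.length = length;
            this.modTime = modTime;
        }
    }

    // Picks the status to trust given what the MetadataStore and S3 each returned.
    static Optional<Status> resolve(Optional<Status> fromStore,
                                    Optional<Status> fromS3,
                                    boolean authoritative) {
        if (authoritative) {
            // Authoritative mode: the MetadataStore is the source of truth (short-circuit).
            return fromStore;
        }
        if (!fromStore.isPresent()) {
            return fromS3;
        }
        if (!fromS3.isPresent()) {
            // S3 shows nothing yet; keep the entry we know about.
            return fromStore;
        }
        // Both present: a strictly newer S3 modification time is evidence of an out-of-band change.
        return fromS3.get().modTime > fromStore.get().modTime ? fromS3 : fromStore;
    }

    public static void main(String[] args) {
        Optional<Status> store = Optional.of(new Status(10, 100)); // original file
        Optional<Status> s3 = Optional.of(new Status(20, 200));    // longer file written by another tool
        System.out.println(resolve(store, s3, false).get().length); // prints 20
        System.out.println(resolve(store, s3, true).get().length);  // prints 10
    }
}
```

In the overwritten-file scenario from the description, the non-authoritative path returns the longer, newer file, while authoritative mode keeps the short-circuit at the cost of trusting the MetadataStore entirely.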
[jira] [Commented] (HADOOP-15999) [s3a] Better support for out-of-band operations
[ https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719468#comment-16719468 ] Sean Mackrory commented on HADOOP-15999: {quote}We might just want to make that configurable (separate config knob probably). If we are in "check both MS and S3" mode, we probably want a configurable or pluggable conflict policy.{quote} Yeah - I also considered addressing the out-of-band deletes problem with a config (or 2) that governs whether we create and / or honor tombstones. But that's adding exposed complexity and isn't very elegant. If we can relatively easily just start comparing modification times, then we can offer 2 basic modes: - S3Guard with authoritative mode, in which the MetadataStore is the source of truth and we can assume All The Things. - S3Guard without authoritative mode, in which S3 is the source of truth. We will always be at least as up to date as S3 appears, and will fix list consistency as long as S3 doesn't give us evidence to the contrary (i.e. older modification times or the lack of an update entirely). > [s3a] Better support for out-of-band operations > --- > > Key: HADOOP-15999 > URL: https://issues.apache.org/jira/browse/HADOOP-15999 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.1.0 >Reporter: Sean Mackrory >Assignee: Gabor Bota >Priority: Major > Attachments: out-of-band-operations.patch > > > S3Guard was initially done on the premise that a new MetadataStore would be > the source of truth, and that it wouldn't provide guarantees if updates were > done without using S3Guard. > I've been seeing increased demand for better support for scenarios where > operations are done on the data that can't reasonably be done with S3Guard > involved. For example: > * A file is deleted using S3Guard, and replaced by some other tool. 
S3Guard > can't tell the difference between the new file and delete / list > inconsistency and continues to treat the file as deleted. > * An S3Guard-ed file is overwritten by a longer file by some other tool. When > reading the file, only the length of the original file is read. > We could possibly have smarter behavior here by querying both S3 and the > MetadataStore (even in cases where we may currently only query the > MetadataStore in getFileStatus) and use whichever one has the higher modified > time. > This kills the performance boost we currently get in some workloads with the > short-circuited getFileStatus, but we could keep it with authoritative mode > which should give a larger performance boost. At least we'd get more > correctness without authoritative mode and a clear declaration of when we can > make the assumptions required to short-circuit the process. If we can't > consider S3Guard the source of truth, we need to defer to S3 more. > We'd need to be extra sure of any locality / time zone issues if we start > relying on mod_time more directly, but currently we're tracking the > modification time as returned by S3 anyway. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15988) Should be able to set empty directory flag to TRUE in DynamoDBMetadataStore#innerGet when using authoritative directory listings
[ https://issues.apache.org/jira/browse/HADOOP-15988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HADOOP-15988: --- Resolution: Fixed Fix Version/s: 3.3.0 Status: Resolved (was: Patch Available) Looks good to me. I did remove the iff -> if change you made between .001 and .002, as I believe "iff" is intentional (if-and-only-if). > Should be able to set empty directory flag to TRUE in > DynamoDBMetadataStore#innerGet when using authoritative directory listings > > > Key: HADOOP-15988 > URL: https://issues.apache.org/jira/browse/HADOOP-15988 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.1.0 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Major > Fix For: 3.3.0 > > Attachments: HADOOP-15988.001.patch, HADOOP-15988.002.patch > > > We have the following comment and implementation in DynamoDBMetadataStore: > {noformat} > // When this class has support for authoritative > // (fully-cached) directory listings, we may also be able to answer > // TRUE here. Until then, we don't know if we have full listing or > // not, thus the UNKNOWN here: > meta.setIsEmptyDirectory( > hasChildren ? Tristate.FALSE : Tristate.UNKNOWN); > {noformat} > We have authoritative listings now in dynamo since HADOOP-15621, so we should > resolve this comment, implement the solution and test it. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
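The change the description asks for amounts to a three-way decision, sketched below. This is a guess at the logic, not the committed patch: the local `Tristate` enum only mirrors the names quoted above, and `authoritativeListing` is an assumed parameter standing for "the stored directory listing is marked authoritative".

```java
final class EmptyDirectoryFlagSketch {
    // Local stand-in mirroring the Tristate values quoted in the issue.
    enum Tristate { TRUE, FALSE, UNKNOWN }

    // Decides the empty-directory flag for a directory entry.
    static Tristate isEmptyDirectory(boolean hasChildren, boolean authoritativeListing) {
        if (hasChildren) {
            // At least one child is known, so the directory is definitely not empty.
            return Tristate.FALSE;
        }
        // No known children: only a full (authoritative) listing lets us answer TRUE.
        return authoritativeListing ? Tristate.TRUE : Tristate.UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(isEmptyDirectory(true, true));   // FALSE
        System.out.println(isEmptyDirectory(false, true));  // TRUE
        System.out.println(isEmptyDirectory(false, false)); // UNKNOWN
    }
}
```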
[jira] [Commented] (HADOOP-15428) s3guard bucket-info will create s3guard table if FS is set to do this automatically
[ https://issues.apache.org/jira/browse/HADOOP-15428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719117#comment-16719117 ] Sean Mackrory commented on HADOOP-15428: Yeah that's more direct. Tweaked it a bit and updated. I'm happy with it, so I'll resolve. Feel free to discuss if there's more feedback... > s3guard bucket-info will create s3guard table if FS is set to do this > automatically > --- > > Key: HADOOP-15428 > URL: https://issues.apache.org/jira/browse/HADOOP-15428 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.1.0 >Reporter: Steve Loughran >Assignee: Gabor Bota >Priority: Major > Fix For: 3.3.0 > > Attachments: HADOOP-15428.001.patch > > > If you call hadoop s3guard bucket-info on a bucket where the fs is set to > create a s3guard table on demand, then the DDB table is automatically > created. As a result > the {{bucket-info -unguarded}} option cannot be used, and the call has > significant side effects (i.e. it can run up bills) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15428) s3guard bucket-info will create s3guard table if FS is set to do this automatically
[ https://issues.apache.org/jira/browse/HADOOP-15428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HADOOP-15428: --- Resolution: Fixed Status: Resolved (was: Patch Available) > s3guard bucket-info will create s3guard table if FS is set to do this > automatically > --- > > Key: HADOOP-15428 > URL: https://issues.apache.org/jira/browse/HADOOP-15428 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.1.0 >Reporter: Steve Loughran >Assignee: Gabor Bota >Priority: Major > Fix For: 3.3.0 > > Attachments: HADOOP-15428.001.patch > > > If you call hadoop s3guard bucket-info on a bucket where the fs is set to > create a s3guard table on demand, then the DDB table is automatically > created. As a result > the {{bucket-info -unguarded}} option cannot be used, and the call has > significant side effects (i.e. it can run up bills) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15428) s3guard bucket-info will create s3guard table if FS is set to do this automatically
[ https://issues.apache.org/jira/browse/HADOOP-15428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HADOOP-15428: --- Release Note: If -unguarded flag is passed to `hadoop s3guard bucket-info`, it will now proceed with S3Guard disabled instead of failing if S3Guard is not already disabled. (was: The -unguarded flag, passed to `hadoop s3guard bucket-info` will no proceed with S3Guard disabled instead of failing if S3Guard is not already disabled.) > s3guard bucket-info will create s3guard table if FS is set to do this > automatically > --- > > Key: HADOOP-15428 > URL: https://issues.apache.org/jira/browse/HADOOP-15428 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.1.0 >Reporter: Steve Loughran >Assignee: Gabor Bota >Priority: Major > Fix For: 3.3.0 > > Attachments: HADOOP-15428.001.patch > > > If you call hadoop s3guard bucket-info on a bucket where the fs is set to > create a s3guard table on demand, then the DDB table is automatically > created. As a result > the {{bucket-info -unguarded}} option cannot be used, and the call has > significant side effects (i.e. it can run up bills) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15999) [s3a] Better support for out-of-band operations
[ https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HADOOP-15999: --- Attachment: out-of-band-operations.patch > [s3a] Better support for out-of-band operations > --- > > Key: HADOOP-15999 > URL: https://issues.apache.org/jira/browse/HADOOP-15999 > Project: Hadoop Common > Issue Type: New Feature >Reporter: Sean Mackrory >Priority: Major > Attachments: out-of-band-operations.patch > > > S3Guard was initially done on the premise that a new MetadataStore would be > the source of truth, and that it wouldn't provide guarantees if updates were > done without using S3Guard. > I've been seeing increased demand for better support for scenarios where > operations are done on the data that can't reasonably be done with S3Guard > involved. For example: > * A file is deleted using S3Guard, and replaced by some other tool. S3Guard > can't tell the difference between the new file and delete / list > inconsistency and continues to treat the file as deleted. > * An S3Guard-ed file is overwritten by a longer file by some other tool. When > reading the file, only the length of the original file is read. > We could possibly have smarter behavior here by querying both S3 and the > MetadataStore (even in cases where we may currently only query the > MetadataStore in getFileStatus) and use whichever one has the higher modified > time. > This kills the performance boost we currently get in some workloads with the > short-circuited getFileStatus, but we could keep it with authoritative mode > which should give a larger performance boost. At least we'd get more > correctness without authoritative mode and a clear declaration of when we can > make the assumptions required to short-circuit the process. If we can't > consider S3Guard the source of truth, we need to defer to S3 more. 
> We'd need to be extra sure of any locality / time zone issues if we start > relying on mod_time more directly, but currently we're tracking the > modification time as returned by S3 anyway. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Resolved] (HADOOP-15841) ABFS: change createRemoteFileSystemDuringInitialization default to true
[ https://issues.apache.org/jira/browse/HADOOP-15841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory resolved HADOOP-15841. Resolution: Won't Fix > ABFS: change createRemoteFileSystemDuringInitialization default to true > --- > > Key: HADOOP-15841 > URL: https://issues.apache.org/jira/browse/HADOOP-15841 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Sean Mackrory >Assignee: Sean Mackrory >Priority: Major > > I haven't seen a way to create a working container (at least for the dfs > endpoint) except for setting > fs.azure.createRemoteFileSystemDuringInitialization=true. I personally don't > see that much of a downside to having it default to true, and it's a mild > inconvenience to remember to set it to true for some action to create a > container. I vaguely recall [~tmarquardt] considering changing this default > too. > I propose we do it? -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15999) [s3a] Better support for out-of-band operations
[ https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16718241#comment-16718241 ] Sean Mackrory commented on HADOOP-15999: Attaching a test that illustrates some of the nuances of the issue when overwriting files. The test of deleted files doesn't fail with S3Guard, which surprises me, as I know this same basic scenario has caused some problems. Need to dig deeper into why this doesn't reproduce that behavior at some point... > [s3a] Better support for out-of-band operations > --- > > Key: HADOOP-15999 > URL: https://issues.apache.org/jira/browse/HADOOP-15999 > Project: Hadoop Common > Issue Type: New Feature >Reporter: Sean Mackrory >Priority: Major > Attachments: out-of-band-operations.patch > > > S3Guard was initially done on the premise that a new MetadataStore would be > the source of truth, and that it wouldn't provide guarantees if updates were > done without using S3Guard. > I've been seeing increased demand for better support for scenarios where > operations are done on the data that can't reasonably be done with S3Guard > involved. For example: > * A file is deleted using S3Guard, and replaced by some other tool. S3Guard > can't tell the difference between the new file and delete / list > inconsistency and continues to treat the file as deleted. > * An S3Guard-ed file is overwritten by a longer file by some other tool. When > reading the file, only the length of the original file is read. > We could possibly have smarter behavior here by querying both S3 and the > MetadataStore (even in cases where we may currently only query the > MetadataStore in getFileStatus) and use whichever one has the higher modified > time. > This kills the performance boost we currently get in some workloads with the > short-circuited getFileStatus, but we could keep it with authoritative mode > which should give a larger performance boost. 
At least we'd get more > correctness without authoritative mode and a clear declaration of when we can > make the assumptions required to short-circuit the process. If we can't > consider S3Guard the source of truth, we need to defer to S3 more. > We'd need to be extra sure of any locality / time zone issues if we start > relying on mod_time more directly, but currently we're tracking the > modification time as returned by S3 anyway. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Created] (HADOOP-15999) [s3a] Better support for out-of-band operations
Sean Mackrory created HADOOP-15999: -- Summary: [s3a] Better support for out-of-band operations Key: HADOOP-15999 URL: https://issues.apache.org/jira/browse/HADOOP-15999 Project: Hadoop Common Issue Type: New Feature Reporter: Sean Mackrory S3Guard was initially done on the premise that a new MetadataStore would be the source of truth, and that it wouldn't provide guarantees if updates were done without using S3Guard. I've been seeing increased demand for better support for scenarios where operations are done on the data that can't reasonably be done with S3Guard involved. For example: * A file is deleted using S3Guard, and replaced by some other tool. S3Guard can't tell the difference between the new file and delete / list inconsistency and continues to treat the file as deleted. * An S3Guard-ed file is overwritten by a longer file by some other tool. When reading the file, only the length of the original file is read. We could possibly have smarter behavior here by querying both S3 and the MetadataStore (even in cases where we may currently only query the MetadataStore in getFileStatus) and use whichever one has the higher modified time. This kills the performance boost we currently get in some workloads with the short-circuited getFileStatus, but we could keep it with authoritative mode which should give a larger performance boost. At least we'd get more correctness without authoritative mode and a clear declaration of when we can make the assumptions required to short-circuit the process. If we can't consider S3Guard the source of truth, we need to defer to S3 more. We'd need to be extra sure of any locality / time zone issues if we start relying on mod_time more directly, but currently we're tracking the modification time as returned by S3 anyway. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15845) s3guard init and destroy command will create/destroy tables if ddb.table & region are set
[ https://issues.apache.org/jira/browse/HADOOP-15845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HADOOP-15845: --- Resolution: Fixed Status: Resolved (was: Patch Available) > s3guard init and destroy command will create/destroy tables if ddb.table & > region are set > - > > Key: HADOOP-15845 > URL: https://issues.apache.org/jira/browse/HADOOP-15845 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.1.1 >Reporter: Steve Loughran >Assignee: Gabor Bota >Priority: Major > Attachments: HADOOP-15845.001.patch > > > If you have s3guard set up with a table name and a region, then s3guard init > will automatically create the table, without you specifying a bucket or URI. > I had expected the command just to print out its arguments, but it actually > did the init with the default bucket values > Even worse, `hadoop s3guard destroy` will destroy the table. > This is too dangerous to allow. The command must require either the name of a > bucket or an explicit ddb table URI -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15987) ITestDynamoDBMetadataStore should check if test ddb table set properly before initializing the test
[ https://issues.apache.org/jira/browse/HADOOP-15987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HADOOP-15987: --- Resolution: Fixed Status: Resolved (was: Patch Available) Committed. Not worth obsessing too much over the wording of an exception that only occurs in tests. The update to the documentation is sufficiently clear. > ITestDynamoDBMetadataStore should check if test ddb table set properly before > initializing the test > --- > > Key: HADOOP-15987 > URL: https://issues.apache.org/jira/browse/HADOOP-15987 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.1.0 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Major > Attachments: HADOOP-15987.001.patch > > > The jira covers the following: > * We should assert that the table name is configured when > DynamoDBMetadataStore is used for testing, so the test should fail if it's > not configured. > * We should assert that the test table is not the same as the production > table, as the test table could be modified and destroyed multiple times > during the test. > * This behavior should be added to the testing docs. > [Assume from junit > doc|http://junit.sourceforge.net/javadoc/org/junit/Assume.html]: > {noformat} > A set of methods useful for stating assumptions about the conditions in which > a test is meaningful. > A failed assumption does not mean the code is broken, but that the test > provides no useful information. > The default JUnit runner treats tests with failing assumptions as ignored. > {noformat} > A failed assert will cause test failure instead of just skipping the test. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
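The difference between an assumption and an assertion that this issue relies on can be simulated without JUnit, as in the sketch below. Everything here is a stand-in: `SkippedException` plays the role of JUnit's assumption-violation exception, `run` imitates how a runner classifies outcomes, and `checkTestTable` is a hypothetical version of the two checks listed above, not the code that was committed.

```java
final class AssumeVersusAssertSketch {
    // Stand-in for the exception a failed JUnit assumption raises.
    static final class SkippedException extends RuntimeException {
        SkippedException(String message) {
            super(message);
        }
    }

    static void assumeTrue(String message, boolean condition) {
        if (!condition) {
            throw new SkippedException(message);
        }
    }

    static void assertTrue(String message, boolean condition) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    // Imitates a test runner: failed assumptions are skipped, failed assertions fail.
    static String run(Runnable test) {
        try {
            test.run();
            return "PASSED";
        } catch (SkippedException e) {
            return "SKIPPED";
        } catch (AssertionError e) {
            return "FAILED";
        }
    }

    // Hypothetical setup check: table name must be set and must differ from production.
    static void checkTestTable(String testTable, String productionTable) {
        assertTrue("Test DynamoDB table name must be configured",
            testTable != null && !testTable.isEmpty());
        assertTrue("Test table must not be the production table",
            !testTable.equals(productionTable));
    }

    public static void main(String[] args) {
        System.out.println(run(() -> checkTestTable(null, "prod-table")));         // FAILED
        System.out.println(run(() -> checkTestTable("prod-table", "prod-table"))); // FAILED
        System.out.println(run(() -> checkTestTable("test-table", "prod-table"))); // PASSED
        System.out.println(run(() -> assumeTrue("table configured", false)));      // SKIPPED
    }
}
```

Using assertions for the table checks means a misconfigured run is reported as a failure rather than quietly skipped, which is the behavior the issue asks for.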
[jira] [Updated] (HADOOP-15845) s3guard init and destroy command will create/destroy tables if ddb.table & region are set
[ https://issues.apache.org/jira/browse/HADOOP-15845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HADOOP-15845: --- Release Note: `hadoop s3guard destroy` and `hadoop s3guard init` now require a MetadataStore URI or an S3A URI to be explicitly provided on the command line to perform the action. +1 - looks good to me and tests pass. > s3guard init and destroy command will create/destroy tables if ddb.table & > region are set > - > > Key: HADOOP-15845 > URL: https://issues.apache.org/jira/browse/HADOOP-15845 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.1.1 >Reporter: Steve Loughran >Assignee: Gabor Bota >Priority: Major > Attachments: HADOOP-15845.001.patch > > > If you have s3guard set up with a table name and a region, then s3guard init > will automatically create the table, without you specifying a bucket or URI. > I had expected the command just to print out its arguments, but it actually > did the init with the default bucket values. > Even worse, `hadoop s3guard destroy` will destroy the table. > This is too dangerous to allow. The command must require either the name of a > bucket or an explicit ddb table URI.
[jira] [Commented] (HADOOP-15988) Should be able to set empty directory flag to TRUE in DynamoDBMetadataStore#innerGet when using authoritative directory listings
[ https://issues.apache.org/jira/browse/HADOOP-15988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715805#comment-16715805 ] Sean Mackrory commented on HADOOP-15988: Good stuff. A few nits: * Can you clean up the checkstyle issues that were raised? * Most of the code between the 2 tests is shared. Can we refactor that into a single test that just tests the same sequence with a different auth value and outcome? If that turns out to be messy for some reason it's not a deal breaker, but worth a couple of minutes if that's all it takes. +1 otherwise. > Should be able to set empty directory flag to TRUE in > DynamoDBMetadataStore#innerGet when using authoritative directory listings > > > Key: HADOOP-15988 > URL: https://issues.apache.org/jira/browse/HADOOP-15988 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.1.0 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Major > Attachments: HADOOP-15988.001.patch > > > We have the following comment and implementation in DynamoDBMetadataStore: > {noformat} > // When this class has support for authoritative > // (fully-cached) directory listings, we may also be able to answer > // TRUE here. Until then, we don't know if we have full listing or > // not, thus the UNKNOWN here: > meta.setIsEmptyDirectory( > hasChildren ? Tristate.FALSE : Tristate.UNKNOWN); > {noformat} > We have authoritative listings now in dynamo since HADOOP-15621, so we should > resolve this comment, implement the solution and test it.
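One way to read the change proposed in the issue, as a minimal sketch: when the listing is authoritative and no children were found, the directory can be reported as empty; otherwise the answer stays UNKNOWN. The `Tristate` enum below is a local stand-in for Hadoop's class of the same name, and `emptyDirectoryFlag` and its `authoritative` parameter are hypothetical, so the actual patch may be structured differently.

```java
/** Hypothetical sketch of the empty-directory decision; not the actual patch. */
final class EmptyDirectoryFlag {

  /** Local stand-in for the Tristate type referenced in the quoted code. */
  enum Tristate { TRUE, FALSE, UNKNOWN }

  private EmptyDirectoryFlag() {
  }

  /**
   * @param hasChildren   whether the metadata store returned any child entries
   * @param authoritative whether the stored listing is known to be complete
   */
  static Tristate emptyDirectoryFlag(boolean hasChildren, boolean authoritative) {
    if (hasChildren) {
      // At least one child is known, so the directory cannot be empty.
      return Tristate.FALSE;
    }
    // No children seen: only a complete (authoritative) listing proves emptiness.
    return authoritative ? Tristate.TRUE : Tristate.UNKNOWN;
  }

  public static void main(String[] args) {
    System.out.println(emptyDirectoryFlag(false, true));
    System.out.println(emptyDirectoryFlag(false, false));
    System.out.println(emptyDirectoryFlag(true, false));
  }
}
```

Written this way, the two test cases mentioned in the review comment differ only in the `authoritative` argument and the expected result, which is what makes a single parameterised test plausible.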