[jira] [Commented] (HADOOP-15696) KMS performance regression due to too many open file descriptors after Jetty migration

2018-08-29 Thread Xiao Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597076#comment-16597076
 ] 

Xiao Chen commented on HADOOP-15696:


Thanks for the work here [~jojochuang] and [~mi...@cloudera.com] - good find 
and discussion!

Everything you said makes sense. I'm not entirely sure, but I think that 
originally, with Tomcat, this way of setting up connections didn't cause any 
issues, probably because the connections were pooled by Tomcat. This has proven 
to be problematic in Jetty, and I'm glad dropping the idle timeout helped the 
situation.

How about we do this jira first to make the timeout configurable, and continue 
the connection-reuse improvement as a follow-on? A tricky part of the 
connection reuse is that (at least on current trunk) the connection object is 
created in relation to the caller ugi, so one would need to make sure the reuse 
doesn't expose any security concerns. From a use-case perspective, I agree the 
connection should be reused - and if not, it should be proactively closed by 
the client.

Minor code review comments on patch 1:
- Instead of calling it {{HTTP_IDLE_TIMEOUT}}, we could call it 
{{HTTP_IDLE_TIMEOUT_MS}} for clarity about the millisecond unit (see the 
sketch below).
- It seems httpfs-default could also use such an update.
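
To make the suggestion concrete, a minimal sketch of how such a key might be 
read and applied to Jetty inside HttpServer2. The key name and default below 
follow this review's naming suggestion and are assumptions, not the committed 
constants:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.ServerConnector;

public class IdleTimeoutSketch {
  // Assumed names: the real HttpServer2 constants may differ.
  public static final String HTTP_IDLE_TIMEOUT_MS = "hadoop.http.idle.timeout.ms";
  public static final long HTTP_IDLE_TIMEOUT_MS_DEFAULT = 10000L; // today's hard-coded value

  static void applyIdleTimeout(Configuration conf, Server server) {
    long idleTimeoutMs =
        conf.getLong(HTTP_IDLE_TIMEOUT_MS, HTTP_IDLE_TIMEOUT_MS_DEFAULT);
    ServerConnector connector = new ServerConnector(server);
    // Jetty closes connections idle longer than this many milliseconds,
    // releasing their file descriptors.
    connector.setIdleTimeout(idleTimeoutMs);
    server.addConnector(connector);
  }
}
{code}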

> KMS performance regression due to too many open file descriptors after Jetty 
> migration
> --
>
> Key: HADOOP-15696
> URL: https://issues.apache.org/jira/browse/HADOOP-15696
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: kms
>Affects Versions: 3.0.0-alpha2
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Blocker
> Attachments: HADOOP-15696.001.patch, Screen Shot 2018-08-22 at 
> 11.36.16 AM.png, Screen Shot 2018-08-22 at 4.26.51 PM.png, Screen Shot 
> 2018-08-22 at 4.26.51 PM.png, Screen Shot 2018-08-22 at 4.27.02 PM.png, 
> Screen Shot 2018-08-22 at 4.30.32 PM.png, Screen Shot 2018-08-22 at 4.30.39 
> PM.png, Screen Shot 2018-08-24 at 7.08.16 PM.png
>
>
> We recently found KMS performance regressed in Hadoop 3.0, possibly linked 
> to the migration from Tomcat to Jetty in HADOOP-13597.
> Symptoms:
> # Hadoop 3.x KMS open file descriptors quickly rise to more than 10 thousand 
> under stress, sometimes even exceeding 32K, which is the system limit, causing 
> failures for any access to encryption zones. Our internal testing shows the 
> open fd count was in the range of a few hundred in Hadoop 2.x, and it 
> increases by almost 100x in Hadoop 3.
> # Hadoop 3.x KMS uses as much as twice the heap size of Hadoop 2.x; the same 
> heap size can go OOM in Hadoop 3.x. JXray analysis suggests most of the heap 
> consists of temporary byte arrays associated with open SSL connections.
> # Due to the heap usage, Hadoop 3.x KMS has more frequent GC activity, and 
> we observed up to 20% performance reduction due to GC.
> A possible solution is to reduce the idle timeout setting in HttpServer2. It 
> is currently hard-coded to 10 seconds. By setting it to 1 second, open fds 
> dropped from 20 thousand down to 3 thousand in my experiment.
> Filing this jira to invite open discussion of a solution.
> Credit: [~mi...@cloudera.com] for the proposed Jetty idle timeout remedy; 
> [~xiaochen] for digging into this problem.
> Screenshots:
> CDH5 (Hadoop 2) KMS CPU utilization, resident memory and file descriptor 
> chart.
>  !Screen Shot 2018-08-22 at 4.30.39 PM.png! 
> CDH6 (Hadoop 3) KMS CPU utilization, resident memory and file descriptor 
> chart.
>  !Screen Shot 2018-08-22 at 4.30.32 PM.png! 
> CDH5 (Hadoop 2) GC activities on the KMS process
>  !Screen Shot 2018-08-22 at 4.26.51 PM.png! 
> CDH6 (Hadoop 3) GC activities on the KMS process
>  !Screen Shot 2018-08-22 at 4.27.02 PM.png! 
> JXray report
>  !Screen Shot 2018-08-22 at 11.36.16 AM.png! 
> Open fd count drops from 20k down to 3k after the proposed change.
>  !Screen Shot 2018-08-24 at 7.08.16 PM.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-15698) KMS log4j is not initialized properly at startup

2018-08-29 Thread Xiao Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597067#comment-16597067
 ] 

Xiao Chen edited comment on HADOOP-15698 at 8/30/18 5:11 AM:
-

Committed to trunk as well as branch-3.[0-1] (since I believe this is a 
regression from HADOOP-13597).
Thanks Kitti for the nice fix, and Ajay / Daniel for the review comments!


was (Author: xiaochen):
Committed to trunk as well as branch-3.[0-1] (since I believe this is a 
regression from HADOOP-13597).
Thanks Kitti for the nice fix, and Ajay / Daniel for the comments!

> KMS log4j is not initialized properly at startup
> 
>
> Key: HADOOP-15698
> URL: https://issues.apache.org/jira/browse/HADOOP-15698
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: kms
>Affects Versions: 3.1.0
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Fix For: 3.2.0, 3.0.4, 3.1.2
>
> Attachments: HADOOP-15698.001.patch
>
>
> During KMS startup, log4j logs don't show up, resulting in important logs 
> getting omitted. This happens because log4j initialization only happens in 
> KMSWebApp#contextInitialized, and logs written before that don't show up.
> For example, the following log never shows up:
> [https://github.com/apache/hadoop/blob/a55d6bba71c81c1c4e9d8cd11f55c78f10a548b0/hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/util/ZKSignerSecretProvider.java#L197-L199]
> Another example is that the KMS startup message never shows up in the KMS 
> logs.
> Note that this works in the unit tests, because MiniKMS sets the log4j system 
> property.
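
For context, a minimal sketch of the MiniKMS-style workaround the last 
paragraph refers to, assuming log4j 1.x and an illustrative properties file 
name:

{code:java}
public class KmsLog4jBootstrap {
  // Hedged sketch: set the standard log4j 1.x system property before any
  // logger is created, so startup messages are not silently dropped.
  public static void ensureLog4jConfigured() {
    if (System.getProperty("log4j.configuration") == null) {
      System.setProperty("log4j.configuration", "kms-log4j.properties");
    }
  }
}
{code}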



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15698) KMS log4j is not initialized properly at startup

2018-08-29 Thread Xiao Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HADOOP-15698:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.1.2
   3.0.4
   3.2.0
   Status: Resolved  (was: Patch Available)

Committed to trunk as well as branch-3.[0-1] (since I believe this is a 
regression from HADOOP-13597).
Thanks Kitti for the nice fix, and Ajay / Daniel for the comments!

> KMS log4j is not initialized properly at startup
> 
>
> Key: HADOOP-15698
> URL: https://issues.apache.org/jira/browse/HADOOP-15698
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: kms
>Affects Versions: 3.1.0
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Fix For: 3.2.0, 3.0.4, 3.1.2
>
> Attachments: HADOOP-15698.001.patch
>
>
> During KMS startup, log4j logs don't show up, resulting in important logs 
> getting omitted. This happens because log4j initialization only happens in 
> KMSWebApp#contextInitialized, and logs written before that don't show up.
> For example, the following log never shows up:
> [https://github.com/apache/hadoop/blob/a55d6bba71c81c1c4e9d8cd11f55c78f10a548b0/hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/util/ZKSignerSecretProvider.java#L197-L199]
> Another example is that the KMS startup message never shows up in the KMS 
> logs.
> Note that this works in the unit tests, because MiniKMS sets the log4j system 
> property.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15698) KMS log4j is not initialized properly at startup

2018-08-29 Thread Xiao Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HADOOP-15698:
---
Summary: KMS log4j is not initialized properly at startup  (was: KMS 
startup logs don't show)

> KMS log4j is not initialized properly at startup
> 
>
> Key: HADOOP-15698
> URL: https://issues.apache.org/jira/browse/HADOOP-15698
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: kms
>Affects Versions: 3.1.0
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HADOOP-15698.001.patch
>
>
> During KMS startup, log4j logs don't show up, resulting in important logs 
> getting omitted. This happens because log4j initialization only happens in 
> KMSWebApp#contextInitialized, and logs written before that don't show up.
> For example, the following log never shows up:
> [https://github.com/apache/hadoop/blob/a55d6bba71c81c1c4e9d8cd11f55c78f10a548b0/hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/util/ZKSignerSecretProvider.java#L197-L199]
> Another example is that the KMS startup message never shows up in the KMS 
> logs.
> Note that this works in the unit tests, because MiniKMS sets the log4j system 
> property.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15684) triggerActiveLogRoll stuck on dead name node, when ConnectTimeoutException happens.

2018-08-29 Thread Surendra Singh Lilhore (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597030#comment-16597030
 ] 

Surendra Singh Lilhore commented on HADOOP-15684:
-

[~trjianjianjiao] : I didn't see any failure in {{TestJournalNodeSync}} for 
HDFS-13805. Test result link for HDFS-13805: Test report

Still, I will check the failure; I'm currently not able to open the test 
results for this jira, maybe builds.apache.org is down.

> triggerActiveLogRoll stuck on dead name node, when ConnectTimeoutException 
> happens. 
> 
>
> Key: HADOOP-15684
> URL: https://issues.apache.org/jira/browse/HADOOP-15684
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: ha
>Affects Versions: 3.0.0-alpha1
>Reporter: Rong Tang
>Assignee: Rong Tang
>Priority: Critical
> Attachments: 
> 0001-RollEditLog-try-next-NN-when-exception-happens.patch, 
> HADOOP-15684.000.patch, HADOOP-15684.001.patch, HADOOP-15684.002.patch, 
> hadoop--rollingUpgrade-SourceMachine001.log
>
>
> When a name node calls triggerActiveLogRoll and the cachedActiveProxy is a dead 
> name node, it throws a ConnectTimeoutException. The expected behavior is to 
> try the next NN, but the current logic doesn't do so; instead, it keeps trying 
> the dead one, mistakenly taking it as active.
>  
> 2018-08-17 10:02:12,001 WARN [Edit log tailer] 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Unable to trigger a 
> roll of the active NN
> org.apache.hadoop.net.ConnectTimeoutException: Call From 
> SourceMachine001/SourceIP001 to TargetMachine001.ap.gbl:8020 failed on socket 
> timeout exception: org.apache.hadoop.net.ConnectTimeoutException: 2 
> millis timeout 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$2.doWork(EditLogTailer.java:298)
>  
> C:\Users\rotang>ping TargetMachine001
> Pinging TargetMachine001[TargetIP001] with 32 bytes of data:
>  Request timed out.
>  Request timed out.
>  Request timed out.
>  Request timed out.
>  The attachment is a log file showing how it repeatedly retries a dead name 
> node, and a fix patch.
>  I replaced the actual machine names/IPs with SourceMachine001/SourceIP001 and 
> TargetMachine001/TargetIP001.
>  
> How to repro:
> With a healthy set of running NNs, take down the active NN (don't let it come 
> back during the test); the standby NNs will then keep trying the dead (old 
> active) NN, because it is the cached one.
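
To make the expected behavior concrete, a hedged sketch of the try-the-next-NN 
idea; the real EditLogTailer code differs, and these types are illustrative 
stand-ins:

{code:java}
import java.io.IOException;
import java.util.List;

class LogRollSketch {
  interface NamenodeProxy {
    void rollEditLog() throws IOException;
  }

  private NamenodeProxy cachedActiveProxy;

  void triggerActiveLogRoll(List<NamenodeProxy> namenodes) {
    for (NamenodeProxy proxy : namenodes) {
      try {
        proxy.rollEditLog();        // only the active NN accepts this
        cachedActiveProxy = proxy;  // remember the proxy that worked
        return;
      } catch (IOException e) {
        // ConnectTimeoutException is an IOException: fall through and try
        // the next NN instead of retrying the dead one forever.
      }
    }
  }
}
{code}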



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15663) ABFS: Simplify configuration

2018-08-29 Thread Da Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596968#comment-16596968
 ] 

Da Zhou commented on HADOOP-15663:
--

Attaching HADOOP-15663-HADOOP-15407-005.patch, since there was a conflict in 
HADOOP-15663-HADOOP-15407-004.patch.

> ABFS: Simplify configuration
> 
>
> Key: HADOOP-15663
> URL: https://issues.apache.org/jira/browse/HADOOP-15663
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Reporter: Thomas Marquardt
>Assignee: Da Zhou
>Priority: Major
> Attachments: HADOOP-15663-HADOOP-15407-001.patch, 
> HADOOP-15663-HADOOP-15407-002.patch, HADOOP-15663-HADOOP-15407-003.patch, 
> HADOOP-15663-HADOOP-15407-004.patch, HADOOP-15663-HADOOP-15407-005.patch
>
>
> Configuration for WASB and ABFS is too complex.  The current approach is to 
> use four files for test configuration. 
> Both WASB and ABFS have basic test configuration which is committed to the 
> repo (azure-test.xml and azure-bfs-test.xml).  Currently these contain the 
> fs.AbstractFileSystem.[scheme].impl configuration, but otherwise are empty 
> except for an include reference to a file containing the endpoint 
> credentials. 
> Both WASB and ABFS have endpoint credential configuration files 
> (azure-auth-keys.xml and azure-bfs-auth-keys.xml).  These have been added to 
> .gitignore to prevent them from accidentally being submitted in a patch, 
> which would leak the developers storage account credentials.  These files 
> contain account names, storage account keys, and service endpoints.
> There is some overlap of the configuration for WASB and ABFS, where they use 
> the same property name but use different values.  
> 1) Let's reduce the number of test configuration files to one, if possible.
> 2) Let's simplify the account name, key, and endpoint configuration for WASB 
> and ABFS if possible, but still support the legacy way of doing it, which is 
> very error prone.
> 3) Let's improve error handling, so that typos or misconfiguration are not so 
> difficult to troubleshoot.
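
As an illustration of goal 3, a hedged sketch of failing fast with the exact 
property name; the property pattern and message are assumptions, not the actual 
AbfsConfiguration code:

{code:java}
import org.apache.hadoop.conf.Configuration;

class AccountKeySketch {
  static String requireAccountKey(Configuration conf, String accountName) {
    String key = conf.get("fs.azure.account.key." + accountName);
    if (key == null || key.trim().isEmpty()) {
      // Name the missing property explicitly so a typo in the config file
      // is easy to spot, instead of surfacing later as an auth failure.
      throw new IllegalArgumentException(
          "No storage account key configured: set fs.azure.account.key."
          + accountName);
    }
    return key;
  }
}
{code}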



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15663) ABFS: Simplify configuration

2018-08-29 Thread Da Zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Da Zhou updated HADOOP-15663:
-
Attachment: HADOOP-15663-HADOOP-15407-005.patch

> ABFS: Simplify configuration
> 
>
> Key: HADOOP-15663
> URL: https://issues.apache.org/jira/browse/HADOOP-15663
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Reporter: Thomas Marquardt
>Assignee: Da Zhou
>Priority: Major
> Attachments: HADOOP-15663-HADOOP-15407-001.patch, 
> HADOOP-15663-HADOOP-15407-002.patch, HADOOP-15663-HADOOP-15407-003.patch, 
> HADOOP-15663-HADOOP-15407-004.patch, HADOOP-15663-HADOOP-15407-005.patch
>
>
> Configuration for WASB and ABFS is too complex.  The current approach is to 
> use four files for test configuration. 
> Both WASB and ABFS have basic test configuration which is committed to the 
> repo (azure-test.xml and azure-bfs-test.xml).  Currently these contain the 
> fs.AbstractFileSystem.[scheme].impl configuration, but otherwise are empty 
> except for an include reference to a file containing the endpoint 
> credentials. 
> Both WASB and ABFS have endpoint credential configuration files 
> (azure-auth-keys.xml and azure-bfs-auth-keys.xml).  These have been added to 
> .gitignore to prevent them from accidentally being submitted in a patch, 
> which would leak the developers storage account credentials.  These files 
> contain account names, storage account keys, and service endpoints.
> There is some overlap of the configuration for WASB and ABFS, where they use 
> the same property name but use different values.  
> 1) Let's reduce the number of test configuration files to one, if possible.
> 2) Let's simplify the account name, key, and endpoint configuration for WASB 
> and ABFS if possible, but still support the legacy way of doing it, which is 
> very error prone.
> 3) Let's improve error handling, so that typos or misconfiguration are not so 
> difficult to troubleshoot.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-15663) ABFS: Simplify configuration

2018-08-29 Thread Da Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596843#comment-16596843
 ] 

Da Zhou edited comment on HADOOP-15663 at 8/30/18 1:11 AM:
---

Thanks for the feedback.
 Attaching patch: HADOOP-15663-HADOOP-15407-004.patch:
 - updated the testing doc.
 - fixed typos and updated logger in AbfsTestUtils.
 - removed "fs.AbstractFileSystem.wasb.impl" and "fs.azure.scale.test.enabled" 
from config file.
 - added "fs.AbstractFileSystem.wasb.impl" and 
"fs.AbstractFileSystem.wasbs.impl" into core-default.xml.
 - removed the isEmulator and renamed relevant methods.
 - removed the commented ABFS configuration setting in azure-test.xml because 
it is already stated in testing_azure.


was (Author: danielzhou):
Thanks for the feedback.
Attaching patch: HADOOP-15663-HADOOP-15407-003.patch:
- updated the testing doc.
- fixed typos and updated logger in AbfsTestUtils.
- removed "fs.AbstractFileSystem.wasb.impl" and "fs.azure.scale.test.enabled" 
from config file.
- added "fs.AbstractFileSystem.wasb.impl" and 
"fs.AbstractFileSystem.wasbs.impl" into core-default.xml.
- removed the isEmulator and renamed relevant methods.
- removed the commented ABFS configuration setting in azure-test.xml because it 
is already stated in testing_azure.

> ABFS: Simplify configuration
> 
>
> Key: HADOOP-15663
> URL: https://issues.apache.org/jira/browse/HADOOP-15663
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Reporter: Thomas Marquardt
>Assignee: Da Zhou
>Priority: Major
> Attachments: HADOOP-15663-HADOOP-15407-001.patch, 
> HADOOP-15663-HADOOP-15407-002.patch, HADOOP-15663-HADOOP-15407-003.patch, 
> HADOOP-15663-HADOOP-15407-004.patch
>
>
> Configuration for WASB and ABFS is too complex.  The current approach is to 
> use four files for test configuration. 
> Both WASB and ABFS have basic test configuration which is committed to the 
> repo (azure-test.xml and azure-bfs-test.xml).  Currently these contain the 
> fs.AbstractFileSystem.[scheme].impl configuration, but otherwise are empty 
> except for an include reference to a file containing the endpoint 
> credentials. 
> Both WASB and ABFS have endpoint credential configuration files 
> (azure-auth-keys.xml and azure-bfs-auth-keys.xml).  These have been added to 
> .gitignore to prevent them from accidentally being submitted in a patch, 
> which would leak the developers storage account credentials.  These files 
> contain account names, storage account keys, and service endpoints.
> There is some overlap of the configuration for WASB and ABFS, where they use 
> the same property name but use different values.  
> 1) Let's reduce the number of test configuration files to one, if possible.
> 2) Let's simplify the account name, key, and endpoint configuration for WASB 
> and ABFS if possible, but still support the legacy way of doing it, which is 
> very error prone.
> 3) Let's improve error handling, so that typos or misconfiguration are not so 
> difficult to troubleshoot.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15407) Support Windows Azure Storage - Blob file system in Hadoop

2018-08-29 Thread Da Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596884#comment-16596884
 ] 

Da Zhou commented on HADOOP-15407:
--

Yes, [~sunilg], it will be ready. We are now working on the JIRAs mentioned by 
[~tmarquardt] in his last comment; most of them are already under review.

> Support Windows Azure Storage - Blob file system in Hadoop
> --
>
> Key: HADOOP-15407
> URL: https://issues.apache.org/jira/browse/HADOOP-15407
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/azure
>Affects Versions: 3.2.0
>Reporter: Esfandiar Manii
>Assignee: Da Zhou
>Priority: Blocker
> Attachments: HADOOP-15407-001.patch, HADOOP-15407-002.patch, 
> HADOOP-15407-003.patch, HADOOP-15407-004.patch, HADOOP-15407-008.patch, 
> HADOOP-15407-HADOOP-15407-008.patch, HADOOP-15407-HADOOP-15407.006.patch, 
> HADOOP-15407-HADOOP-15407.007.patch, HADOOP-15407-HADOOP-15407.008.patch
>
>
> *Description*
>  This JIRA adds a new file system implementation, ABFS, for running Big Data 
> and Analytics workloads against Azure Storage. This is a complete rewrite of 
> the previous WASB driver with a heavy focus on optimizing both performance 
> and cost.
>  *High level design*
>  At a high level, the code here extends the FileSystem class to provide an 
> implementation for accessing blobs in Azure Storage. The scheme abfs is used 
> for accessing it over HTTP, and abfss for accessing over HTTPS. The following 
> URI scheme is used to address individual paths:
>  abfs[s]://[filesystem]@[account].dfs.core.windows.net/[path]
>  ABFS is intended as a replacement for WASB. WASB is not deprecated but is in 
> pure maintenance mode, and customers should upgrade to ABFS once it hits 
> General Availability later in CY18.
>  Benefits of ABFS include:
>  * Higher scale (capacity, throughput, and IOPS) for Big Data and Analytics 
> workloads by allowing higher limits on storage accounts
>  * Removing any ramp-up time with Storage backend partitioning; blocks are 
> now automatically sharded across partitions in the Storage backend. This 
> avoids the need for temporary/intermediate files and the cost (and framework 
> complexity around committing jobs/tasks) they bring
>  * Enabling much higher read and write throughput on single files (tens of 
> Gbps by default)
>  * Still retaining all of the Azure Blob features customers are familiar 
> with and expect, and gaining the benefits of future Blob features as well
>  ABFS incorporates Hadoop Filesystem metrics to monitor the file system 
> throughput and operations. Ambari metrics are not currently implemented for 
> ABFS, but will be available soon.
>  *Credits and history*
>  Credit for this work goes to (hope I don't forget anyone): Shane Mainali, 
> Thomas Marquardt, Zichen Sun, Georgi Chalakov, Esfandiar Manii, Amit Singh, 
> Dana Kaban, Da Zhou, Junhua Gu, Saher Ahwal, Saurabh Pant, and James Baker.
>  *Test*
>  ABFS has gone through many test procedures, including Hadoop file system 
> contract tests, unit testing, functional testing, and manual testing. All the 
> JUnit tests provided with the driver are capable of running in either 
> sequential or parallel fashion in order to reduce the testing time.
>  Besides unit tests, we have used ABFS as the default file system in Azure 
> HDInsight. Azure HDInsight will very soon offer ABFS as a storage option. 
> (HDFS is also used, but not as the default file system.) Various customer and 
> test workloads have been run against clusters with such configurations for 
> quite some time. Benchmarks such as Tera*, TPC-DS, Spark Streaming and Spark 
> SQL, and others have been run to do scenario, performance, and functional 
> testing. Third parties and customers have also done various testing of ABFS.
>  The current version reflects the version of the code tested and used in our 
> production environment.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15705) Typo in the definition of "stable" in the interface classification

2018-08-29 Thread Daniel Templeton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596856#comment-16596856
 ] 

Daniel Templeton commented on HADOOP-15705:
---

Filed HADOOP-15706 for the other typo you found.

> Typo in the definition of "stable" in the interface classification
> --
>
> Key: HADOOP-15705
> URL: https://issues.apache.org/jira/browse/HADOOP-15705
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>Priority: Minor
> Fix For: 3.2.0, 3.0.4, 3.1.2
>
> Attachments: HADOOP-15705.001.patch
>
>
> "Compatible changes allowed: maintenance (x.Y.0)"
> should be 
> "Compatible changes allowed: maintenance (x.y.Z)"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15706) Typo in compatibility doc: SHOUD -> SHOULD

2018-08-29 Thread Daniel Templeton (JIRA)
Daniel Templeton created HADOOP-15706:
-

 Summary: Typo in compatibility doc: SHOUD -> SHOULD
 Key: HADOOP-15706
 URL: https://issues.apache.org/jira/browse/HADOOP-15706
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Daniel Templeton
Assignee: Daniel Templeton


{quote}% grep SHOUD 
./hadoop-common-project/hadoop-common/src/site/markdown/Compatibility.md
Apache Hadoop revisions SHOUD retain binary compatability such that 
end-user{quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15705) Typo in the definition of "stable" in the interface classification

2018-08-29 Thread Daniel Templeton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton updated HADOOP-15705:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.1.2
   3.0.4
   3.2.0
   Status: Resolved  (was: Patch Available)

Thanks for the review, [~jojochuang].  Committed to trunk, branch-3.1, and 
branch-3.0.

> Typo in the definition of "stable" in the interface classification
> --
>
> Key: HADOOP-15705
> URL: https://issues.apache.org/jira/browse/HADOOP-15705
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>Priority: Minor
> Fix For: 3.2.0, 3.0.4, 3.1.2
>
> Attachments: HADOOP-15705.001.patch
>
>
> "Compatible changes allowed: maintenance (x.Y.0)"
> should be 
> "Compatible changes allowed: maintenance (x.y.Z)"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15628) S3A Filesystem does not check return from AmazonS3Client deleteObjects

2018-08-29 Thread Steve Jacobs (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596852#comment-16596852
 ] 

Steve Jacobs commented on HADOOP-15628:
---

I think this is fixed in 3.1, at least from my look at the code. I don't 
currently have the ability to test this from Hive, which is where we were 
seeing the error.

On 8/29/18, 11:01 AM, "Steve Loughran (JIRA)"  wrote:


[ 
https://issues.apache.org/jira/browse/HADOOP-15628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16596602#comment-16596602
 ] 

Steve Loughran commented on HADOOP-15628:
-

[~steveatbat]: if you can do a patch for this ASAP we can still get it in 
to 3.2. I don't see how it can be tested, other than regression testing & review

> S3A Filesystem does not check return from AmazonS3Client deleteObjects
> --
>
> Key: HADOOP-15628
> URL: https://issues.apache.org/jira/browse/HADOOP-15628
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.9.1, 2.8.4, 3.1.1, 3.0.3
> Environment: Hadoop 3.0.2 / Hadoop 2.8.3
> Hive 2.3.2 / Hive 2.3.3 / Hive 3.0.0
> Non-AWS S3 implementation
>Reporter: Steve Jacobs
>Priority: Minor
>
> Deletes in S3A that use the Multi-Delete functionality in the Amazon S3 
API do not check whether all objects have been successfully deleted. In the 
event of a failure, the API will still return a 200 OK (which isn't checked 
currently):
> [Delete Code from Hadoop 
2.8|https://github.com/apache/hadoop/blob/a0da1ec01051108b77f86799dd5e97563b2a3962/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L574]
 
> {code:java}
> if (keysToDelete.size() == MAX_ENTRIES_TO_DELETE) {
> DeleteObjectsRequest deleteRequest =
> new DeleteObjectsRequest(bucket).withKeys(keysToDelete);
> s3.deleteObjects(deleteRequest);
> statistics.incrementWriteOps(1);
> keysToDelete.clear();
> }
> {code}
> This should be converted to use the DeleteObjectsResult class from the 
S3Client: 
> [Amazon Code 
Example|https://docs.aws.amazon.com/AmazonS3/latest/dev/DeletingMultipleObjectsUsingJava.htm]
> {code:java}
> // Verify that the objects were deleted successfully.
> DeleteObjectsResult delObjRes = s3Client.deleteObjects(multiObjectDeleteRequest);
> int successfulDeletes = delObjRes.getDeletedObjects().size();
> System.out.println(successfulDeletes + " objects successfully deleted.");
> {code}
> Bucket policies can be misconfigured, and deletes will fail without 
warning by S3A clients.
>  
>  
>  
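
A hedged sketch of that verification, using the AWS SDK v1 classes the 
description names (the S3A integration point is an assumption):

{code:java}
import java.io.IOException;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.DeleteObjectsRequest;
import com.amazonaws.services.s3.model.DeleteObjectsResult;
import com.amazonaws.services.s3.model.MultiObjectDeleteException;

class BulkDeleteSketch {
  static void deleteAndVerify(AmazonS3 s3, DeleteObjectsRequest request,
      int expectedCount) throws IOException {
    try {
      DeleteObjectsResult result = s3.deleteObjects(request);
      int deleted = result.getDeletedObjects().size();
      if (deleted != expectedCount) {
        throw new IOException("Bulk delete incomplete: " + deleted
            + " of " + expectedCount + " keys deleted");
      }
    } catch (MultiObjectDeleteException e) {
      // In non-quiet mode the SDK raises this when any key fails;
      // e.getErrors() lists the failed keys and error codes.
      throw new IOException("Bulk delete failed for "
          + e.getErrors().size() + " keys", e);
    }
  }
}
{code}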



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


> S3A Filesystem does not check return from AmazonS3Client deleteObjects
> --
>
> Key: HADOOP-15628
> URL: https://issues.apache.org/jira/browse/HADOOP-15628
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.9.1, 2.8.4, 3.1.1, 3.0.3
> Environment: Hadoop 3.0.2 / Hadoop 2.8.3
> Hive 2.3.2 / Hive 2.3.3 / Hive 3.0.0
> Non-AWS S3 implementation
>Reporter: Steve Jacobs
>Priority: Minor
>
> Deletes in S3A that use the Multi-Delete functionality in the Amazon S3 API 
> do not check whether all objects have been successfully deleted. In the event 
> of a failure, the API will still return a 200 OK (which isn't checked 
> currently):
> [Delete Code from Hadoop 
> 2.8|https://github.com/apache/hadoop/blob/a0da1ec01051108b77f86799dd5e97563b2a3962/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L574]
>  
> {code:java}
> if (keysToDelete.size() == MAX_ENTRIES_TO_DELETE) {
> DeleteObjectsRequest deleteRequest =
> new DeleteObjectsRequest(bucket).withKeys(keysToDelete);
> s3.deleteObjects(deleteRequest);
> statistics.incrementWriteOps(1);
> keysToDelete.clear();
> }
> {code}
> This should be converted to use the DeleteObjectsResult class from the S3Client.

[jira] [Comment Edited] (HADOOP-15663) ABFS: Simplify configuration

2018-08-29 Thread Da Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596843#comment-16596843
 ] 

Da Zhou edited comment on HADOOP-15663 at 8/29/18 8:56 PM:
---

Thanks for the feedback.
Attaching patch: HADOOP-15663-HADOOP-15407-003.patch:
- updated the testing doc.
- fixed typos and updated logger in AbfsTestUtils.
- removed "fs.AbstractFileSystem.wasb.impl" and "fs.azure.scale.test.enabled" 
from config file.
- added "fs.AbstractFileSystem.wasb.impl" and 
"fs.AbstractFileSystem.wasbs.impl" into core-default.xml.
- removed the isEmulator and renamed relevant methods.
- removed the commented ABFS configuration setting in azure-test.xml because it 
is already stated in testing_azure.


was (Author: danielzhou):
Thanks for the feedback.
Attaching patch: HADOOP-15663-HADOOP-15407-003.patch:
- updated the testing doc.
- fixed typos and updated logger in AbfsTestUtils.
- removed "fs.AbstractFileSystem.wasb.impl" and "fs.azure.scale.test.enabled" 
from config file.
- added "fs.AbstractFileSystem.wasb.impl" and 
"fs.AbstractFileSystem.wasbs.impl" into core-default.xml.

> ABFS: Simplify configuration
> 
>
> Key: HADOOP-15663
> URL: https://issues.apache.org/jira/browse/HADOOP-15663
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Reporter: Thomas Marquardt
>Assignee: Da Zhou
>Priority: Major
> Attachments: HADOOP-15663-HADOOP-15407-001.patch, 
> HADOOP-15663-HADOOP-15407-002.patch, HADOOP-15663-HADOOP-15407-003.patch, 
> HADOOP-15663-HADOOP-15407-004.patch
>
>
> Configuration for WASB and ABFS is too complex.  The current approach is to 
> use four files for test configuration. 
> Both WASB and ABFS have basic test configuration which is committed to the 
> repo (azure-test.xml and azure-bfs-test.xml).  Currently these contain the 
> fs.AbstractFileSystem.[scheme].impl configuration, but otherwise are empty 
> except for an include reference to a file containing the endpoint 
> credentials. 
> Both WASB and ABFS have endpoint credential configuration files 
> (azure-auth-keys.xml and azure-bfs-auth-keys.xml).  These have been added to 
> .gitignore to prevent them from accidentally being submitted in a patch, 
> which would leak the developers storage account credentials.  These files 
> contain account names, storage account keys, and service endpoints.
> There is some overlap of the configuration for WASB and ABFS, where they use 
> the same property name but use different values.  
> 1) Let's reduce the number of test configuration files to one, if possible.
> 2) Let's simplify the account name, key, and endpoint configuration for WASB 
> and ABFS if possible, but still support the legacy way of doing it, which is 
> very error prone.
> 3) Let's improve error handling, so that typos or misconfiguration are not so 
> difficult to troubleshoot.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15663) ABFS: Simplify configuration

2018-08-29 Thread Da Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596843#comment-16596843
 ] 

Da Zhou commented on HADOOP-15663:
--

Thanks for the feedback.
Attaching patch: HADOOP-15663-HADOOP-15407-003.patch:
- updated the testing doc.
- fixed typos and updated logger in AbfsTestUtils.
- removed "fs.AbstractFileSystem.wasb.impl" and "fs.azure.scale.test.enabled" 
from config file.
- added "fs.AbstractFileSystem.wasb.impl" and 
"fs.AbstractFileSystem.wasbs.impl" into core-default.xml.

> ABFS: Simplify configuration
> 
>
> Key: HADOOP-15663
> URL: https://issues.apache.org/jira/browse/HADOOP-15663
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Reporter: Thomas Marquardt
>Assignee: Da Zhou
>Priority: Major
> Attachments: HADOOP-15663-HADOOP-15407-001.patch, 
> HADOOP-15663-HADOOP-15407-002.patch, HADOOP-15663-HADOOP-15407-003.patch, 
> HADOOP-15663-HADOOP-15407-004.patch
>
>
> Configuration for WASB and ABFS is too complex.  The current approach is to 
> use four files for test configuration. 
> Both WASB and ABFS have basic test configuration which is committed to the 
> repo (azure-test.xml and azure-bfs-test.xml).  Currently these contain the 
> fs.AbstractFileSystem.[scheme].impl configuration, but otherwise are empty 
> except for an include reference to a file containing the endpoint 
> credentials. 
> Both WASB and ABFS have endpoint credential configuration files 
> (azure-auth-keys.xml and azure-bfs-auth-keys.xml).  These have been added to 
> .gitignore to prevent them from accidentally being submitted in a patch, 
> which would leak the developers storage account credentials.  These files 
> contain account names, storage account keys, and service endpoints.
> There is some overlap of the configuration for WASB and ABFS, where they use 
> the same property name but use different values.  
> 1) Let's reduce the number of test configuration files to one, if possible.
> 2) Let's simplify the account name, key, and endpoint configuration for WASB 
> and ABFS if possible, but still support the legacy way of doing it, which is 
> very error prone.
> 3) Let's improve error handling, so that typos or misconfiguration are not so 
> difficult to troubleshoot.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15663) ABFS: Simplify configuration

2018-08-29 Thread Da Zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Da Zhou updated HADOOP-15663:
-
Attachment: HADOOP-15663-HADOOP-15407-004.patch

> ABFS: Simplify configuration
> 
>
> Key: HADOOP-15663
> URL: https://issues.apache.org/jira/browse/HADOOP-15663
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Reporter: Thomas Marquardt
>Assignee: Da Zhou
>Priority: Major
> Attachments: HADOOP-15663-HADOOP-15407-001.patch, 
> HADOOP-15663-HADOOP-15407-002.patch, HADOOP-15663-HADOOP-15407-003.patch, 
> HADOOP-15663-HADOOP-15407-004.patch
>
>
> Configuration for WASB and ABFS is too complex.  The current approach is to 
> use four files for test configuration. 
> Both WASB and ABFS have basic test configuration which is committed to the 
> repo (azure-test.xml and azure-bfs-test.xml).  Currently these contain the 
> fs.AbstractFileSystem.[scheme].impl configuration, but otherwise are empty 
> except for an include reference to a file containing the endpoint 
> credentials. 
> Both WASB and ABFS have endpoint credential configuration files 
> (azure-auth-keys.xml and azure-bfs-auth-keys.xml).  These have been added to 
> .gitignore to prevent them from accidentally being submitted in a patch, 
> which would leak the developers storage account credentials.  These files 
> contain account names, storage account keys, and service endpoints.
> There is some overlap of the configuration for WASB and ABFS, where they use 
> the same property name but use different values.  
> 1) Let's reduce the number of test configuration files to one, if possible.
> 2) Let's simplify the account name, key, and endpoint configuration for WASB 
> and ABFS if possible, but still support the legacy way of doing it, which is 
> very error prone.
> 3) Let's improve error handling, so that typos or misconfiguration are not so 
> difficult to troubleshoot.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15663) ABFS: Simplify configuration

2018-08-29 Thread Thomas Marquardt (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596835#comment-16596835
 ] 

Thomas Marquardt commented on HADOOP-15663:
---

LGTM too.  Some comments:

ConfigurationKeys.java
  L40: Let's remove FS_AZURE_EMULATOR_ENABLED.

UriUtils.java
  L48: extractAccountNameFromHostName would describe this function better

AbfsConfiguration
  L176: Let's remove isEmulator() and FS_AZURE_EMULATOR_ENABLED

AzureBlobFileSystemStore.java
  L151: Really this is about using test endpoints that use the URL
   format http[s]://[ip]:[port]/[account]/[filesystem] instead of
http[s]://[account][domain-suffix]/[filesystem].

AzureTestConstants.java
  L147: Now we can name it ACCOUNT_NAME_PROPERTY_NAME and alphabetize it
  under ACCOUNT_KEY_PROPERTY_NAME.

AbstractAbfsIntegrationTest.java
 L65: It is not about the emulator, but rather the fact that we are using an IP
 address for testing purposes instead of a host name.
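
For the UriUtils rename above, a hedged sketch of what such a helper might look 
like; this is illustrative only, not the code in the patch:

{code:java}
public final class UriUtilsSketch {
  /** Pulls "myaccount" out of a host name like "myaccount.dfs.core.windows.net". */
  public static String extractAccountNameFromHostName(String hostName) {
    int firstDot = hostName.indexOf('.');
    return firstDot > 0 ? hostName.substring(0, firstDot) : null;
  }
}
{code}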

> ABFS: Simplify configuration
> 
>
> Key: HADOOP-15663
> URL: https://issues.apache.org/jira/browse/HADOOP-15663
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Reporter: Thomas Marquardt
>Assignee: Da Zhou
>Priority: Major
> Attachments: HADOOP-15663-HADOOP-15407-001.patch, 
> HADOOP-15663-HADOOP-15407-002.patch, HADOOP-15663-HADOOP-15407-003.patch
>
>
> Configuration for WASB and ABFS is too complex.  The current approach is to 
> use four files for test configuration. 
> Both WASB and ABFS have basic test configuration which is committed to the 
> repo (azure-test.xml and azure-bfs-test.xml).  Currently these contain the 
> fs.AbstractFileSystem.[scheme].impl configuration, but otherwise are empty 
> except for an include reference to a file containing the endpoint 
> credentials. 
> Both WASB and ABFS have endpoint credential configuration files 
> (azure-auth-keys.xml and azure-bfs-auth-keys.xml).  These have been added to 
> .gitignore to prevent them from accidentally being submitted in a patch, 
> which would leak the developers storage account credentials.  These files 
> contain account names, storage account keys, and service endpoints.
> There is some overlap of the configuration for WASB and ABFS, where they use 
> the same property name but use different values.  
> 1) Let's reduce the number of test configuration files to one, if possible.
> 2) Let's simplify the account name, key, and endpoint configuration for WASB 
> and ABFS if possible, but still support the legacy way of doing it, which is 
> very error prone.
> 3) Let's improve error handling, so that typos or misconfiguration are not so 
> difficult to troubleshoot.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15705) Typo in the definition of "stable" in the interface classification

2018-08-29 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596828#comment-16596828
 ] 

Wei-Chiu Chuang commented on HADOOP-15705:
--

+1 thanks for clarification [~templedf]

 

BTW, there's another typo in Compatibility doc: 
[http://hadoop.apache.org/docs/r3.1.1/hadoop-project-dist/hadoop-common/Compatibility.html]
{quote}Apache Hadoop revisions SHOUD retain binary 
{quote}
SHOUD --> SHOULD

> Typo in the definition of "stable" in the interface classification
> --
>
> Key: HADOOP-15705
> URL: https://issues.apache.org/jira/browse/HADOOP-15705
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>Priority: Minor
> Attachments: HADOOP-15705.001.patch
>
>
> "Compatible changes allowed: maintenance (x.Y.0)"
> should be 
> "Compatible changes allowed: maintenance (x.y.Z)"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14630) Contract Tests to verify create, mkdirs and rename under a file is forbidden

2018-08-29 Thread Aaron Fabbri (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-14630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596817#comment-16596817
 ] 

Aaron Fabbri commented on HADOOP-14630:
---

+1 on the patch as soon as you get a green result from yetus.

I did not test this patch, just reviewed by inspection. I trust you to run the 
tests before committing.

> Contract Tests to verify create, mkdirs and rename under a file is forbidden
> 
>
> Key: HADOOP-14630
> URL: https://issues.apache.org/jira/browse/HADOOP-14630
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs, fs/azure, fs/s3, fs/swift
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Attachments: HADOOP-14630-001.patch, HADOOP-14630-002.patch, 
> HADOOP-14630-003.patch, HADOOP-14630-004.patch
>
>
> Object stores can get into trouble in ways an FS never would, ways so 
> obvious we've never written tests for them. We know what the problems are: 
> test for file and dir creation directly/indirectly under other files (a 
> sketch of one such check follows the list):
> * mkdir(file/file)
> * mkdir(file/subdir)
> * dir under file/subdir/subdir
> * dir/dir2/file, verify dir & dir2 exist
> * dir/dir2/dir3, verify dir & dir2 exist 
> * rename(src, file/dest)
> * rename(src, file/dir/dest)
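
A hedged sketch of the first case (mkdir under a file); the real contract 
tests live under AbstractFSContract subclasses and use shared helpers:

{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class MkdirUnderFileSketch {
  // Precondition: 'file' already exists as a plain file.
  static void checkMkdirUnderFileForbidden(FileSystem fs, Path file)
      throws IOException {
    try {
      fs.mkdirs(new Path(file, "subdir"));
      throw new AssertionError("mkdirs under a file should have failed");
    } catch (IOException expected) {
      // A real FS raises ParentNotDirectoryException here; object stores
      // must reject this too rather than creating an orphan pseudo-directory.
    }
  }
}
{code}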



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15705) Typo in the definition of "stable" in the interface classification

2018-08-29 Thread Daniel Templeton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton updated HADOOP-15705:
--
Status: Patch Available  (was: Open)

> Typo in the definition of "stable" in the interface classification
> --
>
> Key: HADOOP-15705
> URL: https://issues.apache.org/jira/browse/HADOOP-15705
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>Priority: Minor
> Attachments: HADOOP-15705.001.patch
>
>
> "Compatible changes allowed: maintenance (x.Y.0)"
> should be 
> "Compatible changes allowed: maintenance (x.y.Z)"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15705) Typo in the definition of "stable" in the interface classification

2018-08-29 Thread Daniel Templeton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton updated HADOOP-15705:
--
Attachment: HADOOP-15705.001.patch

> Typo in the definition of "stable" in the interface classification
> --
>
> Key: HADOOP-15705
> URL: https://issues.apache.org/jira/browse/HADOOP-15705
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>Priority: Minor
> Attachments: HADOOP-15705.001.patch
>
>
> "Compatible changes allowed: maintenance (x.Y.0)"
> should be 
> "Compatible changes allowed: maintenance (x.y.Z)"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15705) Typo in the definition of "stable" in the interface classification

2018-08-29 Thread Daniel Templeton (JIRA)
Daniel Templeton created HADOOP-15705:
-

 Summary: Typo in the definition of "stable" in the interface 
classification
 Key: HADOOP-15705
 URL: https://issues.apache.org/jira/browse/HADOOP-15705
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Daniel Templeton
Assignee: Daniel Templeton


"Compatible changes allowed: maintenance (x.Y.0)"

should be 

"Compatible changes allowed: maintenance (x.y.Z)"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15107) Stabilize/tune S3A committers; review correctness & docs

2018-08-29 Thread Aaron Fabbri (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596784#comment-16596784
 ] 

Aaron Fabbri commented on HADOOP-15107:
---

+1 on the v4 patch. Code all looks good.

{noformat}
+} else {
+  LOG.warn("Using standard FileOutputCommitter to commit work."
+  + " This is slow and potentially unsafe.");
+  return createFileOutputCommitter(outputPath, context);{noformat}

Good call, I like it.

On the docs changes, just some random questions:
{noformat}
```python
def recoverTask(tac):
  oldAttemptId = appAttemptId - 1
{noformat}
Interesting. New commit attempts always get the previous attempt id +1? (I 
don't know how those are allocated.)

The mergePathsV1 logic seems pretty straightforward; not sure why the actual 
code is so complicated. Your pseudocode representation is fairly intuitive: 
overwrite whatever exists in the destination, recursively, so you don't just 
nuke directories that exist in the destination but instead descend and remove 
destination conflicts (files) as they arise, with a special case when src is a 
file but dest is a dir (nuke dest).
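
Paraphrasing that, a hedged Java sketch of the v1 merge semantics; the actual 
FileOutputCommitter code differs in its details:

{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class MergePathsSketch {
  static void mergePathsV1(FileSystem fs, FileStatus src, Path dst)
      throws IOException {
    FileStatus dstStatus = fs.exists(dst) ? fs.getFileStatus(dst) : null;
    if (src.isFile()) {
      if (dstStatus != null) {
        fs.delete(dst, true);  // covers the src-file-over-dest-dir case
      }
      fs.rename(src.getPath(), dst);
    } else if (dstStatus == null || dstStatus.isFile()) {
      if (dstStatus != null) {
        fs.delete(dst, false); // dest is a conflicting file: remove it
      }
      fs.rename(src.getPath(), dst); // move the whole directory at once
    } else {
      // Both are directories: descend and merge the children one by one.
      for (FileStatus child : fs.listStatus(src.getPath())) {
        mergePathsV1(fs, child, new Path(dst, child.getPath().getName()));
      }
    }
  }
}
{code}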

{noformat}
### v2 Job Recovery Before `commitJob()`


Because the data has been renamed into the destination directory, all tasks
recorded as having been committed have no recovery needed at all:

```python
def recoverTask(tac):
```

All active and queued tasks are scheduled for execution.

There is a weakness here, the same one on a failure during `commitTask()`:
it is only safe to repeat a task which failed during that commit operation
if the name of all generated files are constant across all task attempts.

If the Job AM fails while a task attempt has been instructed to commit,
and that commit is not recorded as having completed, the state of that
in-progress task is unknown...really it isn't safe to recover the
job at this point.
{noformat}

Interesting. What happens in this case?  Is it detected? Do we get duplicate 
data in the final job (re-attempt) output?

> Stabilize/tune S3A committers; review correctness & docs
> 
>
> Key: HADOOP-15107
> URL: https://issues.apache.org/jira/browse/HADOOP-15107
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.1.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
> Attachments: HADOOP-15107-001.patch, HADOOP-15107-002.patch, 
> HADOOP-15107-003.patch, HADOOP-15107-004.patch
>
>
> I'm writing about the paper on the committers, one which, being a proper 
> paper, requires me to show the committers work.
> # define the requirements of a "Correct" committed job (this applies to the 
> FileOutputCommitter too)
> # show that the Staging committer meets these requirements (most of this is 
> implicit in that it uses the V1 FileOutputCommitter to marshall .pendingset 
> lists from committed tasks to the final destination, where they are read and 
> committed.
> # Show the magic committer also works.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-10948) SwiftNativeFileSystem's directory is incompatible with Swift and Horizon

2018-08-29 Thread Shyamsundar Ramanathan (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-10948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596780#comment-16596780
 ] 

Shyamsundar Ramanathan commented on HADOOP-10948:
-

* We applied the patch and have noticed a significant degradation in 
performance, specifically in listing files (it may affect other areas too).
 * We found that in the patch, an additional HEAD call is made to each subpath 
when listing a dir (to determine whether the manifest header exists, etc.).
 * This significantly affects performance, especially if the number of files in 
a particular "directory" is large: a few seconds without the patch vs. minutes 
with the patch for nFiles > 500~1000, due to the increased number of REST calls.

Specifically these lines cause the issue:
{code}
+  if (!name.endsWith("/")) {
+    final Path filePath = getCorrectSwiftPath(new Path(name));
+    files.add(getObjectMetadata(filePath, newest));
+  } else {
+    final Path dirPath = getCorrectSwiftPath(toDirPath(new Path(name)));
+    files.add(getObjectMetadata(dirPath, newest));
+  }
{code}


> SwiftNativeFileSystem's directory is incompatible with Swift and Horizon
> 
>
> Key: HADOOP-10948
> URL: https://issues.apache.org/jira/browse/HADOOP-10948
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/swift
>Affects Versions: 3.0.0-alpha1
>Reporter: Kazuki OIKAWA
>Assignee: Kazuki OIKAWA
>Priority: Major
>  Labels: BB2015-05-TBR
> Attachments: HADOOP-10948-2.patch, HADOOP-10948.patch
>
>
> SwiftNativeFileSystem's directory representation is zero-byte file.
> But in Swift / Horizon, directory representation is a trailing-slash.
> This incompatibility has the following issues.
> * SwiftNativeFileSystem can't see pseudo-directory made by OpenStack Horizon
> * Swift/Horizon can't see pseudo-directory made by SwiftNativeFileSystem. But 
> Swift/Horizon see a zero-byte file instead of that pseudo-directory.
> * SwiftNativeFileSystem can't see a file if there is no intermediate 
> pseudo-directory object.
> * SwiftNativeFileSystem makes two objects when making a single directory
> (e.g. "hadoop fs -mkdir swift://test.test/dir/" => "dir" and "dir/" created)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15684) triggerActiveLogRoll stuck on dead name node, when ConnectTimeoutException happens.

2018-08-29 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HADOOP-15684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596704#comment-16596704
 ] 

Íñigo Goiri commented on HADOOP-15684:
--

I'll follow up in HDFS-13805 for the {{TestJournalNodeSync}} failure; it didn't 
seem to break there, not sure why.

For the review, we should ask people involved in the review of HDFS-6440.

> triggerActiveLogRoll stuck on dead name node, when ConnectTimeoutException 
> happens. 
> 
>
> Key: HADOOP-15684
> URL: https://issues.apache.org/jira/browse/HADOOP-15684
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: ha
>Affects Versions: 3.0.0-alpha1
>Reporter: Rong Tang
>Assignee: Rong Tang
>Priority: Critical
> Attachments: 
> 0001-RollEditLog-try-next-NN-when-exception-happens.patch, 
> HADOOP-15684.000.patch, HADOOP-15684.001.patch, HADOOP-15684.002.patch, 
> hadoop--rollingUpgrade-SourceMachine001.log
>
>
> When name node call triggerActiveLogRoll, and the cachedActiveProxy is a dead 
> name node, it will throws a ConnectTimeoutException, expected behavior is to 
> try next NN, but current logic doesn't do so, instead, it keeps trying the 
> dead, mistakenly take it as active.
>  
> 2018-08-17 10:02:12,001 WARN [Edit log tailer] 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Unable to trigger a 
> roll of the active NN
> org.apache.hadoop.net.ConnectTimeoutException: Call From 
> SourceMachine001/SourceIP to001 TargetMachine001.ap.gbl:8020 failed on socket 
> timeout exception: org.apache.hadoop.net.ConnectTimeoutException: 2 
> millis timeout 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$2.doWork(EditLogTailer.java:298)
>  
> C:\Users\rotang>ping TargetMachine001
> Pinging TargetMachine001[TargetIP001] with 32 bytes of data:
>  Request timed out.
>  Request timed out.
>  Request timed out.
>  Request timed out.
>  The attachments are a log file showing how it repeatedly retries a dead name 
> node, and a fix patch.
>  I replaced the actual machine names/IPs with SourceMachine001/SourceIP001 and 
> TargetMachine001/TargetIP001.
>  
> How to repro:
> In a well-running set of NNs, take down the active NN (and don't let it come 
> back during the test); the standby NNs will then keep trying the dead (old 
> active) NN, because it is the cached one.
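
For illustration, a minimal, self-contained sketch of the proposed behavior 
(hypothetical types; not the actual patch): on a connect timeout, drop the cached 
proxy and move on to the next candidate NN instead of retrying the dead one forever.

{code:java}
import java.util.List;

class LogRollSketch {
  interface NamenodeProxy {
    void rollEditLog() throws Exception;
  }

  // Try each candidate NN in turn instead of retrying a dead cached proxy forever.
  static void triggerActiveLogRoll(List<NamenodeProxy> candidates) throws Exception {
    Exception last = null;
    for (NamenodeProxy proxy : candidates) {
      try {
        proxy.rollEditLog(); // succeeds only against the active NN
        return;
      } catch (Exception e) { // e.g. ConnectTimeoutException from a dead NN
        last = e;             // remember the failure, move on to the next candidate
      }
    }
    if (last != null) {
      throw last; // every candidate failed
    }
  }
}
{code}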






[jira] [Commented] (HADOOP-15684) triggerActiveLogRoll stuck on dead name node, when ConnectTimeoutException happens.

2018-08-29 Thread Rong Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596696#comment-16596696
 ] 

Rong Tang commented on HADOOP-15684:


[~elgoiri]    For the test case TestJournalNodeSync: it also fails without my 
change, and the failure may have been introduced by this [commit 
|https://github.com/apache/hadoop/commit/96c4575d7373079becfa3e3db29ba98e6fb86388#diff-85c652b4af8b1cd3e90bda0fe753e218];
 adding committer [~surendrasingh] to verify.

[~sunilg]  What should I do to get this committed?

Adding [~vinayrpet], who seems to have the privilege to commit changes.

> triggerActiveLogRoll stuck on dead name node, when ConnectTimeoutException 
> happens. 
> 
>
> Key: HADOOP-15684
> URL: https://issues.apache.org/jira/browse/HADOOP-15684
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: ha
>Affects Versions: 3.0.0-alpha1
>Reporter: Rong Tang
>Assignee: Rong Tang
>Priority: Critical
> Attachments: 
> 0001-RollEditLog-try-next-NN-when-exception-happens.patch, 
> HADOOP-15684.000.patch, HADOOP-15684.001.patch, HADOOP-15684.002.patch, 
> hadoop--rollingUpgrade-SourceMachine001.log
>
>
> When the name node calls triggerActiveLogRoll and the cachedActiveProxy is a dead 
> name node, it throws a ConnectTimeoutException. The expected behavior is to 
> try the next NN, but the current logic doesn't do so; instead, it keeps trying 
> the dead one, mistakenly taking it as active.
>  
> 2018-08-17 10:02:12,001 WARN [Edit log tailer] 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Unable to trigger a 
> roll of the active NN
> org.apache.hadoop.net.ConnectTimeoutException: Call From 
> SourceMachine001/SourceIP001 to TargetMachine001.ap.gbl:8020 failed on socket 
> timeout exception: org.apache.hadoop.net.ConnectTimeoutException: 2 
> millis timeout 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$2.doWork(EditLogTailer.java:298)
>  
> C:\Users\rotang>ping TargetMachine001
> Pinging TargetMachine001[TargetIP001] with 32 bytes of data:
>  Request timed out.
>  Request timed out.
>  Request timed out.
>  Request timed out.
>  The attachments are a log file showing how it repeatedly retries a dead name 
> node, and a fix patch.
>  I replaced the actual machine names/IPs with SourceMachine001/SourceIP001 and 
> TargetMachine001/TargetIP001.
>  
> How to repro:
> In a well-running set of NNs, take down the active NN (and don't let it come 
> back during the test); the standby NNs will then keep trying the dead (old 
> active) NN, because it is the cached one.






[jira] [Commented] (HADOOP-15107) Stabilize/tune S3A committers; review correctness & docs

2018-08-29 Thread Aaron Fabbri (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596683#comment-16596683
 ] 

Aaron Fabbri commented on HADOOP-15107:
---

I don't want to rob others of the joys of learning the new committers, but I 
can review the code (patch) today.

> Stabilize/tune S3A committers; review correctness & docs
> 
>
> Key: HADOOP-15107
> URL: https://issues.apache.org/jira/browse/HADOOP-15107
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.1.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
> Attachments: HADOOP-15107-001.patch, HADOOP-15107-002.patch, 
> HADOOP-15107-003.patch, HADOOP-15107-004.patch
>
>
> I'm writing the paper on the committers, one which, being a proper 
> paper, requires me to show that the committers work:
> # Define the requirements of a "Correct" committed job (this applies to the 
> FileOutputCommitter too).
> # Show that the Staging committer meets these requirements (most of this is 
> implicit in that it uses the V1 FileOutputCommitter to marshall .pendingset 
> lists from committed tasks to the final destination, where they are read and 
> committed).
> # Show that the magic committer also works.






[jira] [Commented] (HADOOP-15658) Memory leak in S3AOutputStream

2018-08-29 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596667#comment-16596667
 ] 

Steve Loughran commented on HADOOP-15658:
-

I've just moved this to being a 3.3 feature, but if we can get a patch in this 
week then it'll get into 3.2. I don't have time to work on this, but I'm 
willing to review & test patches from others.

> Memory leak in S3AOutputStream
> --
>
> Key: HADOOP-15658
> URL: https://issues.apache.org/jira/browse/HADOOP-15658
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.8.4
>Reporter: Piotr Nowojski
>Priority: Major
>
> S3AOutputStream, by calling 
> {{org.apache.hadoop.fs.s3a.S3AFileSystem#createTmpFileForWrite}}, indirectly 
> calls {{java.io.File#deleteOnExit}} and {{java.io.DeleteOnExitHook}}, which 
> are known to leak memory:
>   
>  [https://bugs.java.com/view_bug.do?bug_id=6664633]
>  [https://bugs.java.com/view_bug.do?bug_id=4872014]
>   
>  Apparently the same bug was even fixed, in an unrelated issue, for a 
> different component a couple of years ago: 
> https://issues.apache.org/jira/browse/HADOOP-8635
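
To make the leak mechanism concrete, a small illustrative sketch (not the S3A 
code): {{deleteOnExit()}} appends each path to a JVM-global set that is only 
processed at shutdown, so a long-lived process that keeps creating temp files 
grows that set without bound. Deleting eagerly avoids this.

{code:java}
import java.io.File;
import java.io.IOException;

class DeleteOnExitLeakSketch {
  // Leaky pattern: every temp file is registered with the global
  // DeleteOnExitHook, which is only drained at JVM shutdown.
  static File leaky(File dir) throws IOException {
    File f = File.createTempFile("s3ablock", ".tmp", dir);
    f.deleteOnExit();
    return f;
  }

  // Safer pattern: delete the file as soon as the work is done,
  // falling back to deleteOnExit() only if the delete fails.
  static void safer(File dir) throws IOException {
    File f = File.createTempFile("s3ablock", ".tmp", dir);
    try {
      // ... write the block and upload it ...
    } finally {
      if (!f.delete()) {
        f.deleteOnExit();
      }
    }
  }
}
{code}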






[jira] [Resolved] (HADOOP-15022) s3guard IT tests increase R/W capacity of the test table by 1

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-15022.
-
Resolution: Duplicate

> s3guard IT tests increase R/W capacity of the test table by 1
> -
>
> Key: HADOOP-15022
> URL: https://issues.apache.org/jira/browse/HADOOP-15022
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.0.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
>
> Just noticed while playing with the CLI that my allocated IOPs were 153; I reset 
> them to 10 R & 10 W; after a few of the IT test runs they are now 13 each.
> Assumption: every test run of the S3Guard CLI is increasing the allocated IO.






[jira] [Updated] (HADOOP-15658) Memory leak in S3AOutputStream

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15658:

Issue Type: Sub-task  (was: Bug)
Parent: HADOOP-15620

> Memory leak in S3AOutputStream
> --
>
> Key: HADOOP-15658
> URL: https://issues.apache.org/jira/browse/HADOOP-15658
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.8.4
>Reporter: Piotr Nowojski
>Priority: Major
>
> S3AOutputStream, by calling 
> {{org.apache.hadoop.fs.s3a.S3AFileSystem#createTmpFileForWrite}}, indirectly 
> calls {{java.io.File#deleteOnExit}} and {{java.io.DeleteOnExitHook}}, which 
> are known to leak memory:
>   
>  [https://bugs.java.com/view_bug.do?bug_id=6664633]
>  [https://bugs.java.com/view_bug.do?bug_id=4872014]
>   
>  Apparently the same bug was even fixed, in an unrelated issue, for a 
> different component a couple of years ago: 
> https://issues.apache.org/jira/browse/HADOOP-8635






[jira] [Commented] (HADOOP-15226) Über-JIRA: S3Guard Phase III: Hadoop 3.2 features

2018-08-29 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596665#comment-16596665
 ] 

Steve Loughran commented on HADOOP-15226:
-

For people watching this: I've just moved all the outstanding stuff which isn't 
ready for review into the 3.3 uber JIRA.

> Über-JIRA: S3Guard Phase III: Hadoop 3.2 features
> -
>
> Key: HADOOP-15226
> URL: https://issues.apache.org/jira/browse/HADOOP-15226
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Steve Loughran
>Priority: Major
>
> S3Guard features/improvements/fixes for Hadoop 3.2






[jira] [Updated] (HADOOP-15667) FileSystemMultipartUploader should verify that UploadHandle has non-0 length

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15667:

Issue Type: Sub-task  (was: Bug)
Parent: HADOOP-15620

> FileSystemMultipartUploader should verify that UploadHandle has non-0 length
> 
>
> Key: HADOOP-15667
> URL: https://issues.apache.org/jira/browse/HADOOP-15667
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HADOOP-15667.001.patch
>
>
> The S3AMultipartUploader has a good check on the length of the UploadHandle. 
> This should be moved to MultipartUploader, made protected, and called in the 
> various implementations.
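
A rough sketch of what the shared check might look like (hypothetical signature, 
inferred from the description rather than taken from the patch):

{code:java}
import java.nio.ByteBuffer;

abstract class MultipartUploaderSketch {
  // Pulled up into the base class so every implementation
  // (filesystem-backed, S3A, ...) validates handles the same way.
  protected void checkUploadHandle(ByteBuffer handle) {
    if (handle == null || handle.remaining() == 0) {
      throw new IllegalArgumentException(
          "Empty UploadHandle: nothing to resume or complete");
    }
  }
}
{code}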






[jira] [Updated] (HADOOP-15667) FileSystemMultipartUploader should verify that UploadHandle has non-0 length

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15667:

Affects Version/s: 3.2.0

> FileSystemMultipartUploader should verify that UploadHandle has non-0 length
> 
>
> Key: HADOOP-15667
> URL: https://issues.apache.org/jira/browse/HADOOP-15667
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HADOOP-15667.001.patch
>
>
> The S3AMultipartUploader has a good check on the length of the UploadHandle. 
> This should be moved to MultipartUploader, made protected, and called in the 
> various implementations.






[jira] [Updated] (HADOOP-15667) FileSystemMultipartUploader should verify that UploadHandle has non-0 length

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15667:

Component/s: fs/s3

> FileSystemMultipartUploader should verify that UploadHandle has non-0 length
> 
>
> Key: HADOOP-15667
> URL: https://issues.apache.org/jira/browse/HADOOP-15667
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HADOOP-15667.001.patch
>
>
> The S3AMultipartUploader has a good check on the length of the UploadHandle. 
> This should be moved to MultipartUploader, made protected, and called in the 
> various implementations.






[jira] [Updated] (HADOOP-15426) Make S3guard client resilient to DDB throttle events and network failures

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15426:

Attachment: HADOOP-15426-007.patch

> Make S3guard client resilient to DDB throttle events and network failures
> -
>
> Key: HADOOP-15426
> URL: https://issues.apache.org/jira/browse/HADOOP-15426
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.1.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
> Attachments: HADOOP-15426-001.patch, HADOOP-15426-002.patch, 
> HADOOP-15426-003.patch, HADOOP-15426-004.patch, HADOOP-15426-005.patch, 
> HADOOP-15426-006.patch, HADOOP-15426-007.patch, Screen Shot 2018-07-24 at 
> 15.16.46.png, Screen Shot 2018-07-25 at 16.22.10.png, Screen Shot 2018-07-25 
> at 16.28.53.png, Screen Shot 2018-07-27 at 14.07.38.png, 
> org.apache.hadoop.fs.s3a.s3guard.ITestDynamoDBMetadataStoreScale-output.txt
>
>
> Managed to create this on a parallel test run:
> {code}
> org.apache.hadoop.fs.s3a.AWSServiceThrottledException: delete on 
> s3a://hwdev-steve-ireland-new/fork-0005/test/existing-dir/existing-file: 
> com.amazonaws.services.dynamodbv2.model.ProvisionedThroughputExceededException:
>  The level of configured provisioned throughput for the table was exceeded. 
> Consider increasing your provisioning level with the UpdateTable API. 
> (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: 
> ProvisionedThroughputExceededException; Request ID: 
> RDM3370REDBBJQ0SLCLOFC8G43VV4KQNSO5AEMVJF66Q9ASUAAJG): The level of 
> configured provisioned throughput for the table was exceeded. Consider 
> increasing your provisioning level with the UpdateTable API. (Service: 
> AmazonDynamoDBv2; Status Code: 400; Error Code: 
> ProvisionedThroughputExceededException; Request ID: 
> RDM3370REDBBJQ0SLCLOFC8G43VV4KQNSO5AEMVJF66Q9ASUAAJG)
>   at 
> {code}
> We should be able to handle this. It's a 400 "bad things happened" error, 
> though, not the 503 from S3.
> h3. We need a retry handler for DDB throttle operations
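
As a sketch of the retry handler being asked for here (illustrative only; the 
real change would plug into S3A's existing retry policies), the idea is 
exponential backoff on the throttle exception:

{code:java}
import java.util.concurrent.Callable;

class ThrottleRetrySketch {
  static <T> T withThrottleRetries(Callable<T> op) throws Exception {
    long sleepMs = 100;
    final int maxAttempts = 7;
    for (int attempt = 1; ; attempt++) {
      try {
        return op.call();
      } catch (Exception e) {
        // Matching by name keeps the sketch dependency-free; real code would
        // catch ProvisionedThroughputExceededException directly.
        boolean throttled = e.getClass().getSimpleName()
            .equals("ProvisionedThroughputExceededException");
        if (!throttled || attempt == maxAttempts) {
          throw e;
        }
        Thread.sleep(sleepMs); // back off before retrying
        sleepMs *= 2;          // exponential growth; jitter would help too
      }
    }
  }
}
{code}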






[jira] [Updated] (HADOOP-15426) Make S3guard client resilient to DDB throttle events and network failures

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15426:

Status: Patch Available  (was: Open)

> Make S3guard client resilient to DDB throttle events and network failures
> -
>
> Key: HADOOP-15426
> URL: https://issues.apache.org/jira/browse/HADOOP-15426
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.1.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
> Attachments: HADOOP-15426-001.patch, HADOOP-15426-002.patch, 
> HADOOP-15426-003.patch, HADOOP-15426-004.patch, HADOOP-15426-005.patch, 
> HADOOP-15426-006.patch, HADOOP-15426-007.patch, Screen Shot 2018-07-24 at 
> 15.16.46.png, Screen Shot 2018-07-25 at 16.22.10.png, Screen Shot 2018-07-25 
> at 16.28.53.png, Screen Shot 2018-07-27 at 14.07.38.png, 
> org.apache.hadoop.fs.s3a.s3guard.ITestDynamoDBMetadataStoreScale-output.txt
>
>
> Managed to create this on a parallel test run:
> {code}
> org.apache.hadoop.fs.s3a.AWSServiceThrottledException: delete on 
> s3a://hwdev-steve-ireland-new/fork-0005/test/existing-dir/existing-file: 
> com.amazonaws.services.dynamodbv2.model.ProvisionedThroughputExceededException:
>  The level of configured provisioned throughput for the table was exceeded. 
> Consider increasing your provisioning level with the UpdateTable API. 
> (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: 
> ProvisionedThroughputExceededException; Request ID: 
> RDM3370REDBBJQ0SLCLOFC8G43VV4KQNSO5AEMVJF66Q9ASUAAJG): The level of 
> configured provisioned throughput for the table was exceeded. Consider 
> increasing your provisioning level with the UpdateTable API. (Service: 
> AmazonDynamoDBv2; Status Code: 400; Error Code: 
> ProvisionedThroughputExceededException; Request ID: 
> RDM3370REDBBJQ0SLCLOFC8G43VV4KQNSO5AEMVJF66Q9ASUAAJG)
>   at 
> {code}
> We should be able to handle this. It's a 400 "bad things happened" error, 
> though, not the 503 from S3.
> h3. We need a retry handler for DDB throttle operations






[jira] [Updated] (HADOOP-15426) Make S3guard client resilient to DDB throttle events and network failures

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15426:

Status: Open  (was: Patch Available)

> Make S3guard client resilient to DDB throttle events and network failures
> -
>
> Key: HADOOP-15426
> URL: https://issues.apache.org/jira/browse/HADOOP-15426
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.1.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
> Attachments: HADOOP-15426-001.patch, HADOOP-15426-002.patch, 
> HADOOP-15426-003.patch, HADOOP-15426-004.patch, HADOOP-15426-005.patch, 
> HADOOP-15426-006.patch, HADOOP-15426-007.patch, Screen Shot 2018-07-24 at 
> 15.16.46.png, Screen Shot 2018-07-25 at 16.22.10.png, Screen Shot 2018-07-25 
> at 16.28.53.png, Screen Shot 2018-07-27 at 14.07.38.png, 
> org.apache.hadoop.fs.s3a.s3guard.ITestDynamoDBMetadataStoreScale-output.txt
>
>
> Managed to create this on a parallel test run:
> {code}
> org.apache.hadoop.fs.s3a.AWSServiceThrottledException: delete on 
> s3a://hwdev-steve-ireland-new/fork-0005/test/existing-dir/existing-file: 
> com.amazonaws.services.dynamodbv2.model.ProvisionedThroughputExceededException:
>  The level of configured provisioned throughput for the table was exceeded. 
> Consider increasing your provisioning level with the UpdateTable API. 
> (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: 
> ProvisionedThroughputExceededException; Request ID: 
> RDM3370REDBBJQ0SLCLOFC8G43VV4KQNSO5AEMVJF66Q9ASUAAJG): The level of 
> configured provisioned throughput for the table was exceeded. Consider 
> increasing your provisioning level with the UpdateTable API. (Service: 
> AmazonDynamoDBv2; Status Code: 400; Error Code: 
> ProvisionedThroughputExceededException; Request ID: 
> RDM3370REDBBJQ0SLCLOFC8G43VV4KQNSO5AEMVJF66Q9ASUAAJG)
>   at 
> {code}
> We should be able to handle this. It's a 400 "bad things happened" error, 
> though, not the 503 from S3.
> h3. We need a retry handler for DDB throttle operations






[jira] [Updated] (HADOOP-14159) Add some Java-8 friendly way to work with RemoteIterable, especially listings

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-14159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-14159:

Priority: Minor  (was: Major)

> Add some Java-8 friendly way to work with RemoteIterable, especially listings
> -
>
> Key: HADOOP-14159
> URL: https://issues.apache.org/jira/browse/HADOOP-14159
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 3.0.0-alpha2
>Reporter: Steve Loughran
>Priority: Minor
>
> There's a fair amount of Hadoop code which uses {{FileSystem.listStatus(path)}} 
> just to get a {{FileStatus[]}} array which it can then iterate over in 
> a {{for}} loop.
> This is inefficient and scales badly, as the entire listing is done before 
> the compute; it cannot handle directories with millions of entries. 
> The listLocatedStatus() calls return a RemoteIterator class, which can't be 
> used in for loops as it has the right to throw an IOE in any hasNext/next 
> call. That doesn't matter, as we now have closures and simple stream 
> operations.
> {code}
>  listLocatedStatus(path).filter((st) -> st.length > 0).apply(st -> 
> fs.delete(st.path))
> {code}
> See? We could do shiny new closure things. It wouldn't necessarily need 
> changes to FileSystem either, just something which took {{RemoteIterator}} 
> and let you chain some closures off it, similar to the java 8 streams 
> operations.
> Once implemented, we can move to using it in the Hadoop code wherever we use 
> listFiles() today.
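
One way such a helper might look: a shim that adapts the iterator to a 
{{java.util.stream.Stream}}, rethrowing the checked IOE as 
{{UncheckedIOException}}. The interface below is a stand-in for 
{{org.apache.hadoop.fs.RemoteIterator}} so the sketch is self-contained.

{code:java}
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

interface RemoteIterator<T> { // stand-in for the Hadoop interface
  boolean hasNext() throws IOException;
  T next() throws IOException;
}

final class RemoteIterators {
  static <T> Stream<T> toStream(RemoteIterator<T> ri) {
    Iterator<T> it = new Iterator<T>() {
      @Override public boolean hasNext() {
        try { return ri.hasNext(); }
        catch (IOException e) { throw new UncheckedIOException(e); }
      }
      @Override public T next() {
        try { return ri.next(); }
        catch (IOException e) { throw new UncheckedIOException(e); }
      }
    };
    return StreamSupport.stream(
        Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED), false);
  }
}
{code}

With a shim like that, the filter/delete chain above becomes an ordinary stream 
pipeline over the listing.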






[jira] [Updated] (HADOOP-14630) Contract Tests to verify create, mkdirs and rename under a file is forbidden

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-14630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-14630:

Status: Patch Available  (was: Open)

patch 004; rebased for trunk

> Contract Tests to verify create, mkdirs and rename under a file is forbidden
> 
>
> Key: HADOOP-14630
> URL: https://issues.apache.org/jira/browse/HADOOP-14630
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs, fs/azure, fs/s3, fs/swift
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Attachments: HADOOP-14630-001.patch, HADOOP-14630-002.patch, 
> HADOOP-14630-003.patch, HADOOP-14630-004.patch
>
>
> Object stores can get into trouble in ways in which an FS never would, ways 
> so obvious we've never written tests for them. We know what the problems are: 
> test for file and dir creation directly/indirectly under other files
> * mkdir(file/file)
> * mkdir(file/subdir)
> * dir under file/subdir/subdir
> * dir/dir2/file, verify dir & dir2 exist
> * dir/dir2/dir3, verify dir & dir2 exist 
> * rename(src, file/dest)
> * rename(src, file/dir/dest)
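
As a sketch, here is roughly what one of these cases (mkdir under a file) could 
look like in the contract-test style; {{getFileSystem()}} and {{path()}} are 
assumed to come from the usual contract test base class:

{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.contract.ContractTestUtils;
import org.junit.Test;
import static org.junit.Assert.fail;

// Inside a class extending the contract test base:
@Test
public void testMkdirsUnderFileFails() throws Throwable {
  FileSystem fs = getFileSystem();
  Path file = path("testMkdirsUnderFile/file");
  ContractTestUtils.touch(fs, file); // create the blocking file
  try {
    fs.mkdirs(new Path(file, "subdir"));
    fail("expected mkdirs under a file to be rejected");
  } catch (IOException expected) {
    // expected, e.g. ParentNotDirectoryException
  }
}
{code}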






[jira] [Updated] (HADOOP-14630) Contract Tests to verify create, mkdirs and rename under a file is forbidden

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-14630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-14630:

Attachment: HADOOP-14630-004.patch

> Contract Tests to verify create, mkdirs and rename under a file is forbidden
> 
>
> Key: HADOOP-14630
> URL: https://issues.apache.org/jira/browse/HADOOP-14630
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs, fs/azure, fs/s3, fs/swift
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Attachments: HADOOP-14630-001.patch, HADOOP-14630-002.patch, 
> HADOOP-14630-003.patch, HADOOP-14630-004.patch
>
>
> Object stores can get into trouble in ways in which an FS never would, ways 
> so obvious we've never written tests for them. We know what the problems are: 
> test for file and dir creation directly/indirectly under other files
> * mkdir(file/file)
> * mkdir(file/subdir)
> * dir under file/subdir/subdir
> * dir/dir2/file, verify dir & dir2 exist
> * dir/dir2/dir3, verify dir & dir2 exist 
> * rename(src, file/dest)
> * rename(src, file/dir/dest)






[jira] [Updated] (HADOOP-14630) Contract Tests to verify create, mkdirs and rename under a file is forbidden

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-14630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-14630:

Target Version/s:   (was: 3.2.0)

> Contract Tests to verify create, mkdirs and rename under a file is forbidden
> 
>
> Key: HADOOP-14630
> URL: https://issues.apache.org/jira/browse/HADOOP-14630
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs, fs/azure, fs/s3, fs/swift
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Attachments: HADOOP-14630-001.patch, HADOOP-14630-002.patch, 
> HADOOP-14630-003.patch
>
>
> Object stores can get into trouble in ways in which an FS never would, ways 
> so obvious we've never written tests for them. We know what the problems are: 
> test for file and dir creation directly/indirectly under other files
> * mkdir(file/file)
> * mkdir(file/subdir)
> * dir under file/subdir/subdir
> * dir/dir2/file, verify dir & dir2 exist
> * dir/dir2/dir3, verify dir & dir2 exist 
> * rename(src, file/dest)
> * rename(src, file/dir/dest)






[jira] [Updated] (HADOOP-15628) S3A Filesystem does not check return from AmazonS3Client deleteObjects

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15628:

Issue Type: Sub-task  (was: Bug)
Parent: HADOOP-15220

> S3A Filesystem does not check return from AmazonS3Client deleteObjects
> --
>
> Key: HADOOP-15628
> URL: https://issues.apache.org/jira/browse/HADOOP-15628
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.9.1, 2.8.4, 3.1.1, 3.0.3
> Environment: Hadoop 3.0.2 / Hadoop 2.8.3
> Hive 2.3.2 / Hive 2.3.3 / Hive 3.0.0
> Non-AWS S3 implementation
>Reporter: Steve Jacobs
>Priority: Minor
>
> Deletes in S3A that use the Multi-Delete functionality in the Amazon S3 API 
> do not check whether all objects have been successfully deleted. In the event 
> of a failure, the API will still return a 200 OK (which isn't currently 
> checked):
> [Delete Code from Hadoop 
> 2.8|https://github.com/apache/hadoop/blob/a0da1ec01051108b77f86799dd5e97563b2a3962/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L574]
>  
> {code:java}
> if (keysToDelete.size() == MAX_ENTRIES_TO_DELETE) {
> DeleteObjectsRequest deleteRequest =
> new DeleteObjectsRequest(bucket).withKeys(keysToDelete);
> s3.deleteObjects(deleteRequest);
> statistics.incrementWriteOps(1);
> keysToDelete.clear();
> }
> {code}
> This should be converted to use the DeleteObjectsResult class from the 
> S3Client: 
> [Amazon Code 
> Example|https://docs.aws.amazon.com/AmazonS3/latest/dev/DeletingMultipleObjectsUsingJava.htm]
> {code:java}
> // Verify that the objects were deleted successfully.
> DeleteObjectsResult delObjRes = 
> s3Client.deleteObjects(multiObjectDeleteRequest); int successfulDeletes = 
> delObjRes.getDeletedObjects().size();
> System.out.println(successfulDeletes + " objects successfully deleted.");
> {code}
> Bucket policies can be misconfigured, and deletes by S3A clients will then 
> fail without warning.
>  
>  
>  
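
A sketch of the verification using those v1 SDK types (illustrative; this 
fragment assumes the {{s3}}, {{deleteRequest}} and {{keysToDelete}} variables 
from the code above, plus imports of java.util.* and java.io.IOException):

{code:java}
// Compare what we asked to delete with what the service reports as deleted,
// and surface any misses instead of silently ignoring them.
DeleteObjectsResult result = s3.deleteObjects(deleteRequest);
Set<String> deleted = new HashSet<>();
for (DeleteObjectsResult.DeletedObject d : result.getDeletedObjects()) {
  deleted.add(d.getKey());
}
List<String> missed = new ArrayList<>();
for (DeleteObjectsRequest.KeyVersion kv : keysToDelete) {
  if (!deleted.contains(kv.getKey())) {
    missed.add(kv.getKey());
  }
}
if (!missed.isEmpty()) {
  throw new IOException("Delete incomplete: " + missed.size()
      + " object(s) not deleted, first: " + missed.get(0));
}
{code}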






[jira] [Commented] (HADOOP-13278) S3AFileSystem mkdirs does not need to validate parent path components

2018-08-29 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-13278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596636#comment-16596636
 ] 

Steve Loughran commented on HADOOP-13278:
-

Moving to branch-3.3

As noted, we do want to check up the path, so the current PR isn't going to 
work. The one thing we can do is handle permissions:

# If, during that walk up the tree, an {{AccessDeniedException}} is raised, it 
can be caught and used to indicate that "you can't do anything up there", and 
the mkdirs simply assumes that all is good.
# We will need a test in org.apache.hadoop.fs.s3a.auth.ITestAssumeRole which 
creates a role with the restricted permissions (skipped if s3guard is enabled, 
BTW), and then verifies that mkdirs(/a/b/c) succeeds even as 
getFileStatus("a") fails because A is blocked. 

I've not got time to work on this; postponing to 3.3+. Contributions *with that 
test* are welcome.
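
A sketch of that idea (hypothetical, not the actual patch): keep probing 
parents, but treat "permission denied" as "stop probing and let the mkdirs 
proceed".

{code:java}
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.AccessDeniedException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.ParentNotDirectoryException;
import org.apache.hadoop.fs.Path;

class MkdirsPermissionSketch {
  // Probe parents, but treat "permission denied" as "stop probing and let
  // the mkdirs proceed" rather than failing the whole operation.
  static void verifyParents(FileSystem fs, Path dir) throws IOException {
    for (Path p = dir.getParent(); p != null && !p.isRoot(); p = p.getParent()) {
      try {
        if (fs.getFileStatus(p).isFile()) {
          throw new ParentNotDirectoryException(p.toString());
        }
      } catch (FileNotFoundException e) {
        // nothing there: fine for an object store, keep walking up
      } catch (AccessDeniedException e) {
        // caller can't do anything up there; assume all is good and stop
        return;
      }
    }
  }
}
{code}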

> S3AFileSystem mkdirs does not need to validate parent path components
> -
>
> Key: HADOOP-13278
> URL: https://issues.apache.org/jira/browse/HADOOP-13278
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3, tools
>Reporter: Adrian Petrescu
>Priority: Minor
>
> According to S3 semantics, there is no conflict if a bucket contains a key 
> named {{a/b}} and also a directory named {{a/b/c}}. "Directories" in S3 are, 
> after all, nothing but prefixes.
> However, the {{mkdirs}} call in {{S3AFileSystem}} does go out of its way to 
> traverse every parent path component for the directory it's trying to create, 
> making sure there's no file with that name. This is suboptimal for three main 
> reasons:
>  * Wasted API calls, since the client is getting metadata for each path 
> component 
>  * This can cause *major* problems with buckets whose permissions are being 
> managed by IAM, where access may not be granted to the root bucket, but only 
> to some prefix. When you call {{mkdirs}}, even on a prefix that you have 
> access to, the traversal up the path will cause you to eventually hit the 
> root bucket, which will fail with a 403 - even though the directory creation 
> call would have succeeded.
>  * Some people might actually have a file that matches some other file's 
> prefix... I can't see why they would want to do that, but it's not against 
> S3's rules.
> I've opened a pull request with a simple patch that just removes this portion 
> of the check. I have tested it with my team's instance of Spark + Luigi, and 
> can confirm it works, and resolves the aforementioned permissions issue for a 
> bucket on which we only had prefix access.
> This is my first ticket/pull request against Hadoop, so let me know if I'm 
> not following some convention properly :)






[jira] [Resolved] (HADOOP-14483) increase default value of fs.s3a.multipart.size to 128M

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-14483.
-
Resolution: Won't Fix

> increase default value of fs.s3a.multipart.size to 128M
> ---
>
> Key: HADOOP-14483
> URL: https://issues.apache.org/jira/browse/HADOOP-14483
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Priority: Minor
>
> increment the default value of {{fs.s3a.multipart.size}} from "100M" to 
> "128M".
> Why? AWS S3 throttles clients making too many requests; going to a larger 
> size will reduce this. Also: document the issue






[jira] [Updated] (HADOOP-14483) increase default value of fs.s3a.multipart.size to 128M

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-14483:

Parent Issue: HADOOP-15220  (was: HADOOP-15620)

> increase default value of fs.s3a.multipart.size to 128M
> ---
>
> Key: HADOOP-14483
> URL: https://issues.apache.org/jira/browse/HADOOP-14483
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Priority: Minor
>
> increment the default value of {{fs.s3a.multipart.size}} from "100M" to 
> "128M".
> Why? AWS S3 throttles clients making too many requests; going to a larger 
> size will reduce this. Also: document the issue






[jira] [Commented] (HADOOP-14483) increase default value of fs.s3a.multipart.size to 128M

2018-08-29 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596621#comment-16596621
 ] 

Steve Loughran commented on HADOOP-14483:
-

I'm going to close as WONTFIX. There are some subtle changes if you do expand 
the size; in particular, the time to close() a file can increase, as you can 
have up to (128 x 2^20 - 1) bytes waiting for upload. Safest to leave it as is 
and let people tune.

> increase default value of fs.s3a.multipart.size to 128M
> ---
>
> Key: HADOOP-14483
> URL: https://issues.apache.org/jira/browse/HADOOP-14483
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Priority: Minor
>
> increment the default value of {{fs.s3a.multipart.size}} from "100M" to 
> "128M".
> Why? AWS S3 throttles clients making too many requests; going to a larger 
> size will reduce this. Also: document the issue






[jira] [Updated] (HADOOP-14483) increase default value of fs.s3a.multipart.size to 128M

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-14483:

Target Version/s:   (was: 3.3.0)

> increase default value of fs.s3a.multipart.size to 128M
> ---
>
> Key: HADOOP-14483
> URL: https://issues.apache.org/jira/browse/HADOOP-14483
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Priority: Minor
>
> increment the default value of {{fs.s3a.multipart.size}} from "100M" to 
> "128M".
> Why? AWS S3 throttles clients making too many requests; going to a larger 
> size will reduce this. Also: document the issue






[jira] [Commented] (HADOOP-15107) Stabilize/tune S3A committers; review correctness & docs

2018-08-29 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596618#comment-16596618
 ] 

Steve Loughran commented on HADOOP-15107:
-

I want this to go into Hadoop 3.2; there's no significant change in semantics 
here other than better resilience, error reporting and a quieter abort phase.

Can I get some reviews? This is a great opportunity to learn about the commit 
mechanism.

> Stabilize/tune S3A committers; review correctness & docs
> 
>
> Key: HADOOP-15107
> URL: https://issues.apache.org/jira/browse/HADOOP-15107
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.1.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
> Attachments: HADOOP-15107-001.patch, HADOOP-15107-002.patch, 
> HADOOP-15107-003.patch, HADOOP-15107-004.patch
>
>
> I'm writing the paper on the committers, one which, being a proper 
> paper, requires me to show that the committers work:
> # Define the requirements of a "Correct" committed job (this applies to the 
> FileOutputCommitter too).
> # Show that the Staging committer meets these requirements (most of this is 
> implicit in that it uses the V1 FileOutputCommitter to marshall .pendingset 
> lists from committed tasks to the final destination, where they are read and 
> committed).
> # Show that the magic committer also works.






[jira] [Updated] (HADOOP-15107) Stabilize/tune S3A committers; review correctness & docs

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15107:

Priority: Blocker  (was: Major)

> Stabilize/tune S3A committers; review correctness & docs
> 
>
> Key: HADOOP-15107
> URL: https://issues.apache.org/jira/browse/HADOOP-15107
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.1.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
> Attachments: HADOOP-15107-001.patch, HADOOP-15107-002.patch, 
> HADOOP-15107-003.patch, HADOOP-15107-004.patch
>
>
> I'm writing the paper on the committers, one which, being a proper 
> paper, requires me to show that the committers work:
> # Define the requirements of a "Correct" committed job (this applies to the 
> FileOutputCommitter too).
> # Show that the Staging committer meets these requirements (most of this is 
> implicit in that it uses the V1 FileOutputCommitter to marshall .pendingset 
> lists from committed tasks to the final destination, where they are read and 
> committed).
> # Show that the magic committer also works.






[jira] [Updated] (HADOOP-14833) Remove s3a user:secret authentication

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-14833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-14833:

Target Version/s:   (was: 3.2.0)

> Remove s3a user:secret authentication
> -
>
> Key: HADOOP-14833
> URL: https://issues.apache.org/jira/browse/HADOOP-14833
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0-beta1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Attachments: HADOOP-14833-001.patch
>
>
> Remove the s3a://user:secret@host auth mechanism from S3a. 
> As well as being insecure, it causes problems with S3Guard's URI matching 
> code.
> Proposed: cull it utterly. We've been telling people to stop using it since 
> HADOOP-3733






[jira] [Updated] (HADOOP-14468) S3Guard: make short-circuit getFileStatus() configurable

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-14468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-14468:

Parent Issue: HADOOP-15619  (was: HADOOP-15226)

> S3Guard: make short-circuit getFileStatus() configurable
> 
>
> Key: HADOOP-14468
> URL: https://issues.apache.org/jira/browse/HADOOP-14468
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0-beta1
>Reporter: Aaron Fabbri
>Assignee: Aaron Fabbri
>Priority: Minor
>
> Currently, when S3Guard is enabled, getFileStatus() will skip S3 if it gets a 
> result from the MetadataStore (e.g. dynamodb) first.
> I would like to add a new parameter 
> {{fs.s3a.metadatastore.getfilestatus.authoritative}} which, when true, keeps 
> the current behavior.  When false, S3AFileSystem will check both S3 and the 
> MetadataStore.
> I'm not sure yet if we want to have this behavior the same for all callers of 
> getFileStatus(), or if we only want to check both S3 and MetadataStore for 
> some internal callers such as open().






[jira] [Updated] (HADOOP-15183) S3Guard store becomes inconsistent after partial failure of rename

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15183:

Target Version/s:   (was: 3.2.0)

> S3Guard store becomes inconsistent after partial failure of rename
> --
>
> Key: HADOOP-15183
> URL: https://issues.apache.org/jira/browse/HADOOP-15183
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0
>Reporter: Steve Loughran
>Priority: Blocker
> Attachments: HADOOP-15183-001.patch, HADOOP-15183-002.patch, 
> org.apache.hadoop.fs.s3a.auth.ITestAssumeRole-output.txt
>
>
> If an S3A rename() operation fails partway through, such as when the user 
> doesn't have permissions to delete the source files after copying to the 
> destination, then the s3guard view of the world ends up inconsistent. In 
> particular, the sequence
>  (assuming src/file* is a list of files file1...file10, read-only to the 
> caller):
>
> # create file rename src/file1 dest/ ; expect AccessDeniedException in the 
> delete, dest/file1 will exist
> # delete file dest/file1
> # rename src/file* dest/  ; expect failure
> # list dest; you will not see dest/file1
> You will not see file1 in the listing, presumably because it will have a 
> tombstone marker and the update at the end of the rename() didn't take place: 
> the old data is still there.






[jira] [Updated] (HADOOP-15183) S3Guard store becomes inconsistent after partial failure of rename

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15183:

Parent Issue: HADOOP-15619  (was: HADOOP-15226)

> S3Guard store becomes inconsistent after partial failure of rename
> --
>
> Key: HADOOP-15183
> URL: https://issues.apache.org/jira/browse/HADOOP-15183
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0
>Reporter: Steve Loughran
>Priority: Blocker
> Attachments: HADOOP-15183-001.patch, HADOOP-15183-002.patch, 
> org.apache.hadoop.fs.s3a.auth.ITestAssumeRole-output.txt
>
>
> If an S3A rename() operation fails partway through, such as when the user 
> doesn't have permissions to delete the source files after copying to the 
> destination, then the s3guard view of the world ends up inconsistent. In 
> particular, the sequence
>  (assuming src/file* is a list of files file1...file10, read-only to the 
> caller):
>
> # create file rename src/file1 dest/ ; expect AccessDeniedException in the 
> delete, dest/file1 will exist
> # delete file dest/file1
> # rename src/file* dest/  ; expect failure
> # list dest; you will not see dest/file1
> You will not see file1 in the listing, presumably because it will have a 
> tombstone marker and the update at the end of the rename() didn't take place: 
> the old data is still there.






[jira] [Updated] (HADOOP-13936) S3Guard: DynamoDB can go out of sync with S3AFileSystem::delete operation

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-13936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-13936:

Target Version/s:   (was: 3.2.0)

> S3Guard: DynamoDB can go out of sync with S3AFileSystem::delete operation
> -
>
> Key: HADOOP-13936
> URL: https://issues.apache.org/jira/browse/HADOOP-13936
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0-beta1, 3.1.0, 3.1.1
>Reporter: Rajesh Balamohan
>Assignee: Steve Loughran
>Priority: Blocker
>
> As a part of the {{S3AFileSystem.delete}} operation, {{innerDelete}} is 
> invoked, which deletes keys from S3 in batches (the default is 1000). But 
> DynamoDB is updated only at the end of this operation. This can cause issues 
> when deleting a large number of keys. 
> E.g., it is possible to get an exception after deleting 1000 keys, and in 
> such cases DynamoDB would not be updated and would go out of 
> sync.
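
A sketch of the interleaving idea ({{metadataStore.deleteBatch}} is a 
hypothetical helper, named here only for illustration): sync the store after 
each S3 batch, so a mid-operation failure leaves at most one batch of divergence.

{code:java}
// Fragment, assuming s3, bucket, metadataStore and MAX_ENTRIES_TO_DELETE
// from the surrounding S3AFileSystem-style code (plus java.util imports):
List<DeleteObjectsRequest.KeyVersion> batch =
    new ArrayList<>(MAX_ENTRIES_TO_DELETE);
for (DeleteObjectsRequest.KeyVersion key : allKeys) {
  batch.add(key);
  if (batch.size() == MAX_ENTRIES_TO_DELETE) {
    s3.deleteObjects(new DeleteObjectsRequest(bucket).withKeys(batch));
    metadataStore.deleteBatch(batch); // hypothetical: update DDB per batch
    batch.clear();
  }
}
// ... handle the final partial batch the same way ...
{code}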






[jira] [Updated] (HADOOP-13936) S3Guard: DynamoDB can go out of sync with S3AFileSystem::delete operation

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-13936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-13936:

Parent Issue: HADOOP-15619  (was: HADOOP-15226)

> S3Guard: DynamoDB can go out of sync with S3AFileSystem::delete operation
> -
>
> Key: HADOOP-13936
> URL: https://issues.apache.org/jira/browse/HADOOP-13936
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0-beta1, 3.1.0, 3.1.1
>Reporter: Rajesh Balamohan
>Assignee: Steve Loughran
>Priority: Blocker
>
> As a part of the {{S3AFileSystem.delete}} operation, {{innerDelete}} is 
> invoked, which deletes keys from S3 in batches (the default is 1000). But 
> DynamoDB is updated only at the end of this operation. This can cause issues 
> when deleting a large number of keys. 
> E.g., it is possible to get an exception after deleting 1000 keys, and in 
> such cases DynamoDB would not be updated and would go out of 
> sync.






[jira] [Updated] (HADOOP-15664) ABFS: Reduce test run time via parallelization and grouping

2018-08-29 Thread Da Zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Da Zhou updated HADOOP-15664:
-
Status: Patch Available  (was: Open)

> ABFS: Reduce test run time via parallelization and grouping
> ---
>
> Key: HADOOP-15664
> URL: https://issues.apache.org/jira/browse/HADOOP-15664
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Reporter: Thomas Marquardt
>Assignee: Da Zhou
>Priority: Major
> Attachments: HADOOP-15664-HADOOP-15407-001.patch, 
> HADOOP-15664-HADOOP-15407-002.patch
>
>
> 1) Let's reduce the total test runtime by improving parallelization of the 
> tests.
> 2) Let's make it possible to select WASB tests, ABFS tests, or both so 
> developers can run only the tests appropriate for the change they've made.
> 3) Update the testing-azure.md accordingly






[jira] [Updated] (HADOOP-14158) Possible for modified configuration to leak into metadatastore in S3GuardTool

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-14158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-14158:

Parent Issue: HADOOP-15619  (was: HADOOP-15226)

> Possible for modified configuration to leak into metadatastore in S3GuardTool
> -
>
> Key: HADOOP-14158
> URL: https://issues.apache.org/jira/browse/HADOOP-14158
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0-beta1
>Reporter: Sean Mackrory
>Priority: Minor
>
> It doesn't appear to happen when run from the command line, but when calling 
> S3GuardTool.run (i.e. the parent function of most of the functions used 
> in the unit tests) from a unit test, you end up with a NullMetadataStore, 
> regardless of what else was configured.
> We create an instance of S3AFileSystem with the metadata store implementation 
> overridden to NullMetadataStore so that we have distinct interfaces to S3 and 
> the metadata store. S3Guard can later be called using this filesystem, 
> causing it to pick up the filesystem's configuration, which instructs it to 
> use the NullMetadataStore implementation. This shouldn't be possible.
> It is unknown if this happens in any real-world scenario - I've been unable 
> to reproduce the problem from the command line. But it definitely happens in 
> a test, where it shouldn't, and fixing this will at least allow HADOOP-14145 
> to have an automated test.






[jira] [Commented] (HADOOP-15664) ABFS: Reduce test run time via parallelization and grouping

2018-08-29 Thread Da Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596607#comment-16596607
 ] 

Da Zhou commented on HADOOP-15664:
--

Attaching HADOOP-15664-HADOOP-15407-002.patch:
  - Fixed import order in ITestNativeFileSystemStatistics

> ABFS: Reduce test run time via parallelization and grouping
> ---
>
> Key: HADOOP-15664
> URL: https://issues.apache.org/jira/browse/HADOOP-15664
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Reporter: Thomas Marquardt
>Assignee: Da Zhou
>Priority: Major
> Attachments: HADOOP-15664-HADOOP-15407-001.patch, 
> HADOOP-15664-HADOOP-15407-002.patch
>
>
> 1) Let's reduce the total test runtime by improving parallelization of the 
> tests.
> 2) Let's make it possible to select WASB tests, ABFS tests, or both so 
> developers can run only the tests appropriate for the change they've made.
> 3) Update the testing-azure.md accordingly






[jira] [Updated] (HADOOP-15664) ABFS: Reduce test run time via parallelization and grouping

2018-08-29 Thread Da Zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Da Zhou updated HADOOP-15664:
-
Attachment: HADOOP-15664-HADOOP-15407-002.patch

> ABFS: Reduce test run time via parallelization and grouping
> ---
>
> Key: HADOOP-15664
> URL: https://issues.apache.org/jira/browse/HADOOP-15664
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Reporter: Thomas Marquardt
>Assignee: Da Zhou
>Priority: Major
> Attachments: HADOOP-15664-HADOOP-15407-001.patch, 
> HADOOP-15664-HADOOP-15407-002.patch
>
>
> 1) Let's reduce the total test runtime by improving parallelization of the 
> tests.
> 2) Let's make it possible to select WASB tests, ABFS tests, or both so 
> developers can run only the tests appropriate for the change they've made.
> 3) Update the testing-azure.md accordingly






[jira] [Commented] (HADOOP-15628) S3A Filesystem does not check return from AmazonS3Client deleteObjects

2018-08-29 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596602#comment-16596602
 ] 

Steve Loughran commented on HADOOP-15628:
-

[~steveatbat]: if you can do a patch for this ASAP, we can still get it into 
3.2. I don't see how it can be tested, other than regression testing & review.

> S3A Filesystem does not check return from AmazonS3Client deleteObjects
> --
>
> Key: HADOOP-15628
> URL: https://issues.apache.org/jira/browse/HADOOP-15628
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.9.1, 2.8.4, 3.1.1, 3.0.3
> Environment: Hadoop 3.0.2 / Hadoop 2.8.3
> Hive 2.3.2 / Hive 2.3.3 / Hive 3.0.0
> Non-AWS S3 implementation
>Reporter: Steve Jacobs
>Priority: Minor
>
> Deletes in S3A that use the Multi-Delete functionality in the Amazon S3 API 
> do not check whether all objects have been successfully deleted. In the event 
> of a failure, the API will still return a 200 OK (which isn't currently 
> checked):
> [Delete Code from Hadoop 
> 2.8|https://github.com/apache/hadoop/blob/a0da1ec01051108b77f86799dd5e97563b2a3962/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L574]
>  
> {code:java}
> if (keysToDelete.size() == MAX_ENTRIES_TO_DELETE) {
> DeleteObjectsRequest deleteRequest =
> new DeleteObjectsRequest(bucket).withKeys(keysToDelete);
> s3.deleteObjects(deleteRequest);
> statistics.incrementWriteOps(1);
> keysToDelete.clear();
> }
> {code}
> This should be converted to use the DeleteObjectsResult class from the 
> S3Client: 
> [Amazon Code 
> Example|https://docs.aws.amazon.com/AmazonS3/latest/dev/DeletingMultipleObjectsUsingJava.htm]
> {code:java}
> // Verify that the objects were deleted successfully.
> DeleteObjectsResult delObjRes = 
> s3Client.deleteObjects(multiObjectDeleteRequest); int successfulDeletes = 
> delObjRes.getDeletedObjects().size();
> System.out.println(successfulDeletes + " objects successfully deleted.");
> {code}
> Bucket policies can be misconfigured, and deletes by S3A clients will then 
> fail without warning.
>  
>  
>  






[jira] [Updated] (HADOOP-15244) s3guard uploads command to add a way to complete outstanding uploads

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15244:

Parent Issue: HADOOP-15220  (was: HADOOP-15620)

> s3guard uploads command to add a way to complete outstanding uploads
> 
>
> Key: HADOOP-15244
> URL: https://issues.apache.org/jira/browse/HADOOP-15244
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.1.0
>Reporter: Steve Loughran
>Priority: Minor
>
> The AWS API lets you not only list & cancel outstanding uploads (as {{s3guard 
> uploads}} does), but actually list the parts.
> We may be able to complete an outstanding upload through the CLI.
> What would that do? It'd let you restore all but the last block of any logs 
> being written to s3 where the app/VM failed before the upload was completed.
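As a rough sketch of what completion could look like with the AWS SDK v1 (there is no such s3guard command yet; the class name and the omission of paging are simplifications):

{code:java}
import java.util.ArrayList;
import java.util.List;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.CompleteMultipartUploadRequest;
import com.amazonaws.services.s3.model.ListPartsRequest;
import com.amazonaws.services.s3.model.PartETag;
import com.amazonaws.services.s3.model.PartListing;
import com.amazonaws.services.s3.model.PartSummary;

public class CompleteUploadSketch {
  /** List the uploaded parts, then complete the multipart upload from them. */
  static void complete(AmazonS3 s3, String bucket, String key, String uploadId) {
    PartListing parts =
        s3.listParts(new ListPartsRequest(bucket, key, uploadId));
    List<PartETag> etags = new ArrayList<>();
    for (PartSummary part : parts.getParts()) {
      etags.add(new PartETag(part.getPartNumber(), part.getETag()));
    }
    // A full implementation would page through a truncated PartListing.
    s3.completeMultipartUpload(
        new CompleteMultipartUploadRequest(bucket, key, uploadId, etags));
  }
}
{code}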






[jira] [Updated] (HADOOP-15244) s3guard uploads command to add a way to complete outstanding uploads

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15244:

Target Version/s:   (was: 3.3.0)

> s3guard uploads command to add a way to complete outstanding uploads
> 
>
> Key: HADOOP-15244
> URL: https://issues.apache.org/jira/browse/HADOOP-15244
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.1.0
>Reporter: Steve Loughran
>Priority: Minor
>
> The AWS API lets you not only list & cancel outstanding uploads (as {{s3guard 
> uploads}} does), but actually list the parts.
> We may be able to complete an outstanding upload through the CLI.
> What would that do? It'd let you restore all but the last block of any logs 
> being written to s3 where the app/VM failed before the upload was completed.






[jira] [Updated] (HADOOP-14833) Remove s3a user:secret authentication

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-14833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-14833:

Parent: HADOOP-15620  (was: HADOOP-15220)

> Remove s3a user:secret authentication
> -
>
> Key: HADOOP-14833
> URL: https://issues.apache.org/jira/browse/HADOOP-14833
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0-beta1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Attachments: HADOOP-14833-001.patch
>
>
> Remove the s3a://user:secret@host auth mechanism from S3a. 
> As well as being insecure, it causes problems with S3Guard's URI matching 
> code.
> Proposed: cull it utterly. We've been telling people to stop using it since 
> HADOOP-3733






[jira] [Updated] (HADOOP-15409) S3AFileSystem.verifyBucketExists to move to s3.doesBucketExistV2

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15409:

Parent: HADOOP-15620  (was: HADOOP-15220)

> S3AFileSystem.verifyBucketExists to move to s3.doesBucketExistV2
> 
>
> Key: HADOOP-15409
> URL: https://issues.apache.org/jira/browse/HADOOP-15409
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.1.0
>Reporter: Steve Loughran
>Priority: Blocker
>
> In S3AFileSystem.initialize(), we check that the bucket exists with 
> verifyBucketExists(), which calls s3.doesBucketExist(). But that doesn't 
> check for auth issues. 
> s3.doesBucketExistV2() does at least validate credentials, and we should 
> switch to it. This will help things fail faster.
> See SPARK-24000
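A minimal sketch of the proposed switch, assuming the AWS SDK v1 client which S3AFileSystem holds (the exception translation here is deliberately simplified):

{code:java}
import java.io.FileNotFoundException;
import java.io.IOException;

import com.amazonaws.AmazonClientException;
import com.amazonaws.services.s3.AmazonS3;

public class BucketCheckSketch {
  /** Fail fast when the bucket is missing or the credentials are rejected. */
  static void verifyBucketExists(AmazonS3 s3, String bucket) throws IOException {
    try {
      // doesBucketExistV2 validates credentials rather than treating
      // "403 forbidden" as proof that the bucket exists.
      if (!s3.doesBucketExistV2(bucket)) {
        throw new FileNotFoundException("Bucket " + bucket + " does not exist");
      }
    } catch (AmazonClientException e) {
      throw new IOException("Unable to verify bucket " + bucket, e);
    }
  }
}
{code}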






[jira] [Updated] (HADOOP-15651) NPE in S3AInputStream.read() in ITestS3AInconsistency.testOpenFailOnRead

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15651:

Parent: HADOOP-15620  (was: HADOOP-15220)

> NPE in S3AInputStream.read() in ITestS3AInconsistency.testOpenFailOnRead
> 
>
> Key: HADOOP-15651
> URL: https://issues.apache.org/jira/browse/HADOOP-15651
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Priority: Major
> Attachments: org.apache.hadoop.fs.s3a.ITestS3AInconsistency-output.txt
>
>
> Test {{ITestS3AInconsistency.testOpenFailOnRead()}} raise an NPE in read(); 
> could only happen on that line if {{wrappedStream==null}}, which implies that 
> previous attempts to re-open the closed stream failed and that this didn't 
> trigger anything.
> Not seen this before, but given that the fault injection is random, it may be 
> that so far test runs have been *unlucky* and missed this.






[jira] [Updated] (HADOOP-14833) Remove s3a user:secret authentication

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-14833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-14833:

Priority: Major  (was: Blocker)

> Remove s3a user:secret authentication
> -
>
> Key: HADOOP-14833
> URL: https://issues.apache.org/jira/browse/HADOOP-14833
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0-beta1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Attachments: HADOOP-14833-001.patch
>
>
> Remove the s3a://user:secret@host auth mechanism from S3a. 
> As well as being insecure, it causes problems with S3Guard's URI matching 
> code.
> Proposed: cull it utterly. We've been telling people to stop using it since 
> HADOOP-3733






[jira] [Commented] (HADOOP-15684) triggerActiveLogRoll stuck on dead name node, when ConnectTimeoutException happens.

2018-08-29 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HADOOP-15684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596546#comment-16596546
 ] 

Íñigo Goiri commented on HADOOP-15684:
--

We would need a review from somebody familiar with HA and the 2+ NNs code.
Anyone available for review?

[~trjianjianjiao], meanwhile do you mind checking the checkstyle warning and 
taking a look at TestJournalNodeSync?
The other unit test failures seem to be the typical ones.

> triggerActiveLogRoll stuck on dead name node, when ConnectTimeoutException 
> happens. 
> 
>
> Key: HADOOP-15684
> URL: https://issues.apache.org/jira/browse/HADOOP-15684
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: ha
>Affects Versions: 3.0.0-alpha1
>Reporter: Rong Tang
>Assignee: Rong Tang
>Priority: Critical
> Attachments: 
> 0001-RollEditLog-try-next-NN-when-exception-happens.patch, 
> HADOOP-15684.000.patch, HADOOP-15684.001.patch, HADOOP-15684.002.patch, 
> hadoop--rollingUpgrade-SourceMachine001.log
>
>
> When the name node calls triggerActiveLogRoll and the cachedActiveProxy is a dead 
> name node, it throws a ConnectTimeoutException. The expected behavior is to 
> try the next NN, but the current logic doesn't do so; instead, it keeps trying the 
> dead one, mistakenly taking it as active.
>  
> 2018-08-17 10:02:12,001 WARN [Edit log tailer] 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Unable to trigger a 
> roll of the active NN
> org.apache.hadoop.net.ConnectTimeoutException: Call From 
> SourceMachine001/SourceIP001 to TargetMachine001.ap.gbl:8020 failed on socket 
> timeout exception: org.apache.hadoop.net.ConnectTimeoutException: 2 
> millis timeout 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$2.doWork(EditLogTailer.java:298)
>  
> C:\Users\rotang>ping TargetMachine001
> Pinging TargetMachine001[TargetIP001] with 32 bytes of data:
>  Request timed out.
>  Request timed out.
>  Request timed out.
>  Request timed out.
>  The attachments are a log file showing how it repeatedly retries a dead name node, 
> and a fix patch.
>  I replaced the actual machine name/ip as SourceMachine001/SourceIP001 and 
> TargetMachine001/TargetIP001.
>  
> How to Repro:
> With a healthy set of running NNs, take down the active NN (don't let it come back 
> during the test); the standby NNs will then keep trying the dead (old active) NN, 
> because it is the cached one.
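For illustration, a generic sketch of the retry idea (these types are placeholders, not the real EditLogTailer internals): iterate over all candidate NN proxies instead of pinning to the cached one.

{code:java}
import java.io.IOException;
import java.util.List;

/** Placeholder for the per-NN proxy used to roll edit logs. */
interface LogRollProxy {
  void rollEditLog() throws IOException;
}

final class LogRollSketch {
  /** Try each NN in turn; fail only if every one is unreachable. */
  static void triggerActiveLogRoll(List<LogRollProxy> proxies)
      throws IOException {
    IOException last = null;
    for (LogRollProxy proxy : proxies) {
      try {
        proxy.rollEditLog();
        return;                    // success, stop trying
      } catch (IOException e) {    // e.g. ConnectTimeoutException on a dead NN
        last = e;                  // remember and advance to the next NN
      }
    }
    throw last != null ? last : new IOException("no namenodes configured");
  }
}
{code}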






[jira] [Commented] (HADOOP-15430) hadoop fs -mkdir -p path-ending-with-slash/ fails with s3guard

2018-08-29 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596532#comment-16596532
 ] 

Sunil Govindan commented on HADOOP-15430:
-

Thanks [~ste...@apache.org].

> hadoop fs -mkdir -p path-ending-with-slash/ fails with s3guard
> --
>
> Key: HADOOP-15430
> URL: https://issues.apache.org/jira/browse/HADOOP-15430
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.1.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Attachments: HADOOP-15430-001.patch, HADOOP-15430-002.patch, 
> HADOOP-15430-003.patch
>
>
> If you call {{hadoop fs -mkdir -p path/}} on the command line with a path 
> ending in "/", you get a DDB error "An AttributeValue may not contain an 
> empty string".






[jira] [Commented] (HADOOP-15703) ABFS - Implement client-side throttling

2018-08-29 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596527#comment-16596527
 ] 

Steve Loughran commented on HADOOP-15703:
-

+ it'd be good to get these metrics into the ABFS overall metrics and 
StorageStatistics, so we can then aggregate them and return them with job stats 
"your query took 300s but it was throttled  ... "

> ABFS - Implement client-side throttling 
> 
>
> Key: HADOOP-15703
> URL: https://issues.apache.org/jira/browse/HADOOP-15703
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Sneha Varma
>Assignee: Sneha Varma
>Priority: Major
> Attachments: HADOOP-15703-HADOOP-15407-001.patch
>
>
> Big data workloads frequently exceed the AzureBlobFS max ingress and egress 
> limits 
> (https://docs.microsoft.com/en-us/azure/storage/common/storage-scalability-targets).
>  For example, the max ingress limit for a GRS account in the United States is 
> currently 10 Gbps. When the limit is exceeded, the AzureBlobFS service fails 
> a percentage of incoming requests, and this causes the client to initiate the 
> retry policy. The retry policy delays requests by sleeping, but the sleep 
> duration is independent of the client throughput and account limit. This 
> results in low throughput, due to the high number of failed requests and the 
> thrashing caused by the retry policy.
> To fix this, we introduce a client-side throttle which minimizes failed 
> requests and maximizes throughput. 






[jira] [Assigned] (HADOOP-15703) ABFS - Implement client-side throttling

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reassigned HADOOP-15703:
---

Assignee: Sneha Varma

> ABFS - Implement client-side throttling 
> 
>
> Key: HADOOP-15703
> URL: https://issues.apache.org/jira/browse/HADOOP-15703
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Sneha Varma
>Assignee: Sneha Varma
>Priority: Major
> Attachments: HADOOP-15703-HADOOP-15407-001.patch
>
>
> Big data workloads frequently exceed the AzureBlobFS max ingress and egress 
> limits 
> (https://docs.microsoft.com/en-us/azure/storage/common/storage-scalability-targets).
>  For example, the max ingress limit for a GRS account in the United States is 
> currently 10 Gbps. When the limit is exceeded, the AzureBlobFS service fails 
> a percentage of incoming requests, and this causes the client to initiate the 
> retry policy. The retry policy delays requests by sleeping, but the sleep 
> duration is independent of the client throughput and account limit. This 
> results in low throughput, due to the high number of failed requests and the 
> thrashing caused by the retry policy.
> To fix this, we introduce a client-side throttle which minimizes failed 
> requests and maximizes throughput. 






[jira] [Updated] (HADOOP-15430) hadoop fs -mkdir -p path-ending-with-slash/ fails with s3guard

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15430:

Priority: Minor  (was: Blocker)

> hadoop fs -mkdir -p path-ending-with-slash/ fails with s3guard
> --
>
> Key: HADOOP-15430
> URL: https://issues.apache.org/jira/browse/HADOOP-15430
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.1.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Attachments: HADOOP-15430-001.patch, HADOOP-15430-002.patch, 
> HADOOP-15430-003.patch
>
>
> If you call {{hadoop fs -mkdir -p path/}} on the command line with a path 
> ending in "/", you get a DDB error "An AttributeValue may not contain an 
> empty string".






[jira] [Updated] (HADOOP-15430) hadoop fs -mkdir -p path-ending-with-slash/ fails with s3guard

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15430:

Status: Open  (was: Patch Available)

> hadoop fs -mkdir -p path-ending-with-slash/ fails with s3guard
> --
>
> Key: HADOOP-15430
> URL: https://issues.apache.org/jira/browse/HADOOP-15430
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.1.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Attachments: HADOOP-15430-001.patch, HADOOP-15430-002.patch, 
> HADOOP-15430-003.patch
>
>
> If you call {{hadoop fs -mkdir -p path/}} on the command line with a path 
> ending in "/", you get a DDB error "An AttributeValue may not contain an 
> empty string".






[jira] [Updated] (HADOOP-15430) hadoop fs -mkdir -p path-ending-with-slash/ fails with s3guard

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15430:

Target Version/s:   (was: 3.2.0)

> hadoop fs -mkdir -p path-ending-with-slash/ fails with s3guard
> --
>
> Key: HADOOP-15430
> URL: https://issues.apache.org/jira/browse/HADOOP-15430
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.1.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Attachments: HADOOP-15430-001.patch, HADOOP-15430-002.patch, 
> HADOOP-15430-003.patch
>
>
> If you call {{hadoop fs -mkdir -p path/}} on the command line with a path 
> ending in "/", you get a DDB error "An AttributeValue may not contain an 
> empty string".






[jira] [Commented] (HADOOP-15430) hadoop fs -mkdir -p path-ending-with-slash/ fails with s3guard

2018-08-29 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596520#comment-16596520
 ] 

Steve Loughran commented on HADOOP-15430:
-

Let's downgrade this one; I can live without it. I do have some other blockers 
with patches which I do need to get in.

> hadoop fs -mkdir -p path-ending-with-slash/ fails with s3guard
> --
>
> Key: HADOOP-15430
> URL: https://issues.apache.org/jira/browse/HADOOP-15430
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.1.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
> Attachments: HADOOP-15430-001.patch, HADOOP-15430-002.patch, 
> HADOOP-15430-003.patch
>
>
> If you call {{hadoop fs -mkdir -p path/}} on the command line with a path 
> ending in "/", you get a DDB error "An AttributeValue may not contain an 
> empty string".






[jira] [Commented] (HADOOP-15602) Support SASL Rpc request handling in separate Handlers

2018-08-29 Thread Vinayakumar B (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596518#comment-16596518
 ] 

Vinayakumar B commented on HADOOP-15602:


Hi [~daryn],

Wondering whether you got a chance to look at the changes?

> Support SASL Rpc request handling in separate Handlers 
> ---
>
> Key: HADOOP-15602
> URL: https://issues.apache.org/jira/browse/HADOOP-15602
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>Priority: Major
> Attachments: HADOOP-15602.01.patch
>
>
> Right now, during RPC connection establishment, all SASL requests are 
> treated as out-of-band requests and handled within the same Reader thread.
> SASL handling involves authentication with Kerberos and SecretManagers (for 
> token validation). During this time the Reader thread is blocked, which 
> blocks all incoming RPC requests on other established connections. Some 
> SecretManager implementations need to communicate with external systems 
> (e.g. ZK) for verification.
> Handling SASL RPC in separate dedicated handlers would enable Reader threads 
> to read RPC requests from established connections without blocking.
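As a generic illustration of the idea (plain java.util.concurrent, not Hadoop's actual ipc.Server internals):

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Sketch: the Reader thread hands SASL negotiation to dedicated handlers. */
class SaslHandlerPoolSketch {
  private final ExecutorService saslHandlers =
      Executors.newFixedThreadPool(4);   // dedicated SASL handler threads

  /** Called from the Reader thread; returns immediately so reads continue. */
  void handleSaslRequest(Runnable negotiation) {
    // Kerberos and SecretManager/token checks run off the Reader thread.
    saslHandlers.execute(negotiation);
  }
}
{code}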






[jira] [Commented] (HADOOP-15703) ABFS - Implement client-side throttling

2018-08-29 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596519#comment-16596519
 ] 

Steve Loughran commented on HADOOP-15703:
-

Looks good, excluding what I'm about to say about the test.

* needs docs
* not needed for this patch, but note for the future that you can use {@code 
ClientThrottlingAnalyzer} for easier insertion of code references into the 
javadocs.

AbfsClientThrottlingAnalyzer 

* import ordering should be java.*, other, org.apache.*, import statics
* L141: can you put braces around the innermost bit of the equation? I'm not 
sure of its ordering. I think it's 
{{0 : (percentageConversionFactor * bytesFailed) / (bytesFailed + bytesSuccessful)}}, 
but would like the braces for all to see.

AbfsClientThrottlingOperationDescriptor: add a newline at EOF

TestAbfsClientThrottlingAnalyzer

* again, import ordering.

-1 to the test as is; it's going to be way too brittle, especially in parallel 
test runs, because of all its expectations that sleep durations fall within an 
expected range. This is especially the case on overloaded systems, like the 
ASF Jenkins build VMs.


The throttling is testable without going into sleep() calls or relying on 
elapsed times.


# Use {{org.apache.hadoop.util.Timer}} for time, creating it in some protected 
method {{createTimer()}}

# have a subclass of AbfsClientThrottlingAnalyzer in the test suite which 
returns {{new FakeTimer()}} from its {{createTimer()}}, and whose (subclassed) 
{{sleep()}} method simply advances that timer rather than sleeping (sketched 
below).
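Roughly, that pattern (the class and method names here are placeholders, not the real AbfsClientThrottlingAnalyzer API):

{code:java}
import org.apache.hadoop.util.FakeTimer;
import org.apache.hadoop.util.Timer;

/** Sketch of the injectable-timer pattern. */
class ThrottlingTimeSketch {
  private final Timer timer;

  ThrottlingTimeSketch() {
    this.timer = createTimer();
  }

  /** Production uses the wall clock; tests override to return a FakeTimer. */
  protected Timer createTimer() {
    return new Timer();
  }

  protected Timer getTimer() {
    return timer;
  }

  /** Tests override this to advance their FakeTimer instead of sleeping. */
  protected void sleep(long millis) throws InterruptedException {
    Thread.sleep(millis);
  }

  /** Elapsed time measured against the injected timer: deterministic in tests. */
  long elapsed(long startMonotonicMillis) {
    return getTimer().monotonicNow() - startMonotonicMillis;
  }
}

/** Test-suite subclass: no real sleeping, no real clock. */
class FakeTimedSketch extends ThrottlingTimeSketch {
  @Override
  protected Timer createTimer() {
    return new FakeTimer();
  }

  @Override
  protected void sleep(long millis) {
    ((FakeTimer) getTimer()).advance(millis);
  }
}
{code}

(FakeTimer lives in hadoop-common's test sources, so the subclass belongs in the test suite.)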



> ABFS - Implement client-side throttling 
> 
>
> Key: HADOOP-15703
> URL: https://issues.apache.org/jira/browse/HADOOP-15703
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Sneha Varma
>Priority: Major
> Attachments: HADOOP-15703-HADOOP-15407-001.patch
>
>
> Big data workloads frequently exceed the AzureBlobFS max ingress and egress 
> limits 
> (https://docs.microsoft.com/en-us/azure/storage/common/storage-scalability-targets).
>  For example, the max ingress limit for a GRS account in the United States is 
> currently 10 Gbps. When the limit is exceeded, the AzureBlobFS service fails 
> a percentage of incoming requests, and this causes the client to initiate the 
> retry policy. The retry policy delays requests by sleeping, but the sleep 
> duration is independent of the client throughput and account limit. This 
> results in low throughput, due to the high number of failed requests and 
> thrashing causes by the retry policy.
> To fix this, we introduce a client-side throttle which minimizes failed 
> requests and maximizes throughput. 






[jira] [Commented] (HADOOP-15700) ABFS: Failure in OpenSSLProvider should fall back to JSSE

2018-08-29 Thread Vishwajeet Dusane (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596501#comment-16596501
 ] 

Vishwajeet Dusane commented on HADOOP-15700:


The scenario I am trying to address is the ABFS client being built with JDK 1.7 
and run on Java 7. Wildfly OpenSSL, however, is targeted at Java 8+.

At runtime, ABFS client initialization would then fail with an 
{{UnsupportedClassVersionError}}. This patch would allow the ABFS client to 
fall back to the JSSE SSL provider instead of blocking ABFS client usage.

Please suggest if HADOOP-15669 and HADOOP-15407 are targeted strictly at Java 
8+ only and backporting to a 2.x version is not supported. In that case, this 
patch would not be required.
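For illustration, a hedged sketch of the fallback (the logging and exact structure are assumptions, not the patch contents; only {{OpenSSLProvider.register()}} is taken from the issue):

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class SslProviderInitSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(SslProviderInitSketch.class);

  /** Try to register wildfly OpenSSL; on any failure keep the default JSSE. */
  static void initSslProvider() {
    try {
      org.wildfly.openssl.OpenSSLProvider.register();
    } catch (Throwable t) {
      // Catch Throwable: on Java 7, loading the Java 8 class raises
      // UnsupportedClassVersionError, which is an Error, not an Exception.
      LOG.warn("Unable to register OpenSSL provider; falling back to JSSE", t);
    }
  }
}
{code}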

> ABFS: Failure in OpenSSLProvider should fall back to JSSE
> -
>
> Key: HADOOP-15700
> URL: https://issues.apache.org/jira/browse/HADOOP-15700
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Reporter: Vishwajeet Dusane
>Assignee: Vishwajeet Dusane
>Priority: Major
> Attachments: HADOOP-15700-HADOOP-15407-01.patch
>
>
> Failure of {{OpenSSLProvider.register()}} should fall back to default JSSE 
> initialization. This is needed to support Java 7 in case HADOOP-15669 is 
> back-ported to support Java 7.






[jira] [Commented] (HADOOP-15676) Cleanup TestSSLHttpServer

2018-08-29 Thread Szilard Nemeth (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596489#comment-16596489
 ] 

Szilard Nemeth commented on HADOOP-15676:
-

Hey [~zsiegl]!
Thanks for your comments.
Fixed the strings' visibility and removed the additional newline as well.
Please check the updated patch!

Thanks!

> Cleanup TestSSLHttpServer
> -
>
> Key: HADOOP-15676
> URL: https://issues.apache.org/jira/browse/HADOOP-15676
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Affects Versions: 2.6.0
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Minor
> Attachments: HADOOP-15676.001.patch, HADOOP-15676.002.patch
>
>
> This issue will fix: 
> * Several typos in this class
> * Code that is not very readable in some places.






[jira] [Commented] (HADOOP-15676) Cleanup TestSSLHttpServer

2018-08-29 Thread Szilard Nemeth (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596488#comment-16596488
 ] 

Szilard Nemeth commented on HADOOP-15676:
-

Hey [~zsiegl]!
Thanks for your comments.
Fixed the strings' visibility and removed the additional newline as well.
Please check the updated patch!

Thanks!

> Cleanup TestSSLHttpServer
> -
>
> Key: HADOOP-15676
> URL: https://issues.apache.org/jira/browse/HADOOP-15676
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Affects Versions: 2.6.0
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Minor
> Attachments: HADOOP-15676.001.patch, HADOOP-15676.002.patch
>
>
> This issue will fix: 
> * Several typos in this class
> * Code that is not very readable in some places.






[jira] [Updated] (HADOOP-15676) Cleanup TestSSLHttpServer

2018-08-29 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated HADOOP-15676:

Attachment: HADOOP-15676.002.patch

> Cleanup TestSSLHttpServer
> -
>
> Key: HADOOP-15676
> URL: https://issues.apache.org/jira/browse/HADOOP-15676
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Affects Versions: 2.6.0
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Minor
> Attachments: HADOOP-15676.001.patch, HADOOP-15676.002.patch
>
>
> This issue will fix: 
> * Several typos in this class
> * Code that is not very readable in some places.






[jira] [Commented] (HADOOP-15700) ABFS: Failure in OpenSSLProvider should fall back to JSSE

2018-08-29 Thread Thomas Marquardt (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596468#comment-16596468
 ] 

Thomas Marquardt commented on HADOOP-15700:
---

[~vishwajeet.dusane], who is going to build this with JDK 1.8 and run it on 
Java 7? 

> ABFS: Failure in OpenSSLProvider should fall back to JSSE
> -
>
> Key: HADOOP-15700
> URL: https://issues.apache.org/jira/browse/HADOOP-15700
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Reporter: Vishwajeet Dusane
>Assignee: Vishwajeet Dusane
>Priority: Major
> Attachments: HADOOP-15700-HADOOP-15407-01.patch
>
>
> Failure of {{OpenSSLProvider.register()}} should fall back to default JSSE 
> initialization. This is needed to support Java 7 in case HADOOP-15669 is 
> back-ported to support Java 7.






[jira] [Commented] (HADOOP-15664) ABFS: Reduce test run time via parallelization and grouping

2018-08-29 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596445#comment-16596445
 ] 

Steve Loughran commented on HADOOP-15664:
-

This is a really great speedup! Everyone will cherish this; even when I'm not 
working on this code I like to do a full test run of things before a commit, so 
this is good for my life.

ITestNativeFileSystemStatistics: can you check the ordering of the imports?

other than that, +1 from me

> ABFS: Reduce test run time via parallelization and grouping
> ---
>
> Key: HADOOP-15664
> URL: https://issues.apache.org/jira/browse/HADOOP-15664
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Reporter: Thomas Marquardt
>Assignee: Da Zhou
>Priority: Major
> Attachments: HADOOP-15664-HADOOP-15407-001.patch
>
>
> 1) Let's reduce the total test runtime by improving parallelization of the 
> tests.
> 2) Let's make it possible to select WASB tests, ABFS tests, or both so 
> developers can run only the tests appropriate for the change they've made.
> 3) Update the testing-azure.md accordingly






[jira] [Commented] (HADOOP-15669) ABFS: Improve HTTPS Performance

2018-08-29 Thread Vishwajeet Dusane (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596443#comment-16596443
 ] 

Vishwajeet Dusane commented on HADOOP-15669:


Thank you for the clarification [~ste...@apache.org]. In Azure Data Lake Gen 1, 
we had the same strategy to collect cluster details. However, the config is a 
custom one, set by us during cluster creation. Please correct me: is there a 
standard config available which stores the app details?

> ABFS: Improve HTTPS Performance
> ---
>
> Key: HADOOP-15669
> URL: https://issues.apache.org/jira/browse/HADOOP-15669
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Reporter: Thomas Marquardt
>Assignee: Vishwajeet Dusane
>Priority: Major
> Attachments: ABFS - Improve HTTPS Performance Over Java Based 
> Client.pdf, HADOOP-15669-HADOOP-15407-01.patch, 
> HADOOP-15669-HADOOP-15407-02.patch, HADOOP-15669-HADOOP-15407-03.patch, 
> HADOOP-15669-HADOOP-15407-04.patch
>
>
> We see approximately 50% worse throughput for ABFS over HTTPs vs HTTP.  Lets 
> perform a detailed measurement and see what can be done to improve throughput.






[jira] [Commented] (HADOOP-15663) ABFS: Simplify configuration

2018-08-29 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596442#comment-16596442
 ] 

Steve Loughran commented on HADOOP-15663:
-

Looks good

# the testing doc is going to need updating to cover the new options
# and the main abfs doc on the endpoint and any other new options.

AbstractAbfsIntegrationTest
L210: typo in comment

AbfsTestUtils
* use SLF4J for logging

azure-test.xml
* remove the fs.AbstractFileSystem.wasb.impl option: that should be in 
core-default.xml already, where it's needed for production.
* also remove fs.azure.scale.test.enabled, as -Dscale in test runs should turn 
that on.

> ABFS: Simplify configuration
> 
>
> Key: HADOOP-15663
> URL: https://issues.apache.org/jira/browse/HADOOP-15663
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Reporter: Thomas Marquardt
>Assignee: Da Zhou
>Priority: Major
> Attachments: HADOOP-15663-HADOOP-15407-001.patch, 
> HADOOP-15663-HADOOP-15407-002.patch, HADOOP-15663-HADOOP-15407-003.patch
>
>
> Configuration for WASB and ABFS is too complex.  The current approach is to 
> use four files for test configuration. 
> Both WASB and ABFS have basic test configuration which is committed to the 
> repo (azure-test.xml and azure-bfs-test.xml).  Currently these contain the 
> fs.AbstractFileSystem.[scheme].impl configuration, but otherwise are empty 
> except for an include reference to a file containing the endpoint 
> credentials. 
> Both WASB and ABFS have endpoint credential configuration files 
> (azure-auth-keys.xml and azure-bfs-auth-keys.xml).  These have been added to 
> .gitignore to prevent them from accidentally being submitted in a patch, 
> which would leak the developer's storage account credentials.  These files 
> contain account names, storage account keys, and service endpoints.
> There is some overlap of the configuration for WASB and ABFS, where they use 
> the same property name but use different values.  
> 1) Let's reduce the number of test configuration files to one, if possible.
> 2) Let's simplify the account name, key, and endpoint configuration for WASB 
> and ABFS if possible, but still support the legacy way of doing it, which is 
> very error prone.
> 3) Let's improve error handling, so that typos or misconfiguration are not so 
> difficult to troubleshoot.






[jira] [Commented] (HADOOP-15700) ABFS: Failure in OpenSSLProvider should fall back to JSSE

2018-08-29 Thread Vishwajeet Dusane (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596439#comment-16596439
 ] 

Vishwajeet Dusane commented on HADOOP-15700:


Precisely the scenario I was trying to address with this patch, introduced by 
HADOOP-15669: the ABFS client builds and runs on Java 7, but the SSL lib used 
is Java 8. This patch would address that issue.

> ABFS: Failure in OpenSSLProvider should fall back to JSSE
> -
>
> Key: HADOOP-15700
> URL: https://issues.apache.org/jira/browse/HADOOP-15700
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Reporter: Vishwajeet Dusane
>Assignee: Vishwajeet Dusane
>Priority: Major
> Attachments: HADOOP-15700-HADOOP-15407-01.patch
>
>
> Failure of {{OpenSSLProvider.register()}} should fall back to default JSSE 
> initialization. This is needed to support Java 7 in case HADOOP-15669 is 
> back-ported to support Java 7.






[jira] [Commented] (HADOOP-15702) ABFS: Increase timeout of ITestAbfsReadWriteAndSeek

2018-08-29 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596437#comment-16596437
 ] 

Steve Loughran commented on HADOOP-15702:
-

Sounds good; it's all too easy to write tests which work well in cloud infra 
but time out on remote runs. Though HADOOP-15426 shows it's possible to do the 
opposite.

> ABFS: Increase timeout of ITestAbfsReadWriteAndSeek
> ---
>
> Key: HADOOP-15702
> URL: https://issues.apache.org/jira/browse/HADOOP-15702
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure, test
>Affects Versions: HADOOP-15407
>Reporter: Sean Mackrory
>Priority: Major
>
> ITestAbfsReadWriteAndSeek.testReadAndWriteWithDifferentBufferSizesAndSeek 
> fails for me all the time. Let's increase the timeout limit.
> It also seems to get executed twice...






[jira] [Updated] (HADOOP-15702) ABFS: Increase timeout of ITestAbfsReadWriteAndSeek

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15702:

Component/s: fs/azure

> ABFS: Increase timeout of ITestAbfsReadWriteAndSeek
> ---
>
> Key: HADOOP-15702
> URL: https://issues.apache.org/jira/browse/HADOOP-15702
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure, test
>Affects Versions: HADOOP-15407
>Reporter: Sean Mackrory
>Priority: Major
>
> ITestAbfsReadWriteAndSeek.testReadAndWriteWithDifferentBufferSizesAndSeek 
> fails for me all the time. Let's increase the timeout limit.
> It also seems to get executed twice...






[jira] [Updated] (HADOOP-15702) ABFS: Increase timeout of ITestAbfsReadWriteAndSeek

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15702:

Component/s: test

> ABFS: Increase timeout of ITestAbfsReadWriteAndSeek
> ---
>
> Key: HADOOP-15702
> URL: https://issues.apache.org/jira/browse/HADOOP-15702
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure, test
>Affects Versions: HADOOP-15407
>Reporter: Sean Mackrory
>Priority: Major
>
> ITestAbfsReadWriteAndSeek.testReadAndWriteWithDifferentBufferSizesAndSeek 
> fails for me all the time. Let's increase the timeout limit.
> It also seems to get executed twice...






[jira] [Updated] (HADOOP-15702) ABFS: Increase timeout of ITestAbfsReadWriteAndSeek

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15702:

Affects Version/s: HADOOP-15407

> ABFS: Increase timeout of ITestAbfsReadWriteAndSeek
> ---
>
> Key: HADOOP-15702
> URL: https://issues.apache.org/jira/browse/HADOOP-15702
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure, test
>Affects Versions: HADOOP-15407
>Reporter: Sean Mackrory
>Priority: Major
>
> ITestAbfsReadWriteAndSeek.testReadAndWriteWithDifferentBufferSizesAndSeek 
> fails for me all the time. Let's increase the timeout limit.
> It also seems to get executed twice...






[jira] [Commented] (HADOOP-15669) ABFS: Improve HTTPS Performance

2018-08-29 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596434#comment-16596434
 ] 

Steve Loughran commented on HADOOP-15669:
-

(BTW: the other stores generally offer a config option to add a custom string 
after the hadoop version info. Why? It lets you actually register a specific 
app, e.g. "hbase-production-cluster", and make even more sense of the logs.)

> ABFS: Improve HTTPS Performance
> ---
>
> Key: HADOOP-15669
> URL: https://issues.apache.org/jira/browse/HADOOP-15669
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Reporter: Thomas Marquardt
>Assignee: Vishwajeet Dusane
>Priority: Major
> Attachments: ABFS - Improve HTTPS Performance Over Java Based 
> Client.pdf, HADOOP-15669-HADOOP-15407-01.patch, 
> HADOOP-15669-HADOOP-15407-02.patch, HADOOP-15669-HADOOP-15407-03.patch, 
> HADOOP-15669-HADOOP-15407-04.patch
>
>
> We see approximately 50% worse throughput for ABFS over HTTPs vs HTTP.  Lets 
> perform a detailed measurement and see what can be done to improve throughput.






[jira] [Commented] (HADOOP-15669) ABFS: Improve HTTPS Performance

2018-08-29 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596433#comment-16596433
 ] 

Steve Loughran commented on HADOOP-15669:
-

bq. org.apache.hadoop.util.VersionInfo.getBuildVersion() return user as well 
which may lead to privacy issue?

not really, as the ASF releases are now generally done via docker containers 
with a user like "hadoop" and the private/commercial derivatives get to choose 
their own release user too.

> ABFS: Improve HTTPS Performance
> ---
>
> Key: HADOOP-15669
> URL: https://issues.apache.org/jira/browse/HADOOP-15669
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Reporter: Thomas Marquardt
>Assignee: Vishwajeet Dusane
>Priority: Major
> Attachments: ABFS - Improve HTTPS Performance Over Java Based 
> Client.pdf, HADOOP-15669-HADOOP-15407-01.patch, 
> HADOOP-15669-HADOOP-15407-02.patch, HADOOP-15669-HADOOP-15407-03.patch, 
> HADOOP-15669-HADOOP-15407-04.patch
>
>
> We see approximately 50% worse throughput for ABFS over HTTPs vs HTTP.  Lets 
> perform a detailed measurement and see what can be done to improve throughput.






[jira] [Updated] (HADOOP-15700) ABFS: Failure in OpenSSLProvider should fall back to JSSE

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15700:

Issue Type: Sub-task  (was: Bug)
Parent: HADOOP-15407

> ABFS: Failure in OpenSSLProvider should fall back to JSSE
> -
>
> Key: HADOOP-15700
> URL: https://issues.apache.org/jira/browse/HADOOP-15700
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Reporter: Vishwajeet Dusane
>Assignee: Vishwajeet Dusane
>Priority: Major
> Attachments: HADOOP-15700-HADOOP-15407-01.patch
>
>
> Failure of {{OpenSSLProvider.register()}} should fall back to default JSSE 
> initialization. This is needed to support Java 7 in case HADOOP-15669 is 
> back-ported to support Java 7.






[jira] [Commented] (HADOOP-15700) ABFS: Failure in OpenSSLProvider should fall back to JSSE

2018-08-29 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596430#comment-16596430
 ] 

Steve Loughran commented on HADOOP-15700:
-

There's never any guarantee that a Java 8-compiled artifact will work on Java 7; 
even if language-level = 8, signature changes in the Java APIs can break 
linking.

Is the problem here that you want to run the ABFS client on Java 7, but the SSL 
lib used is Java 8+ only? If so, yes, this makes sense.

> ABFS: Failure in OpenSSLProvider should fall back to JSSE
> -
>
> Key: HADOOP-15700
> URL: https://issues.apache.org/jira/browse/HADOOP-15700
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure
>Reporter: Vishwajeet Dusane
>Assignee: Vishwajeet Dusane
>Priority: Major
> Attachments: HADOOP-15700-HADOOP-15407-01.patch
>
>
> Failure of {{OpenSSLProvider.register()}} should fall back to default JSSE 
> initialization. This is needed to support Java 7 in case HADOOP-15669 is 
> back-ported to support Java 7.






[jira] [Commented] (HADOOP-15697) memory leak in distcp run method

2018-08-29 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596428#comment-16596428
 ] 

Steve Loughran commented on HADOOP-15697:
-

Can you tag this JIRA with the Hadoop version you are seeing this on? If it's a 
commercial product, just say the name in a comment and we'll guess the closest 
ASF release to it. Thanks.

> memory leak in distcp run method
> 
>
> Key: HADOOP-15697
> URL: https://issues.apache.org/jira/browse/HADOOP-15697
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Reporter: mahesh kumar behera
>Priority: Major
> Attachments: gc-root.png
>
>
> In the distcp run() method, execute() is called but the job is not closed. This is 
> causing an OOM error in Hive replication. I think that in run(), shouldBlock 
> should be set to true and execute() should be called within try-with-resources.
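A sketch of the suggested shape (assuming {{Job}} implements {{AutoCloseable}} on trunk, so {{close()}} releases the underlying cluster resources; this is not the committed patch):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.tools.DistCp;
import org.apache.hadoop.tools.DistCpOptions;

public class DistCpRunSketch {
  /** Run distcp blocking, then close the Job so its resources are released. */
  public static void runOnce(Configuration conf, DistCpOptions options)
      throws Exception {
    DistCp distCp = new DistCp(conf, options);
    try (Job job = distCp.execute()) {   // close() frees the Cluster handle
      if (!job.isSuccessful()) {
        throw new RuntimeException("DistCp job failed: " + job.getJobID());
      }
    }
  }
}
{code}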






[jira] [Updated] (HADOOP-15697) memory leak in distcp run method

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15697:

Component/s: tools/distcp

> memory leak in distcp run method
> 
>
> Key: HADOOP-15697
> URL: https://issues.apache.org/jira/browse/HADOOP-15697
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Reporter: mahesh kumar behera
>Priority: Major
> Attachments: gc-root.png
>
>
> In the distcp run() method, execute() is called but the job is not closed. This is 
> causing an OOM error in Hive replication. I think that in run(), shouldBlock 
> should be set to true and execute() should be called within try-with-resources.






[jira] [Commented] (HADOOP-15695) In wasbFS user group always set via core-site

2018-08-29 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596424#comment-16596424
 ] 

Steve Loughran commented on HADOOP-15695:
-

+[~tmarquardt]

> In wasbFS user group always set via core-site
> -
>
> Key: HADOOP-15695
> URL: https://issues.apache.org/jira/browse/HADOOP-15695
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 2.8.4, 3.0.3
>Reporter: Denis Zhigula
>Priority: Major
>
>  NativeAzureFileSystem:2183
> {code:java}
> PermissionStatus createPermissionStatus(FsPermission permission)
>     throws IOException {
>   // Create the permission status for this file based on current user
>   return new PermissionStatus(
>       UserGroupInformation.getCurrentUser().getShortUserName(),
>       getConf().get(AZURE_DEFAULT_GROUP_PROPERTY_NAME,
>           AZURE_DEFAULT_GROUP_DEFAULT),
>       permission);
> }
> {code}
> Suggested fix: set the user group via core-site.xml only if that key is 
> present; otherwise fall back to the current user's primary group:
> {code:java}
> PermissionStatus createPermissionStatus(FsPermission permission)
>     throws IOException {
>   // Create the permission status for this file based on current user
>   return new PermissionStatus(
>       UserGroupInformation.getCurrentUser().getShortUserName(),
>       getConf().get(AZURE_DEFAULT_GROUP_PROPERTY_NAME) == null
>           ? UserGroupInformation.getCurrentUser().getPrimaryGroupName()
>           : getConf().get(AZURE_DEFAULT_GROUP_PROPERTY_NAME,
>               AZURE_DEFAULT_GROUP_DEFAULT),
>       permission);
> }
> {code}
>  






[jira] [Commented] (HADOOP-15680) ITestNativeAzureFileSystemConcurrencyLive times out

2018-08-29 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596417#comment-16596417
 ] 

Steve Loughran commented on HADOOP-15680:
-

looks good. 
+1

> ITestNativeAzureFileSystemConcurrencyLive times out
> ---
>
> Key: HADOOP-15680
> URL: https://issues.apache.org/jira/browse/HADOOP-15680
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Andras Bokor
>Assignee: Andras Bokor
>Priority: Major
> Attachments: HADOOP-15680.001.patch, HADOOP-15680.002.patch
>
>
> When I am running tests locally ITestNativeAzureFileSystemConcurrencyLive 
> sometimes times out.
> I would like to increase the timeout to avoid unnecessary noise.






[jira] [Commented] (HADOOP-15635) s3guard set-capacity command to fail fast if bucket is unguarded

2018-08-29 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596397#comment-16596397
 ] 

Steve Loughran commented on HADOOP-15635:
-

well, {{S3AFileSystem.hasMetadataStore()}} returns true for a null store, so 
for now we can create the FS and run with it. 

bq. but we also need to make sure that the user didn't actually provide a 
specific implementation and table name in the form of the -m URL. Does that 
make sense?

yes

> s3guard set-capacity command to fail fast if bucket is unguarded
> 
>
> Key: HADOOP-15635
> URL: https://issues.apache.org/jira/browse/HADOOP-15635
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Minor
> Attachments: HADOOP-15635.001.patch
>
>
> If you try to do {{hadoop s3guard set-capacity s3a://landsat-pds}}, or any 
> other bucket which exists but doesn't have s3guard enabled, you get a stack 
> trace reporting that the ddb table doesn't exist.
> The command should check for the bucket being guarded and fail on that.
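One possible shape of that check (a sketch only; it assumes the tool can instantiate the filesystem and that {{getMetadataStore()}} and {{NullMetadataStore}} are visible to it):

{code:java}
import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.fs.s3a.S3AFileSystem;
import org.apache.hadoop.fs.s3a.s3guard.MetadataStore;
import org.apache.hadoop.fs.s3a.s3guard.NullMetadataStore;

public class GuardedCheckSketch {
  /** Fail fast if the bucket has no real metadata store behind it. */
  static void requireGuarded(S3AFileSystem fs, URI bucketUri)
      throws IOException {
    MetadataStore store = fs.getMetadataStore();
    if (store == null || store instanceof NullMetadataStore) {
      throw new IOException("S3Guard is not enabled for " + bucketUri);
    }
  }
}
{code}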






[jira] [Assigned] (HADOOP-14734) add option to tag DDB table(s) created

2018-08-29 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-14734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reassigned HADOOP-14734:
---

Assignee: Abraham Fine  (was: Gabor Bota)

> add option to tag DDB table(s) created
> --
>
> Key: HADOOP-14734
> URL: https://issues.apache.org/jira/browse/HADOOP-14734
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0-beta1
>Reporter: Steve Loughran
>Assignee: Abraham Fine
>Priority: Minor
> Attachments: HADOOP-14734-001.patch, HADOOP-14734-002.patch, 
> HADOOP-14734-003.patch
>
>
> Many organisations have a "no untagged" resource policy; s3guard runs into 
> this when a table is created untagged. If there's a strict "delete untagged 
> resources" policy, the tables will go without warning.
> Proposed: we add an option which can be used to declare the tags for a table 
> when created, use it in creation. No need to worry about updating/viewing 
> tags, as the AWS console can do that






[jira] [Commented] (HADOOP-15688) ABFS: InputStream wrapped in FSDataInputStream twice

2018-08-29 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596378#comment-16596378
 ] 

Steve Loughran commented on HADOOP-15688:
-

We should make supporting ByteBuffer reads another hasCapabilities() feature, 
as again, we can't use the presence/absence of interfaces as evidence that a 
feature is actually available.
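A sketch of such a probe (the capability string is hypothetical; {{StreamCapabilities}} is the existing probe interface used for {{hflush}}/{{hsync}}):

{code:java}
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.StreamCapabilities;

public class ByteBufferProbeSketch {
  /** Probe a declared capability instead of sniffing wrapper interfaces. */
  static boolean supportsByteBufferReads(FSDataInputStream in) {
    return in instanceof StreamCapabilities
        && ((StreamCapabilities) in).hasCapability("in:readbytebuffer");
  }
}
{code}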

> ABFS: InputStream wrapped in FSDataInputStream twice
> 
>
> Key: HADOOP-15688
> URL: https://issues.apache.org/jira/browse/HADOOP-15688
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
>Priority: Major
> Attachments: HADOOP-15688-HADOOP-15407-002.patch, 
> HADOOP-15688.001.patch
>
>
> I can't read Parquet files from ABFS. Parquet has 2 different implementations 
> for reading seekable streams, and it'll use the one based on ByteBuffer reads 
> if it can. It currently decides to use the ByteBuffer read implementation 
> because the FSDataInputStream it gets back wraps another FSDataInputStream, 
> which implements ByteBufferReadable.
> That's not the most robust way to check that ByteBuffer reads are supported by 
> the ultimate underlying InputStream; in any case it's unnecessary and probably 
> a mistake to double-wrap the InputStream, so let's not.





