[jira] [Commented] (HADOOP-16104) Wasb tests to downgrade to skip when test a/c is namespace enabled

2019-02-13 Thread Masatake Iwasaki (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767908#comment-16767908
 ] 

Masatake Iwasaki commented on HADOOP-16104:
---

{quote}I think a straightforward workaround is disabling the wasb tests by removing 
fs.azure.wasb.account.name and fs.contract.test.fs.wasb from 
azure-auth-keys.xml, since NativeAzureFileSystem does not work with the storage 
account.
{quote}
or using an independent test account for wasb.

> Wasb tests to downgrade to skip when test a/c is namespace enabled
> --
>
> Key: HADOOP-16104
> URL: https://issues.apache.org/jira/browse/HADOOP-16104
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure, test
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Assignee: Masatake Iwasaki
>Priority: Major
>
> When you run the abfs tests with a namespace-enabled account, all the wasb 
> tests fail with "don't yet work with namespace-enabled accounts". This should be 
> downgraded to a test skip, somehow.






[jira] [Commented] (HADOOP-16104) Wasb tests to downgrade to skip when test a/c is namespace enabled

2019-02-13 Thread Masatake Iwasaki (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767897#comment-16767897
 ] 

Masatake Iwasaki commented on HADOOP-16104:
---

I played around with the storage account in Japan East.

This issue arises when I run the integration/contract tests of both wasb and abfs 
using a StorageV2 account with hierarchical namespace enabled. I think a 
straightforward workaround is disabling the wasb tests by removing 
fs.azure.wasb.account.name and fs.contract.test.fs.wasb from 
azure-auth-keys.xml, since NativeAzureFileSystem does not work with the storage 
account.

For abfs, there is already ITestGetNameSpaceEnabled, which checks for a mismatch 
between the account property and the test configuration 
(fs.azure.test.namespace.enabled). I think we should just skip tests requiring 
XNS based on the value of {{AzureBlobFileSystem#getIsNamespaceEnabled}}. We 
don't need a tuning knob here.

Since we cannot get the hierarchical namespace setting on the wasb side, due to 
the limitations of the azure-storage library, adding a tuning knob like 
fs.azure.test.namespace.enabled could be an option as [~ste...@apache.org] 
suggested, though I feel that fixing the wasb doc and testing_azure.md is enough 
here.
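
To illustrate the abfs-side skip mentioned above, here is a minimal sketch, 
assuming {{AzureBlobFileSystem#getIsNamespaceEnabled}} is reachable from the 
test base class (the helper name and placement are illustrative only):

{code:java}
// Minimal sketch only: skip XNS-only tests when the account does not match.
// The helper name and its place in the test hierarchy are illustrative.
protected void assumeNamespaceEnabled(AzureBlobFileSystem fs) throws IOException {
  org.junit.Assume.assumeTrue(
      "Skipping: this test requires a namespace (XNS) enabled account",
      fs.getIsNamespaceEnabled());
}
{code}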

> Wasb tests to downgrade to skip when test a/c is namespace enabled
> --
>
> Key: HADOOP-16104
> URL: https://issues.apache.org/jira/browse/HADOOP-16104
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure, test
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Assignee: Masatake Iwasaki
>Priority: Major
>
> When you run the abfs tests with a namespace-enabled account, all the wasb 
> tests fail with "don't yet work with namespace-enabled accounts". This should be 
> downgraded to a test skip, somehow.






[jira] [Comment Edited] (HADOOP-16110) ~/hadoop-env doesn't support HADOOP_OPTIONAL_TOOLS

2019-02-13 Thread Allen Wittenauer (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767756#comment-16767756
 ] 

Allen Wittenauer edited comment on HADOOP-16110 at 2/14/19 1:03 AM:


HADOOP_OPTIONAL_TOOLS is one of the big exceptions.  It was designed to be a 
cluster-wide setting.

The problem is user expectation: a local HADOOP_OPTIONAL_TOOLS setting does not 
translate (AFAIK) into a remote resource manager being able to renew tokens for 
a given optional service or those jars being in the classpath on the node 
manager.  There should probably be an error message for settings that don't work, 
but that's fairly hard to implement correctly.


was (Author: aw):
HADOOP_OPTIONAL_TOOLS is one of the big exceptions.  It was designed to be a 
cluster-wide setting.

The problem is user expectation: a local HADOOP_OPTIONAL_TOOLS setting does not 
translate (AFAIK) into the resource manager being able to renew tokens for a 
given optional service or those jars being in the classpath on the node 
manager.  There should probably an error message for settings that don't work, 
but that's fairly hard to implement correctly.

> ~/hadoop-env doesn't support HADOOP_OPTIONAL_TOOLS
> --
>
> Key: HADOOP-16110
> URL: https://issues.apache.org/jira/browse/HADOOP-16110
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: bin
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Priority: Minor
>
> if you set {{HADOOP_OPTIONAL_TOOLS}} in ~/.hadoop-env, it doesn't get picked 
> up, because the HADOOP_OPTIONAL_TOOLS expansion takes place in the parse 
> process well before {{hadoop_exec_user_hadoopenv}} is invoked.
> Unless I've really misunderstood what ~/.hadoop-env is meant to do ("let me 
> set hadoop env vars"), I'd have expected the tools env var examination (and so 
> the loading of optional tools) to take place afterwards.






[jira] [Commented] (HADOOP-16110) ~/hadoop-env doesn't support HADOOP_OPTIONAL_TOOLS

2019-02-13 Thread Allen Wittenauer (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767756#comment-16767756
 ] 

Allen Wittenauer commented on HADOOP-16110:
---

HADOOP_OPTIONAL_TOOLS is one of the big exceptions.  It was designed to be a 
cluster-wide setting.

The problem is user expectation: a local HADOOP_OPTIONAL_TOOLS setting does not 
translate (AFAIK) into the resource manager being able to renew tokens for a 
given optional service or those jars being in the classpath on the node 
manager.  There should probably be an error message for settings that don't work, 
but that's fairly hard to implement correctly.

> ~/hadoop-env doesn't support HADOOP_OPTIONAL_TOOLS
> --
>
> Key: HADOOP-16110
> URL: https://issues.apache.org/jira/browse/HADOOP-16110
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: bin
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Priority: Minor
>
> if you set {{HADOOP_OPTIONAL_TOOLS}} in ~/.hadoop-env, it doesn't get picked 
> up, because the HADOOP_OPTIONAL_TOOLS expansion takes place in the parse 
> process well before {{hadoop_exec_user_hadoopenv}} is invoked.
> Unless I've really misunderstood what ~/.hadoop-env is meant to do ("let me 
> set hadoop env vars"), I'd have expected the tools env var examination (and so 
> the loading of optional tools) to take place afterwards.






[jira] [Commented] (HADOOP-16110) ~/hadoop-env doesn't support HADOOP_OPTIONAL_TOOLS

2019-02-13 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767699#comment-16767699
 ] 

Steve Loughran commented on HADOOP-16110:
-

thoughts [~aw]?

Spent a while trying to track this little quirk down

> ~/hadoop-env doesn't support HADOOP_OPTIONAL_TOOLS
> --
>
> Key: HADOOP-16110
> URL: https://issues.apache.org/jira/browse/HADOOP-16110
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: bin
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Priority: Minor
>
> if you set {{HADOOP_OPTIONAL_TOOLS}} in ~/.hadoop-env, it doesn't get picked 
> up, because the HADOOP_OPTIONAL_TOOLS expansion takes place in the parse 
> process well before {{hadoop_exec_user_hadoopenv}} is invoked.
> Unless I've really misunderstood what ~/.hadoop-env is meant to do ("let me 
> set hadoop env vars"), I'd have expected the tools env var examination (and so 
> the loading of optional tools) to take place afterwards.






[jira] [Created] (HADOOP-16110) ~/hadoop-env doesn't support HADOOP_OPTIONAL_TOOLS

2019-02-13 Thread Steve Loughran (JIRA)
Steve Loughran created HADOOP-16110:
---

 Summary: ~/hadoop-env doesn't support HADOOP_OPTIONAL_TOOLS
 Key: HADOOP-16110
 URL: https://issues.apache.org/jira/browse/HADOOP-16110
 Project: Hadoop Common
  Issue Type: Bug
  Components: bin
Affects Versions: 3.3.0
Reporter: Steve Loughran


if you set {{HADOOP_OPTIONAL_TOOLS}} in ~/.hadoop-env, it doesn't get picked up, 
because the HADOOP_OPTIONAL_TOOLS expansion takes place in the parse process 
well before {{hadoop_exec_user_hadoopenv}} is invoked.

Unless I've really misunderstood what ~/.hadoop-env is meant to do ("let me set 
hadoop env vars"), I'd have expected the tools env var examination (and so the 
loading of optional tools) to take place afterwards.






[jira] [Commented] (HADOOP-15625) S3A input stream to use etags to detect changed source files

2019-02-13 Thread Ben Roling (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767530#comment-16767530
 ] 

Ben Roling commented on HADOOP-15625:
-

{quote}I'd like to have this fail with some special subclass of EOFException, 
i.e. RemoteFileChangedException or similar
{quote}
I'm having difficulty with this strategy, and to be honest it doesn't quite feel 
like the right approach.  It is hard to ensure that an EOFException subclass 
isn't treated as "normal" and ignored.

I tried a strategy of updating the various places where EOFException is caught 
and turned into -1 in S3AInputStream, checking instanceof 
RemoteFileChangedException and rethrowing instead of returning -1, but that 
wasn't good enough, since 
[FSInputStream itself does 
this|https://github.com/apache/hadoop/blob/release-3.2.0-RC1/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSInputStream.java#L76]
and that's where S3AInputStream.read(long position, byte[] buffer, int offset, 
int length) is currently routed.  I'm hesitant to override or change that 
method.

Are you sure you want RemoteFileChangedException to be a subclass of 
EOFException rather than a direct subclass of IOException or some other 
IOException type?
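
To illustrate why, the pattern in question is roughly the following (a 
simplified sketch of a positioned read, not the actual FSInputStream code):

{code:java}
// Simplified sketch only: any catch of the parent EOFException type converts
// the exception into a plain "end of stream" result, so a
// RemoteFileChangedException extending EOFException would be swallowed here.
public int read(long position, byte[] buffer, int offset, int length)
    throws IOException {
  try {
    seek(position);
    return read(buffer, offset, length);
  } catch (EOFException e) {
    return -1;  // the subclass disappears and is reported as end-of-stream
  }
}
{code}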

> S3A input stream to use etags to detect changed source files
> 
>
> Key: HADOOP-15625
> URL: https://issues.apache.org/jira/browse/HADOOP-15625
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>Priority: Major
> Attachments: HADOOP-15625-001.patch, HADOOP-15625-002.patch, 
> HADOOP-15625-003.patch
>
>
> S3A input stream doesn't handle changing source files any better than the 
> other cloud store connectors. Specifically: it doesn't notice it has 
> changed, caches the length from startup, and whenever a seek triggers a new 
> GET, you may get one of: old data, new data, and even perhaps go from new 
> data to old data due to eventual consistency.
> We can't do anything to stop this, but we could detect changes by
> # caching the etag of the first HEAD/GET (we don't get that HEAD on open with 
> S3Guard, BTW)
> # on future GET requests, verify the etag of the response
> # raise an IOE if the remote file changed during the read.
> It's a more dramatic failure, but it stops changes silently corrupting things.






[jira] [Commented] (HADOOP-15686) Supress bogus AbstractWadlGeneratorGrammarGenerator in KMS stderr

2019-02-13 Thread Xiaoyu Yao (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767455#comment-16767455
 ] 

Xiaoyu Yao commented on HADOOP-15686:
-

[~jojochuang], thanks for the pointer on the performance issue with 
jul_to_slf4j.

However, in patch v2 we only disable JUL for the 
com.sun.jersey.server.wadl.generators classes. This differs from the previous 
patch, where all JUL output was redirected. Might we still get JUL output from 
other Jersey classes?

Have you considered installing LevelChangePropagator along with the 
jul_to_slf4j approach (before HADOOP-13597) to eliminate the 60x overhead, as 
mentioned in the same slf4j doc?
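
For reference, a minimal sketch of the setup that slf4j doc describes, assuming 
a Logback backend (LevelChangePropagator is Logback-specific, so this only 
illustrates the technique being referred to, not something KMS can drop in 
unchanged):

{code:java}
import org.slf4j.LoggerFactory;
import org.slf4j.bridge.SLF4JBridgeHandler;
import ch.qos.logback.classic.LoggerContext;
import ch.qos.logback.classic.jul.LevelChangePropagator;

public class JulBridgeSetup {
  public static void install() {
    // Route java.util.logging through SLF4J.
    SLF4JBridgeHandler.removeHandlersForRootLogger();
    SLF4JBridgeHandler.install();

    // Propagate level changes down to JUL so disabled JUL statements are
    // filtered at the JUL level, avoiding the 60x translation overhead.
    LoggerContext context = (LoggerContext) LoggerFactory.getILoggerFactory();
    LevelChangePropagator propagator = new LevelChangePropagator();
    propagator.setContext(context);
    propagator.setResetJUL(true);
    propagator.start();
    context.addListener(propagator);
  }
}
{code}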

> Supress bogus AbstractWadlGeneratorGrammarGenerator in KMS stderr
> -
>
> Key: HADOOP-15686
> URL: https://issues.apache.org/jira/browse/HADOOP-15686
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: kms
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Attachments: HADOOP-15686.001.patch, HADOOP-15686.002.patch
>
>
> After we switched the underlying system of KMS from Tomcat to Jetty, we started 
> to observe a lot of bogus messages like the following [1]. They are harmless but 
> very annoying. Let's suppress them in the log4j configuration.
> [1]
> {quote}
> Aug 20, 2018 11:26:17 AM 
> com.sun.jersey.server.wadl.generators.WadlGeneratorJAXBGrammarGenerator 
> buildModelAndSchemas
> SEVERE: Failed to generate the schema for the JAX-B elements
> com.sun.xml.bind.v2.runtime.IllegalAnnotationsException: 2 counts of 
> IllegalAnnotationExceptions
> java.util.Map is an interface, and JAXB can't handle interfaces.
>   this problem is related to the following location:
>   at java.util.Map
> java.util.Map does not have a no-arg default constructor.
>   this problem is related to the following location:
>   at java.util.Map
>   at 
> com.sun.xml.bind.v2.runtime.IllegalAnnotationsException$Builder.check(IllegalAnnotationsException.java:106)
>   at 
> com.sun.xml.bind.v2.runtime.JAXBContextImpl.getTypeInfoSet(JAXBContextImpl.java:489)
>   at 
> com.sun.xml.bind.v2.runtime.JAXBContextImpl.(JAXBContextImpl.java:319)
>   at 
> com.sun.xml.bind.v2.runtime.JAXBContextImpl$JAXBContextBuilder.build(JAXBContextImpl.java:1170)
>   at 
> com.sun.xml.bind.v2.ContextFactory.createContext(ContextFactory.java:145)
>   at sun.reflect.GeneratedMethodAccessor32.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at javax.xml.bind.ContextFinder.newInstance(ContextFinder.java:247)
>   at javax.xml.bind.ContextFinder.newInstance(ContextFinder.java:234)
>   at javax.xml.bind.ContextFinder.find(ContextFinder.java:441)
>   at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:641)
>   at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:584)
>   at 
> com.sun.jersey.server.wadl.generators.WadlGeneratorJAXBGrammarGenerator.buildModelAndSchemas(WadlGeneratorJAXBGrammarGenerator.java:169)
>   at 
> com.sun.jersey.server.wadl.generators.AbstractWadlGeneratorGrammarGenerator.createExternalGrammar(AbstractWadlGeneratorGrammarGenerator.java:405)
>   at com.sun.jersey.server.wadl.WadlBuilder.generate(WadlBuilder.java:149)
>   at 
> com.sun.jersey.server.impl.wadl.WadlApplicationContextImpl.getApplication(WadlApplicationContextImpl.java:119)
>   at 
> com.sun.jersey.server.impl.wadl.WadlApplicationContextImpl.getApplication(WadlApplicationContextImpl.java:138)
>   at 
> com.sun.jersey.server.impl.wadl.WadlMethodFactory$WadlOptionsMethodDispatcher.dispatch(WadlMethodFactory.java:110)
>   at 
> com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302)
>   at 
> com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
>   at 
> com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
>   at 
> com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
>   at 
> com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
>   at 
> com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1542)
>   at 
> com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1473)
>   at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419)
>   at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409)
>   at 
> 

[jira] [Comment Edited] (HADOOP-15686) Supress bogus AbstractWadlGeneratorGrammarGenerator in KMS stderr

2019-02-13 Thread Xiaoyu Yao (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767455#comment-16767455
 ] 

Xiaoyu Yao edited comment on HADOOP-15686 at 2/13/19 6:45 PM:
--

[~jojochuang], thanks for the pointer on the performance issue with 
jul_to_slf4j.

However, in patch v2 we only disable JUL for the 
com.sun.jersey.server.wadl.generators classes. This differs from the previous 
patch, where all JUL output was redirected. Might we still get JUL output from 
other Jersey classes?

Have you considered installing LevelChangePropagator along with the 
jul_to_slf4j approach (before HADOOP-13597) to eliminate the 60x overhead, as 
mentioned in the same slf4j doc?


was (Author: xyao):
[~jojochuang], thanks for the pointer on the performance issue with 
jul_to_slf4j.

However, in patch v2, we only disable jul for 
com.sun.jersey.server.wadl.generators class. This will be different from 
previous patch where all jul is redirected. We may still get JUL from other 
jersey class?

Have you consider installing LevelChangePropagator along with jul_to_slf4j 
approach (before HADOO-13597) to eliminate the 60x overhead as mentioned in the 
same slf4j doc?

> Supress bogus AbstractWadlGeneratorGrammarGenerator in KMS stderr
> -
>
> Key: HADOOP-15686
> URL: https://issues.apache.org/jira/browse/HADOOP-15686
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: kms
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Attachments: HADOOP-15686.001.patch, HADOOP-15686.002.patch
>
>
> After we switched the underlying system of KMS from Tomcat to Jetty, we started 
> to observe a lot of bogus messages like the following [1]. They are harmless but 
> very annoying. Let's suppress them in the log4j configuration.
> [1]
> {quote}
> Aug 20, 2018 11:26:17 AM 
> com.sun.jersey.server.wadl.generators.WadlGeneratorJAXBGrammarGenerator 
> buildModelAndSchemas
> SEVERE: Failed to generate the schema for the JAX-B elements
> com.sun.xml.bind.v2.runtime.IllegalAnnotationsException: 2 counts of 
> IllegalAnnotationExceptions
> java.util.Map is an interface, and JAXB can't handle interfaces.
>   this problem is related to the following location:
>   at java.util.Map
> java.util.Map does not have a no-arg default constructor.
>   this problem is related to the following location:
>   at java.util.Map
>   at 
> com.sun.xml.bind.v2.runtime.IllegalAnnotationsException$Builder.check(IllegalAnnotationsException.java:106)
>   at 
> com.sun.xml.bind.v2.runtime.JAXBContextImpl.getTypeInfoSet(JAXBContextImpl.java:489)
>   at 
> com.sun.xml.bind.v2.runtime.JAXBContextImpl.(JAXBContextImpl.java:319)
>   at 
> com.sun.xml.bind.v2.runtime.JAXBContextImpl$JAXBContextBuilder.build(JAXBContextImpl.java:1170)
>   at 
> com.sun.xml.bind.v2.ContextFactory.createContext(ContextFactory.java:145)
>   at sun.reflect.GeneratedMethodAccessor32.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at javax.xml.bind.ContextFinder.newInstance(ContextFinder.java:247)
>   at javax.xml.bind.ContextFinder.newInstance(ContextFinder.java:234)
>   at javax.xml.bind.ContextFinder.find(ContextFinder.java:441)
>   at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:641)
>   at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:584)
>   at 
> com.sun.jersey.server.wadl.generators.WadlGeneratorJAXBGrammarGenerator.buildModelAndSchemas(WadlGeneratorJAXBGrammarGenerator.java:169)
>   at 
> com.sun.jersey.server.wadl.generators.AbstractWadlGeneratorGrammarGenerator.createExternalGrammar(AbstractWadlGeneratorGrammarGenerator.java:405)
>   at com.sun.jersey.server.wadl.WadlBuilder.generate(WadlBuilder.java:149)
>   at 
> com.sun.jersey.server.impl.wadl.WadlApplicationContextImpl.getApplication(WadlApplicationContextImpl.java:119)
>   at 
> com.sun.jersey.server.impl.wadl.WadlApplicationContextImpl.getApplication(WadlApplicationContextImpl.java:138)
>   at 
> com.sun.jersey.server.impl.wadl.WadlMethodFactory$WadlOptionsMethodDispatcher.dispatch(WadlMethodFactory.java:110)
>   at 
> com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302)
>   at 
> com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
>   at 
> com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
>   at 
> com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
>   at 
> 

[jira] [Commented] (HADOOP-16093) Move DurationInfo from hadoop-aws to hadoop-common org.apache.fs.impl

2019-02-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767435#comment-16767435
 ] 

Hadoop QA commented on HADOOP-16093:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 9 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
10s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m 34s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
29s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
19s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 14m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
21s{color} | {color:green} root: The patch generated 0 new + 15 unchanged - 1 
fixed = 15 total (was 16) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 24s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
31s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
34s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 
50s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
38s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}111m 30s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | HADOOP-16093 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12958600/HADOOP-16093.006.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 9e0a5d5fe23a 4.4.0-139-generic #165~14.04.1-Ubuntu SMP Wed Oct 
31 10:55:11 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 00c5ffa |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/15921/testReport/ |
| 

[jira] [Commented] (HADOOP-15229) Add FileSystem builder-based openFile() API to match createFile(); S3A to implement S3 Select through this API.

2019-02-13 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767395#comment-16767395
 ] 

Eric Yang commented on HADOOP-15229:


[~ste...@apache.org] the previous comment was filed as HADOOP-16106.  It has 
been fixed.  Thanks for the follow-up.

> Add FileSystem builder-based openFile() API to match createFile(); S3A to 
> implement S3 Select through this API.
> ---
>
> Key: HADOOP-15229
> URL: https://issues.apache.org/jira/browse/HADOOP-15229
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HADOOP-15229-001.patch, HADOOP-15229-002.patch, 
> HADOOP-15229-003.patch, HADOOP-15229-004.patch, HADOOP-15229-004.patch, 
> HADOOP-15229-005.patch, HADOOP-15229-006.patch, HADOOP-15229-007.patch, 
> HADOOP-15229-009.patch, HADOOP-15229-010.patch, HADOOP-15229-011.patch, 
> HADOOP-15229-012.patch, HADOOP-15229-013.patch, HADOOP-15229-014.patch, 
> HADOOP-15229-015.patch, HADOOP-15229-016.patch, HADOOP-15229-017.patch, 
> HADOOP-15229-018.patch, HADOOP-15229-019.patch, HADOOP-15229-020.patch
>
>
> Replicate HDFS-1170 and HADOOP-14365 with an API to open files.
> A key requirement of this is not HDFS, it's to put in the fadvise policy for 
> working with object stores, where getting the decision to do a full GET and 
> TCP abort on seek vs smaller GETs is fundamentally different: the wrong 
> option can cost you minutes. S3A and Azure both have adaptive policies now 
> (first backward seek), but they still don't do it that well.
> Columnar formats (ORC, Parquet) should be able to say "fs.input.fadvise" 
> "random" as an option when they open files; I can imagine other options too.
> The Builder model of [~eddyxu] is the one to mimic, method for method. 
> Ideally with as much code reuse as possible
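
For illustration, the builder-based call sites this enables look roughly like 
the sketch below (the option key is the one named in the description above; 
treat names and values as indicative rather than normative):

{code:java}
// Sketch of an openFile() call declaring a random-IO read pattern up front.
FutureDataInputStreamBuilder builder = fs.openFile(path)
    .opt("fs.input.fadvise", "random");
try (FSDataInputStream in = builder.build().get()) {
  in.seek(rowGroupOffset);   // e.g. a columnar reader jumping to a row group
  // ... read column chunks ...
}
{code}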






[jira] [Commented] (HADOOP-16102) FilterFileSystem does not implement getScheme

2019-02-13 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767390#comment-16767390
 ] 

Steve Loughran commented on HADOOP-16102:
-

I'm actually reluctant to make changes here.

The problem is that FileSystem.getScheme() is invoked as part of the FS 
service-load mechanism, without initializing the filesystem. At that point, 
there's no {{fs}} to pass this call on to.

We could be clever and say uninited => "filter" but switch to the inner scheme 
once inited; however, there's a risk of that causing a different confusion, as 
the scheme of an FS would change during its life. I don't know what the 
consequences will be there: if anyone is expecting it to be a constant for use 
in a map of some form, they'll be surprised.

Now, FilterFS isn't registered for discovery in 
hadoop-common-project/hadoop-common/src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem,
so FileSystem isn't mapping filter -> FilterFileSystem.class, but I don't know 
what is happening elsewhere.
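
Spelling out the option under discussion, purely as a sketch rather than a 
proposal, it would amount to something like this inside FilterFileSystem:

{code:java}
// Sketch only: delegate once the wrapped fs exists, fall back to a placeholder
// during service-loader discovery. The caveat above applies: the scheme of the
// instance would change after initialize().
@Override
public String getScheme() {
  return fs != null ? fs.getScheme() : "filter";
}
{code}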

> FilterFileSystem does not implement getScheme
> -
>
> Key: HADOOP-16102
> URL: https://issues.apache.org/jira/browse/HADOOP-16102
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Reporter: Todd Owen
>Priority: Minor
>
> Calling {{getScheme}} on a {{FilterFileSystem}} throws 
> {{UnsupportedOperationException}}, which is the default provided by the base 
> class. Instead, it should return the scheme of the underlying ("filtered") 
> filesystem.






[GitHub] noslowerdna edited a comment on issue #469: HADOOP-15281: Add -direct DistCp option

2019-02-13 Thread GitBox
noslowerdna edited a comment on issue #469: HADOOP-15281: Add -direct DistCp 
option
URL: https://github.com/apache/hadoop/pull/469#issuecomment-461435632
 
 
   This was committed for: 
   
   - 3.3.0 
https://github.com/apache/hadoop/commit/de804e53b9d20a2df75a4c7252bf83ed52011488
   - 3.2.1 
https://github.com/apache/hadoop/commit/36f3e775d476eb848f33569bfbbab4872b11d9df
   - 3.1.3 
https://github.com/apache/hadoop/commit/49d54633e0f4bd388c00d591e90666dbb7633c9f
   
   Closing the PR.





[GitHub] noslowerdna edited a comment on issue #469: HADOOP-15281: Add -direct DistCp option

2019-02-13 Thread GitBox
noslowerdna edited a comment on issue #469: HADOOP-15281: Add -direct DistCp 
option
URL: https://github.com/apache/hadoop/pull/469#issuecomment-461435632
 
 
   This was merged for: 
   
   - 3.3.0 
https://github.com/apache/hadoop/commit/de804e53b9d20a2df75a4c7252bf83ed52011488
   - 3.2.1 
https://github.com/apache/hadoop/commit/36f3e775d476eb848f33569bfbbab4872b11d9df
   - 3.1.3 
https://github.com/apache/hadoop/commit/49d54633e0f4bd388c00d591e90666dbb7633c9f
   
   Closing the PR.





[jira] [Commented] (HADOOP-16018) DistCp won't reassemble chunks when blocks per chunk > 0

2019-02-13 Thread Kai Xie (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767388#comment-16767388
 ] 

Kai Xie commented on HADOOP-16018:
--

Hi [~ste...@apache.org], the patch branch-2-006 is ready for review.

> DistCp won't reassemble chunks when blocks per chunk > 0
> 
>
> Key: HADOOP-16018
> URL: https://issues.apache.org/jira/browse/HADOOP-16018
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Affects Versions: 3.2.0, 2.9.2
>Reporter: Kai Xie
>Assignee: Kai Xie
>Priority: Major
> Fix For: 3.0.4, 3.2.1, 3.1.3
>
> Attachments: HADOOP-16018-002.patch, HADOOP-16018-branch-2-002.patch, 
> HADOOP-16018-branch-2-002.patch, HADOOP-16018-branch-2-003.patch, 
> HADOOP-16018-branch-2-004.patch, HADOOP-16018-branch-2-004.patch, 
> HADOOP-16018-branch-2-005.patch, HADOOP-16018-branch-2-005.patch, 
> HADOOP-16018-branch-2-006.patch, HADOOP-16018.01.patch
>
>
> I was investigating why hadoop-distcp-2.9.2 won't reassemble chunks of the 
> same file when blocks per chunk has been set > 0.
> In the CopyCommitter::commitJob, this logic can prevent chunks from 
> reassembling if blocks per chunk is equal to 0:
> {code:java}
> if (blocksPerChunk > 0) {
>   concatFileChunks(conf);
> }
> {code}
> Then in CopyCommitter's ctor, blocksPerChunk is initialised from the config:
> {code:java}
> blocksPerChunk = context.getConfiguration().getInt(
> DistCpOptionSwitch.BLOCKS_PER_CHUNK.getConfigLabel(), 0);
> {code}
>  
> But here the config key DistCpOptionSwitch.BLOCKS_PER_CHUNK.getConfigLabel() 
> will always return an empty string because it is constructed without a config 
> label:
> {code:java}
> BLOCKS_PER_CHUNK("",
> new Option("blocksperchunk", true, "If set to a positive value, files"
> + "with more blocks than this value will be split into chunks of "
> + " blocks to be transferred in parallel, and "
> + "reassembled on the destination. By default,  is "
> + "0 and the files will be transmitted in their entirety without "
> + "splitting. This switch is only applicable when the source file "
> + "system implements getBlockLocations method and the target file "
> + "system implements concat method"))
> {code}
> As a result it will fall back to the default value 0 for blocksPerChunk, and 
> prevent the chunks from reassembling.
>  
>  






[jira] [Updated] (HADOOP-16093) Move DurationInfo from hadoop-aws to hadoop-common org.apache.fs.impl

2019-02-13 Thread Abhishek Modi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Modi updated HADOOP-16093:
---
Attachment: HADOOP-16093.006.patch

> Move DurationInfo from hadoop-aws to hadoop-common org.apache.fs.impl
> -
>
> Key: HADOOP-16093
> URL: https://issues.apache.org/jira/browse/HADOOP-16093
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, util
>Reporter: Steve Loughran
>Assignee: Abhishek Modi
>Priority: Minor
> Attachments: HADOOP-16093.001.patch, HADOOP-16093.002.patch, 
> HADOOP-16093.003.patch, HADOOP-16093.004.patch, HADOOP-16093.005.patch, 
> HADOOP-16093.006.patch
>
>
> It'd be useful to have DurationInfo usable in other places (e.g. distcp, 
> abfs, ...). But as it is in hadoop-aws under 
> {{org.apache.hadoop.fs.s3a.commit.DurationInfo}}, we can't do that.
> Move it.
> We'll have to rename the Duration class in the process, as Java 8 time has a 
> class of that name too. Maybe "OperationDuration", with DurationInfo a 
> subclass of that.
> Probably need a test too, won't we?
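
For context, a sketch of the try-with-resources usage the class supports today, 
assuming the current constructor taking a logger and a format string (the 
logger and message here are illustrative):

{code:java}
// Sketch: log how long the enclosed operation took.
try (DurationInfo d = new DurationInfo(LOG, "listing %s", path)) {
  fs.listStatus(path);
}
{code}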






[jira] [Commented] (HADOOP-16018) DistCp won't reassemble chunks when blocks per chunk > 0

2019-02-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767353#comment-16767353
 ] 

Hadoop QA commented on HADOOP-16018:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 13m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 11m 
30s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
26s{color} | {color:green} branch-2 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} branch-2 passed with JDK v1.8.0_191 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
30s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
41s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} branch-2 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} branch-2 passed with JDK v1.8.0_191 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed with JDK v1.8.0_191 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed with JDK v1.8.0_191 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 
14s{color} | {color:green} hadoop-distcp in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 46m  6s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:da67579 |
| JIRA Issue | HADOOP-16018 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12958592/HADOOP-16018-branch-2-006.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux cc81779902c5 4.4.0-139-generic #165~14.04.1-Ubuntu SMP Wed Oct 
31 10:55:11 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-2 / da67579 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| Multi-JDK versions |  /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95 
/usr/lib/jvm/java-8-openjdk-amd64:1.8.0_191 |
| findbugs | v3.0.0 |
|  Test 

[jira] [Commented] (HADOOP-16093) Move DurationInfo from hadoop-aws to hadoop-common org.apache.fs.impl

2019-02-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767347#comment-16767347
 ] 

Hadoop QA commented on HADOOP-16093:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 9 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m  
9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 45s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
41s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
22s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 14m 
10s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 55s{color} | {color:orange} root: The patch generated 2 new + 15 unchanged - 
1 fixed = 17 total (was 16) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m  0s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
34s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  8m 23s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 
44s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
44s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}106m  3s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.util.TestReadWriteDiskValidator |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | HADOOP-16093 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12958573/HADOOP-16093.005.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 91b867649b74 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 00c5ffa |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 

[jira] [Created] (HADOOP-16109) Parquet reading S3AFileSystem causes EOF

2019-02-13 Thread Dave Christianson (JIRA)
Dave Christianson created HADOOP-16109:
--

 Summary: Parquet reading S3AFileSystem causes EOF
 Key: HADOOP-16109
 URL: https://issues.apache.org/jira/browse/HADOOP-16109
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs/s3
Affects Versions: 3.1.0
Reporter: Dave Christianson


When using S3AFileSystem to read Parquet files, a specific set of circumstances 
causes an EOFException that is not thrown when reading the same file from 
local disk.

Note this has only been observed under specific circumstances:
 - when the reader is doing a projection (which causes it to seek backwards 
and puts the filesystem into random mode)
 - when the file is larger than the readahead buffer size
 - when the seek behavior of the Parquet reader causes it to seek towards the 
end of the current input stream without reopening, such that the next read on 
the currently open stream reads past its end.

Exception from Parquet reader is as follows:

Caused by: java.io.EOFException: Reached the end of stream with 51 bytes left 
to read
 at 
org.apache.parquet.io.DelegatingSeekableInputStream.readFully(DelegatingSeekableInputStream.java:104)
 at 
org.apache.parquet.io.DelegatingSeekableInputStream.readFullyHeapBuffer(DelegatingSeekableInputStream.java:127)
 at 
org.apache.parquet.io.DelegatingSeekableInputStream.readFully(DelegatingSeekableInputStream.java:91)
 at 
org.apache.parquet.hadoop.ParquetFileReader$ConsecutiveChunkList.readAll(ParquetFileReader.java:1174)
 at 
org.apache.parquet.hadoop.ParquetFileReader.readNextRowGroup(ParquetFileReader.java:805)
 at 
org.apache.parquet.hadoop.InternalParquetRecordReader.checkRead(InternalParquetRecordReader.java:127)
 at 
org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:222)
 at 
org.apache.parquet.hadoop.ParquetRecordReader.nextKeyValue(ParquetRecordReader.java:207)
 at 
org.apache.flink.api.java.hadoop.mapreduce.HadoopInputFormatBase.fetchNext(HadoopInputFormatBase.java:206)
 at 
org.apache.flink.api.java.hadoop.mapreduce.HadoopInputFormatBase.reachedEnd(HadoopInputFormatBase.java:199)
 at 
org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:190)
 at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
 at java.lang.Thread.run(Thread.java:748)

The following example program generates the same root behavior (sans finding a 
Parquet file that happens to trigger this condition) by purposely reading past 
the already active readahead range on any file >= 1029 bytes in size.


{code:java}
final Configuration conf = new Configuration();
conf.set("fs.s3a.readahead.range", "1K");
conf.set("fs.s3a.experimental.input.fadvise", "random");

final FileSystem fs = FileSystem.get(path.toUri(), conf);
// forward seek reading across readahead boundary
try (FSDataInputStream in = fs.open(path)) {
final byte[] temp = new byte[5];
in.readByte();
in.readFully(1023, temp); // <-- works
}
// forward seek reading from end of readahead boundary
try (FSDataInputStream in = fs.open(path)) {
 final byte[] temp = new byte[5];
 in.readByte();
 in.readFully(1024, temp); // <-- throws EOFException
}
{code}
 






[jira] [Commented] (HADOOP-15625) S3A input stream to use etags to detect changed source files

2019-02-13 Thread Ben Roling (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767309#comment-16767309
 ] 

Ben Roling commented on HADOOP-15625:
-

{quote}With the way the code is structured now, it would take another remote 
call to even know whether the new file is longer or shorter, as a non-matching 
eTag results in a null return value from the getObject() call.
{quote}
I just realized it really wouldn't be necessary to introduce another remote 
call to detect this.  I don't really need to use 
withMatchingETagConstraint() on GetObjectRequest.  I could omit that and just 
check the eTag on the response to see whether it has changed.  That said, I 
don't think it makes sense to throw a different exception depending on whether 
the new file is shorter or longer, so I'm still not sure why you would want 
tests of both.
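
Roughly, the alternative described above would look like the sketch below 
(variable names are illustrative, and RemoteFileChangedException stands in for 
whatever exception type gets agreed on):

{code:java}
// Sketch only: remember the eTag captured at open() and compare it against the
// eTag of each GET response, instead of using withMatchingETagConstraint().
GetObjectRequest request = new GetObjectRequest(bucket, key)
    .withRange(targetPos, contentRangeFinish);
S3Object object = client.getObject(request);
String currentETag = object.getObjectMetadata().getETag();
if (revisionETag != null && !revisionETag.equals(currentETag)) {
  throw new RemoteFileChangedException(uri + ": eTag changed from "
      + revisionETag + " to " + currentETag + " during read");
}
{code}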

> S3A input stream to use etags to detect changed source files
> 
>
> Key: HADOOP-15625
> URL: https://issues.apache.org/jira/browse/HADOOP-15625
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>Priority: Major
> Attachments: HADOOP-15625-001.patch, HADOOP-15625-002.patch, 
> HADOOP-15625-003.patch
>
>
> S3A input stream doesn't handle changing source files any better than the 
> other cloud store connectors. Specifically: it doesn't notice it has 
> changed, caches the length from startup, and whenever a seek triggers a new 
> GET, you may get one of: old data, new data, and even perhaps go from new 
> data to old data due to eventual consistency.
> We can't do anything to stop this, but we could detect changes by
> # caching the etag of the first HEAD/GET (we don't get that HEAD on open with 
> S3Guard, BTW)
> # on future GET requests, verify the etag of the response
> # raise an IOE if the remote file changed during the read.
> It's a more dramatic failure, but it stops changes silently corrupting things.






[jira] [Comment Edited] (HADOOP-16105) WASB in secure mode does not set connectingUsingSAS

2019-02-13 Thread David McGinnis (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767295#comment-16767295
 ] 

David McGinnis edited comment on HADOOP-16105 at 2/13/19 3:20 PM:
--

[~ste...@apache.org]: The fix itself looks fine. Thanks for taking care of this.

As for your testing, you mention above you aren't sure how best to test it. I 
am fairly certain there is a way to do a positive test on this that shows the 
fix works, but after putting in 3-4 hours a couple of nights ago on it, I 
wasn't able to get it exactly right. I was going to put more time in last night 
on this, but got tied up in other things. I'm including the test below as I 
have it now. I'm fairly certain this code (when tweaked to be correct of 
course) will give an Unauthorized exception without the change, and pass with 
the change. Feel free to use or not.

 

@Test
public void testConnectUsingSecureSASSuccess() throws Exception {
    // Create the test account with SAS credentials.
    Configuration conf = new Configuration();
    conf.setBoolean(AzureNativeFileSystemStore.KEY_USE_SECURE_MODE, true);
    testAccount = AzureBlobStorageTestAccount.create("",
        EnumSet.of(CreateOptions.UseSas, CreateOptions.CreateContainer),
        conf);
    assumeNotNull(testAccount);

    CloudBlobContainer container = testAccount.getRealContainer();
    AzureFileSystemInstrumentation instrumentation =
        new AzureFileSystemInstrumentation(conf);
    AzureNativeFileSystemStore store = new AzureNativeFileSystemStore();
    store.initialize(container.getUri(), conf, instrumentation);
    store.list("/", -1, -1);
}


was (Author: mcginnda):
[~ste...@apache.org]: The fix itself looks fine. Thanks for taking care of this.

As for your testing, you mention above you aren't sure how best to test it. I 
am fairly certain there is a way to do a positive test on this that shows the 
fix works, but after putting in 3-4 hours a couple of nights ago on it, I 
wasn't able to get it exactly right. I was going to put more time in last night 
on this, but got tied up in other things. I'm including the test below as I 
have it now. I'm fairly certain this code (when tweaked to be correct of 
course) will give an Unauthorized exception without the change, and pass with 
the change. Feel free to use or not.

 

  @Test
  public void testConnectUsingSecureSASSuccess() throws Exception {
    // Create the test account with SAS credentials.
    Configuration conf = new Configuration();
    conf.setBoolean(AzureNativeFileSystemStore.KEY_USE_SECURE_MODE, true);
    testAccount = AzureBlobStorageTestAccount.create("",
    EnumSet.of(CreateOptions.UseSas, CreateOptions.CreateContainer),
    conf);
    assumeNotNull(testAccount);

    CloudBlobContainer container = testAccount.getRealContainer();
    AzureFileSystemInstrumentation instrumentation = new 
AzureFileSystemInstrumentation(conf);
    AzureNativeFileSystemStore store = new AzureNativeFileSystemStore();
    store.initialize(container.getUri(), conf, instrumentation);
    store.list("/", -1, -1);
  }

> WASB in secure mode does not set connectingUsingSAS
> ---
>
> Key: HADOOP-16105
> URL: https://issues.apache.org/jira/browse/HADOOP-16105
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure
>Affects Versions: 2.8.5, 3.1.2
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Attachments: HADOOP-16105-001.patch, HADOOP-16105-002.patch
>
>
> If you run WASB in secure mode, it doesn't set {{connectingUsingSAS}} to 
> true, which can break things



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-16105) WASB in secure mode does not set connectingUsingSAS

2019-02-13 Thread David McGinnis (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767295#comment-16767295
 ] 

David McGinnis edited comment on HADOOP-16105 at 2/13/19 3:21 PM:
--

[~ste...@apache.org]: The fix itself looks fine. Thanks for taking care of this.

As for your testing, you mention above you aren't sure how best to test it. I 
am fairly certain there is a way to do a positive test on this that shows the 
fix works, but after putting in 3-4 hours a couple of nights ago on it, I 
wasn't able to get it exactly right. I was going to put more time in last night 
on this, but got tied up in other things. I'm including the test below as I 
have it now. I'm fairly certain this code (when tweaked to be correct of 
course) will give an Unauthorized exception without the change, and pass with 
the change. Feel free to use or not.

 

 
{code:java}
@Test
public void testConnectUsingSecureSASSuccess() throws Exception {
  // Create the test account with SAS credentials.
  Configuration conf = new Configuration();
  conf.setBoolean(AzureNativeFileSystemStore.KEY_USE_SECURE_MODE, true);
  testAccount = AzureBlobStorageTestAccount.create("",
      EnumSet.of(CreateOptions.UseSas, CreateOptions.CreateContainer),
      conf);
  assumeNotNull(testAccount);

  CloudBlobContainer container = testAccount.getRealContainer();
  AzureFileSystemInstrumentation instrumentation =
      new AzureFileSystemInstrumentation(conf);
  AzureNativeFileSystemStore store = new AzureNativeFileSystemStore();
  store.initialize(container.getUri(), conf, instrumentation);
  store.list("/", -1, -1);
}
{code}
 


was (Author: mcginnda):
[~ste...@apache.org]: The fix itself looks fine. Thanks for taking care of this.

As for your testing, you mention above you aren't sure how best to test it. I 
am fairly certain there is a way to do a positive test on this that shows the 
fix works, but after putting in 3-4 hours a couple of nights ago on it, I 
wasn't able to get it exactly right. I was going to put more time in last night 
on this, but got tied up in other things. I'm including the test below as I 
have it now. I'm fairly certain this code (when tweaked to be correct of 
course) will give an Unauthorized exception without the change, and pass with 
the change. Feel free to use or not.

 

@Test
public void testConnectUsingSecureSASSuccess() throws Exception {
  // Create the test account with SAS credentials.
  Configuration conf = new Configuration();
  conf.setBoolean(AzureNativeFileSystemStore.KEY_USE_SECURE_MODE, true);
  testAccount = AzureBlobStorageTestAccount.create("",
      EnumSet.of(CreateOptions.UseSas, CreateOptions.CreateContainer),
      conf);
  assumeNotNull(testAccount);

  CloudBlobContainer container = testAccount.getRealContainer();
  AzureFileSystemInstrumentation instrumentation =
      new AzureFileSystemInstrumentation(conf);
  AzureNativeFileSystemStore store = new AzureNativeFileSystemStore();
  store.initialize(container.getUri(), conf, instrumentation);
  store.list("/", -1, -1);
}

> WASB in secure mode does not set connectingUsingSAS
> ---
>
> Key: HADOOP-16105
> URL: https://issues.apache.org/jira/browse/HADOOP-16105
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure
>Affects Versions: 2.8.5, 3.1.2
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Attachments: HADOOP-16105-001.patch, HADOOP-16105-002.patch
>
>
> If you run WASB in secure mode, it doesn't set {{connectingUsingSAS}} to 
> true, which can break things



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15625) S3A input stream to use etags to detect changed source files

2019-02-13 Thread Ben Roling (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767306#comment-16767306
 ] 

Ben Roling commented on HADOOP-15625:
-

Thanks for the feedback Steve!

It does look like I can upload a patch now.  I'll do that with a version that 
addresses your comments, although I do have some questions.

The reason I got rid of the larger -> shorter specific testing is that with 
the new logic it no longer makes a difference whether the new file is larger 
or shorter. Any change to the file will result in a new eTag, which will 
result in the new exception after reopen (after a seek backwards). I am happy 
to include both tests, longer-to-shorter and shorter-to-longer, if you like, 
but the expected result will be exactly the same. Can you confirm you still 
want that? Also, can you clarify that you aren't expecting special behavior 
for the longer-to-shorter scenario? With the way the code is structured now, 
it would take another remote call even to know whether the new file is longer 
or shorter, as a non-matching eTag results in a null return value from the 
getObject() call.
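
To make that concrete, here is a rough sketch (not the patch itself) of the 
reopen-time check I'm describing. It assumes the AWS SDK's matching-eTag 
constraint, under which getObject() returns null instead of the object when 
the stored eTag no longer matches; the class and variable names are 
illustrative only.

{code:java}
import java.io.IOException;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.GetObjectRequest;
import com.amazonaws.services.s3.model.S3Object;

/** Illustrative only: reopen a byte range, failing if the object's eTag changed. */
final class EtagCheckedReopen {
  static S3Object reopen(AmazonS3 s3, String bucket, String key,
      String etagAtOpen, long start, long end) throws IOException {
    GetObjectRequest request = new GetObjectRequest(bucket, key)
        .withRange(start, end)
        // the SDK hands back null rather than the object when this constraint fails
        .withMatchingETagConstraint(etagAtOpen);
    S3Object object = s3.getObject(request);
    if (object == null) {
      // in the real change this would be the dedicated exception discussed below
      throw new IOException(key + ": eTag changed from " + etagAtOpen);
    }
    return object;
  }
}
{code}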

With regard to the exception type, I'll do as you suggest. I interpreted your 
original comment from HADOOP-16085, which steered away from EOFException, as 
also steering away from a subclass, but I think I see how I can make a 
subclass work, ensuring the subclass doesn't result in the -1 return value 
from read() the way a generic EOFException would.
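
Roughly what I have in mind for the read path, as a sketch only, with 
placeholder names and using the RemoteFileChangedException name you suggested 
rather than anything in the current patch:

{code:java}
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

/** Placeholder subclass for "the remote object changed underneath the reader". */
class RemoteFileChangedException extends EOFException {
  RemoteFileChangedException(String message) {
    super(message);
  }
}

/** Sketch of a read wrapper that keeps the two EOF cases distinct. */
final class ChangeAwareRead {
  static int read(InputStream wrapped) throws IOException {
    try {
      return wrapped.read();
    } catch (RemoteFileChangedException e) {
      // change detection has to reach the caller, not look like end-of-file
      throw e;
    } catch (EOFException e) {
      // a plain EOF keeps the normal end-of-stream contract
      return -1;
    }
  }
}
{code}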

In terms of testing, I've had an initial read through testing.md, configured my 
auth-keys.xml with `test.fs.s3a.name` and `fs.contract.test.fs.s3a`, and run 
the tests.  My test bucket is in the us-west-2 region in AWS.  I also ran with 
-Ds3guard, although that doesn't seem particularly relevant to this specific 
change.  The result summary is:

 
||Tests||Errors||Failures||Skipped||Success Rate||Time||
|775|1|0|179|76.774%|4,274.058|

 

The one error was the following, which also occurs without my changes:
{quote}[ERROR] 
testDestroyNoArgs(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardToolLocal) Time 
elapsed: 1.877 s <<< ERROR!
java.lang.IndexOutOfBoundsException: toIndex = 1
 at java.util.ArrayList.subListRangeCheck(ArrayList.java:1012)
 at java.util.ArrayList.subList(ArrayList.java:1004)
 at org.apache.hadoop.fs.shell.CommandFormat.parse(CommandFormat.java:89)
 at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.parseArgs(S3GuardTool.java:374)
 at 
org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$Destroy.run(S3GuardTool.java:629)
 at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:402)
{quote}
Re-reading here I see I didn't set `fs.s3a.server-side-encryption.key`.  I will 
re-run the tests with that for completeness.

The testing.md does say "The submitter of any patch is required to run *all* 
the integration tests".  Clearly I have not run *all* the tests as 179 tests 
were skipped.  I'm not sure if the correct interpretation here is the literal 
one though.  Do I really need to ensure there are no skipped tests?  I didn't 
run the scale tests, for example.  Am I required to for this change?  I'm not 
sure how many other categories of tests there are that require special 
configuration to enable.  Do I need to find all those and run them?

Also, you mention having problems in the past with OpenStack and Swift.  Do I 
need to run tests against that as well?  I'm not sure of the easiest way for me 
to do that.

> S3A input stream to use etags to detect changed source files
> 
>
> Key: HADOOP-15625
> URL: https://issues.apache.org/jira/browse/HADOOP-15625
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>Priority: Major
> Attachments: HADOOP-15625-001.patch, HADOOP-15625-002.patch, 
> HADOOP-15625-003.patch
>
>
> S3A input stream doesn't handle changing source files any better than the 
> other cloud store connectors. Specifically: it doesn't notice it has 
> changed, caches the length from startup, and whenever a seek triggers a new 
> GET, you may get one of: old data, new data, and even perhaps go from new 
> data to old data due to eventual consistency.
> We can't do anything to stop this, but we could detect changes by
> # caching the etag of the first HEAD/GET (we don't get that HEAD on open with 
> S3Guard, BTW)
> # on future GET requests, verify the etag of the response
> # raise an IOE if the remote file changed during the read.
> It's a more dramatic failure, but it stops changes silently corrupting things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Updated] (HADOOP-16018) DistCp won't reassemble chunks when blocks per chunk > 0

2019-02-13 Thread Kai Xie (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Xie updated HADOOP-16018:
-
Status: Patch Available  (was: Open)

> DistCp won't reassemble chunks when blocks per chunk > 0
> 
>
> Key: HADOOP-16018
> URL: https://issues.apache.org/jira/browse/HADOOP-16018
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Affects Versions: 2.9.2, 3.2.0
>Reporter: Kai Xie
>Assignee: Kai Xie
>Priority: Major
> Fix For: 3.0.4, 3.2.1, 3.1.3
>
> Attachments: HADOOP-16018-002.patch, HADOOP-16018-branch-2-002.patch, 
> HADOOP-16018-branch-2-002.patch, HADOOP-16018-branch-2-003.patch, 
> HADOOP-16018-branch-2-004.patch, HADOOP-16018-branch-2-004.patch, 
> HADOOP-16018-branch-2-005.patch, HADOOP-16018-branch-2-005.patch, 
> HADOOP-16018-branch-2-006.patch, HADOOP-16018.01.patch
>
>
> I was investigating why hadoop-distcp-2.9.2 won't reassemble chunks of the 
> same file when blocks per chunk has been set > 0.
> In the CopyCommitter::commitJob, this logic can prevent chunks from 
> reassembling if blocks per chunk is equal to 0:
> {code:java}
> if (blocksPerChunk > 0) {
>   concatFileChunks(conf);
> }
> {code}
> Then in CopyCommitter's ctor, blocksPerChunk is initialised from the config:
> {code:java}
> blocksPerChunk = context.getConfiguration().getInt(
> DistCpOptionSwitch.BLOCKS_PER_CHUNK.getConfigLabel(), 0);
> {code}
>  
> But here the config key DistCpOptionSwitch.BLOCKS_PER_CHUNK.getConfigLabel() 
> will always return an empty string because it is constructed without a config 
> label:
> {code:java}
> BLOCKS_PER_CHUNK("",
> new Option("blocksperchunk", true, "If set to a positive value, files"
> + "with more blocks than this value will be split into chunks of "
> + " blocks to be transferred in parallel, and "
> + "reassembled on the destination. By default,  is "
> + "0 and the files will be transmitted in their entirety without "
> + "splitting. This switch is only applicable when the source file "
> + "system implements getBlockLocations method and the target file "
> + "system implements concat method"))
> {code}
> As a result it will fall back to the default value 0 for blocksPerChunk, and 
> prevent the chunks from reassembling.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-16018) DistCp won't reassemble chunks when blocks per chunk > 0

2019-02-13 Thread Kai Xie (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Xie updated HADOOP-16018:
-
Status: Open  (was: Patch Available)

> DistCp won't reassemble chunks when blocks per chunk > 0
> 
>
> Key: HADOOP-16018
> URL: https://issues.apache.org/jira/browse/HADOOP-16018
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Affects Versions: 2.9.2, 3.2.0
>Reporter: Kai Xie
>Assignee: Kai Xie
>Priority: Major
> Fix For: 3.0.4, 3.2.1, 3.1.3
>
> Attachments: HADOOP-16018-002.patch, HADOOP-16018-branch-2-002.patch, 
> HADOOP-16018-branch-2-002.patch, HADOOP-16018-branch-2-003.patch, 
> HADOOP-16018-branch-2-004.patch, HADOOP-16018-branch-2-004.patch, 
> HADOOP-16018-branch-2-005.patch, HADOOP-16018-branch-2-005.patch, 
> HADOOP-16018-branch-2-006.patch, HADOOP-16018.01.patch
>
>
> I was investigating why hadoop-distcp-2.9.2 won't reassemble chunks of the 
> same file when blocks per chunk has been set > 0.
> In the CopyCommitter::commitJob, this logic can prevent chunks from 
> reassembling if blocks per chunk is equal to 0:
> {code:java}
> if (blocksPerChunk > 0) {
>   concatFileChunks(conf);
> }
> {code}
> Then in CopyCommitter's ctor, blocksPerChunk is initialised from the config:
> {code:java}
> blocksPerChunk = context.getConfiguration().getInt(
> DistCpOptionSwitch.BLOCKS_PER_CHUNK.getConfigLabel(), 0);
> {code}
>  
> But here the config key DistCpOptionSwitch.BLOCKS_PER_CHUNK.getConfigLabel() 
> will always return an empty string because it is constructed without a config 
> label:
> {code:java}
> BLOCKS_PER_CHUNK("",
> new Option("blocksperchunk", true, "If set to a positive value, files"
> + "with more blocks than this value will be split into chunks of "
> + " blocks to be transferred in parallel, and "
> + "reassembled on the destination. By default,  is "
> + "0 and the files will be transmitted in their entirety without "
> + "splitting. This switch is only applicable when the source file "
> + "system implements getBlockLocations method and the target file "
> + "system implements concat method"))
> {code}
> As a result it will fall back to the default value 0 for blocksPerChunk, and 
> prevent the chunks from reassembling.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-16018) DistCp won't reassemble chunks when blocks per chunk > 0

2019-02-13 Thread Kai Xie (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Xie updated HADOOP-16018:
-
Attachment: HADOOP-16018-branch-2-006.patch

> DistCp won't reassemble chunks when blocks per chunk > 0
> 
>
> Key: HADOOP-16018
> URL: https://issues.apache.org/jira/browse/HADOOP-16018
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Affects Versions: 3.2.0, 2.9.2
>Reporter: Kai Xie
>Assignee: Kai Xie
>Priority: Major
> Fix For: 3.0.4, 3.2.1, 3.1.3
>
> Attachments: HADOOP-16018-002.patch, HADOOP-16018-branch-2-002.patch, 
> HADOOP-16018-branch-2-002.patch, HADOOP-16018-branch-2-003.patch, 
> HADOOP-16018-branch-2-004.patch, HADOOP-16018-branch-2-004.patch, 
> HADOOP-16018-branch-2-005.patch, HADOOP-16018-branch-2-005.patch, 
> HADOOP-16018-branch-2-006.patch, HADOOP-16018.01.patch
>
>
> I was investigating why hadoop-distcp-2.9.2 won't reassemble chunks of the 
> same file when blocks per chunk has been set > 0.
> In the CopyCommitter::commitJob, this logic can prevent chunks from 
> reassembling if blocks per chunk is equal to 0:
> {code:java}
> if (blocksPerChunk > 0) {
>   concatFileChunks(conf);
> }
> {code}
> Then in CopyCommitter's ctor, blocksPerChunk is initialised from the config:
> {code:java}
> blocksPerChunk = context.getConfiguration().getInt(
> DistCpOptionSwitch.BLOCKS_PER_CHUNK.getConfigLabel(), 0);
> {code}
>  
> But here the config key DistCpOptionSwitch.BLOCKS_PER_CHUNK.getConfigLabel() 
> will always return an empty string because it is constructed without a config 
> label:
> {code:java}
> BLOCKS_PER_CHUNK("",
> new Option("blocksperchunk", true, "If set to a positive value, files"
> + "with more blocks than this value will be split into chunks of "
> + " blocks to be transferred in parallel, and "
> + "reassembled on the destination. By default,  is "
> + "0 and the files will be transmitted in their entirety without "
> + "splitting. This switch is only applicable when the source file "
> + "system implements getBlockLocations method and the target file "
> + "system implements concat method"))
> {code}
> As a result it will fall back to the default value 0 for blocksPerChunk, and 
> prevent the chunks from reassembling.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16105) WASB in secure mode does not set connectingUsingSAS

2019-02-13 Thread David McGinnis (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767295#comment-16767295
 ] 

David McGinnis commented on HADOOP-16105:
-

[~ste...@apache.org]: The fix itself looks fine. Thanks for taking care of this.

As for your testing, you mention above you aren't sure how best to test it. I 
am fairly certain there is a way to do a positive test on this that shows the 
fix works, but after putting in 3-4 hours a couple of nights ago on it, I 
wasn't able to get it exactly right. I was going to put more time in last night 
on this, but got tied up in other things. I'm including the test below as I 
have it now. I'm fairly certain this code (when tweaked to be correct of 
course) will give an Unauthorized exception without the change, and pass with 
the change. Feel free to use or not.

 

  @Test
  public void testConnectUsingSecureSASSuccess() throws Exception {
    // Create the test account with SAS credentials.
    Configuration conf = new Configuration();
    conf.setBoolean(AzureNativeFileSystemStore.KEY_USE_SECURE_MODE, true);
    testAccount = AzureBlobStorageTestAccount.create("",
    EnumSet.of(CreateOptions.UseSas, CreateOptions.CreateContainer),
    conf);
    assumeNotNull(testAccount);

    CloudBlobContainer container = testAccount.getRealContainer();
    AzureFileSystemInstrumentation instrumentation = new 
AzureFileSystemInstrumentation(conf);
    AzureNativeFileSystemStore store = new AzureNativeFileSystemStore();
    store.initialize(container.getUri(), conf, instrumentation);
    store.list("/", -1, -1);
  }

> WASB in secure mode does not set connectingUsingSAS
> ---
>
> Key: HADOOP-16105
> URL: https://issues.apache.org/jira/browse/HADOOP-16105
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure
>Affects Versions: 2.8.5, 3.1.2
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Attachments: HADOOP-16105-001.patch, HADOOP-16105-002.patch
>
>
> If you run WASB in secure mode, it doesn't set {{connectingUsingSAS}} to 
> true, which can break things



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16097) Provide proper documentation for FairCallQueue

2019-02-13 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767293#comment-16767293
 ] 

Erik Krogen commented on HADOOP-16097:
--

Thanks [~linyiqun]!

> Provide proper documentation for FairCallQueue
> --
>
> Key: HADOOP-16097
> URL: https://issues.apache.org/jira/browse/HADOOP-16097
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: documentation, ipc
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: FairCallQueueGuide_Rendered.pdf, HADOOP-16097.000.patch, 
> HADOOP-16097.001.patch, HADOOP-16097.002.patch, faircallqueue-overview.png
>
>
> FairCallQueue, added in HADOOP-10282, doesn't seem to be well-documented 
> anywhere. Let's add in a new documentation for it and related components.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16049) DistCp result has data and checksum mismatch when blocks per chunk > 0

2019-02-13 Thread Kai Xie (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767290#comment-16767290
 ] 

Kai Xie commented on HADOOP-16049:
--

Thanks Steve for reviewing and merging the patch!

> DistCp result has data and checksum mismatch when blocks per chunk > 0
> --
>
> Key: HADOOP-16049
> URL: https://issues.apache.org/jira/browse/HADOOP-16049
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Affects Versions: 2.9.2
>Reporter: Kai Xie
>Assignee: Kai Xie
>Priority: Major
> Fix For: 2.9.3
>
> Attachments: HADOOP-16049-branch-2-003.patch, 
> HADOOP-16049-branch-2-003.patch, HADOOP-16049-branch-2-004.patch, 
> HADOOP-16049-branch-2-005.patch
>
>
> In 2.9.2 RetriableFileCopyCommand.copyBytes,
> {code:java}
> int bytesRead = readBytes(inStream, buf, sourceOffset);
> while (bytesRead >= 0) {
>   ...
>   if (action == FileAction.APPEND) {
> sourceOffset += bytesRead;
>   }
>   ... // write to dst
>   bytesRead = readBytes(inStream, buf, sourceOffset);
> }{code}
> it does a positioned read but the position (`sourceOffset` here) is never 
> updated when blocks per chunk is set to > 0 (which always disables append 
> action). So for a chunk with offset != 0, it will keep copying the first few 
> bytes again and again, causing the result to have a data & checksum mismatch.
> To re-produce this issue, in branch-2, update BLOCK_SIZE to 10240 (> default 
> copy buffer size) in class TestDistCpSystem and run it.
> HADOOP-15292 has resolved the issue reported in this ticket in 
> trunk/branch-3.1/branch-3.2 by not using the positioned read, but has not 
> been backported to branch-2 yet
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16093) Move DurationInfo from hadoop-aws to hadoop-common org.apache.fs.impl

2019-02-13 Thread Abhishek Modi (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767216#comment-16767216
 ] 

Abhishek Modi commented on HADOOP-16093:


[~ste...@apache.org] I have attached HADOOP-16093.005.patch addressing the 
review comments. Could you please review it? Thanks.

> Move DurationInfo from hadoop-aws to hadoop-common org.apache.fs.impl
> -
>
> Key: HADOOP-16093
> URL: https://issues.apache.org/jira/browse/HADOOP-16093
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, util
>Reporter: Steve Loughran
>Assignee: Abhishek Modi
>Priority: Minor
> Attachments: HADOOP-16093.001.patch, HADOOP-16093.002.patch, 
> HADOOP-16093.003.patch, HADOOP-16093.004.patch, HADOOP-16093.005.patch
>
>
> It'd be useful to have DurationInfo usable in other places (e.g. distcp, 
> abfs, ...). But as it is in hadoop-aws under 
> {{org.apache.hadoop.fs.s3a.commit.DurationInfo}} we can't do that.
> Move it.
> We'll have to rename the Duration class in the process, as java 8 time has a 
> class of that name too. Maybe "OperationDuration", with DurationInfo a 
> subclass of that
> Probably need a test too, won't it?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-16093) Move DurationInfo from hadoop-aws to hadoop-common org.apache.fs.impl

2019-02-13 Thread Abhishek Modi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Modi updated HADOOP-16093:
---
Attachment: HADOOP-16093.005.patch

> Move DurationInfo from hadoop-aws to hadoop-common org.apache.fs.impl
> -
>
> Key: HADOOP-16093
> URL: https://issues.apache.org/jira/browse/HADOOP-16093
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, util
>Reporter: Steve Loughran
>Assignee: Abhishek Modi
>Priority: Minor
> Attachments: HADOOP-16093.001.patch, HADOOP-16093.002.patch, 
> HADOOP-16093.003.patch, HADOOP-16093.004.patch, HADOOP-16093.005.patch
>
>
> It'd be useful to have DurationInfo usable in other places (e.g. distcp, 
> abfs, ...). But as it is in hadoop-aws under 
> {{org.apache.hadoop.fs.s3a.commit.DurationInfo}} we can't do that.
> Move it.
> We'll have to rename the Duration class in the process, as java 8 time has a 
> class of that name too. Maybe "OperationDuration", with DurationInfo a 
> subclass of that
> Probably need a test too, won't it?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-16107) LocalFileSystem doesn't wrap all create() or new builder calls; may skip CRC logic

2019-02-13 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-16107:

Status: Open  (was: Patch Available)

> LocalFileSystem doesn't wrap all create() or new builder calls; may skip CRC 
> logic
> --
>
> Key: HADOOP-16107
> URL: https://issues.apache.org/jira/browse/HADOOP-16107
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.0.3, 3.3.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
> Attachments: HADOOP-16107-001.patch
>
>
> LocalFS is a subclass of filterFS, but overrides create and open so that 
> checksums are created and read. 
> MAPREDUCE-7184 has thrown up that the new builder openFile() call is being 
> forwarded to the innerFS without CRC checking. Reviewing/fixing that has 
> shown that some of the create methods aren't being correctly wrapped, so not 
> generating CRCs
> * createFile() builder
> The following create calls
> {code}
>   public FSDataOutputStream createNonRecursive(final Path f,
>   final FsPermission permission,
>   final EnumSet<CreateFlag> flags,
>   final int bufferSize,
>   final short replication,
>   final long blockSize,
>   final Progressable progress) throws IOException;
>   public FSDataOutputStream create(final Path f,
>   final FsPermission permission,
>   final EnumSet<CreateFlag> flags,
>   final int bufferSize,
>   final short replication,
>   final long blockSize,
>   final Progressable progress,
>   final Options.ChecksumOpt checksumOpt) throws IOException {
> return super.create(f, permission, flags, bufferSize, replication,
> blockSize, progress, checksumOpt);
>   }
> {code}
> This means that applications using these methods, directly or indirectly to 
> create files aren't actually generating checksums.
> Fix: implement these methods & relay to local create calls, not to the inner 
> FS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15229) Add FileSystem builder-based openFile() API to match createFile(); S3A to implement S3 Select through this API.

2019-02-13 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767082#comment-16767082
 ] 

Steve Loughran commented on HADOOP-15229:
-

bq. Some trivial fix in javadoc is required

Can you point me to where? I just ran javadoc on hadoop-common, hadoop-aws, 
mapreduce-client and hadoop-hdfs and didn't see anything unusual.

> Add FileSystem builder-based openFile() API to match createFile(); S3A to 
> implement S3 Select through this API.
> ---
>
> Key: HADOOP-15229
> URL: https://issues.apache.org/jira/browse/HADOOP-15229
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HADOOP-15229-001.patch, HADOOP-15229-002.patch, 
> HADOOP-15229-003.patch, HADOOP-15229-004.patch, HADOOP-15229-004.patch, 
> HADOOP-15229-005.patch, HADOOP-15229-006.patch, HADOOP-15229-007.patch, 
> HADOOP-15229-009.patch, HADOOP-15229-010.patch, HADOOP-15229-011.patch, 
> HADOOP-15229-012.patch, HADOOP-15229-013.patch, HADOOP-15229-014.patch, 
> HADOOP-15229-015.patch, HADOOP-15229-016.patch, HADOOP-15229-017.patch, 
> HADOOP-15229-018.patch, HADOOP-15229-019.patch, HADOOP-15229-020.patch
>
>
> Replicate HDFS-1170 and HADOOP-14365 with an API to open files.
> A key requirement of this is not HDFS, it's to put in the fadvise policy for 
> working with object stores, where getting the decision to do a full GET and 
> TCP abort on seek vs smaller GETs is fundamentally different: the wrong 
> option can cost you minutes. S3A and Azure both have adaptive policies now 
> (first backward seek), but they still don't do it that well.
> Columnar formats (ORC, Parquet) should be able to say "fs.input.fadvise" 
> "random" as an option when they open files; I can imagine other options too.
> The Builder model of [~eddyxu] is the one to mimic, method for method. 
> Ideally with as much code reuse as possible



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16108) Tail Follow Interval Should Allow To Specify The Sleep Interval To Save Unnecessary RPC's

2019-02-13 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767075#comment-16767075
 ] 

Hudson commented on HADOOP-16108:
-

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15946 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/15946/])
HADOOP-16108. Tail Follow Interval Should Allow To Specify The Sleep 
(vinayakumarb: rev 00c5ffaee2fb16eaef512a47054c7b9ee7ea2e50)
* (edit) hadoop-common-project/hadoop-common/src/test/resources/testConf.xml
* (add) 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestTail.java
* (edit) 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/Tail.java


> Tail Follow Interval Should Allow To Specify The Sleep Interval To Save 
> Unnecessary RPC's 
> --
>
> Key: HADOOP-16108
> URL: https://issues.apache.org/jira/browse/HADOOP-16108
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Harshakiran Reddy
>Assignee: Ayush Saxena
>Priority: Major
> Fix For: 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HADOOP-16108-01.patch, HDFS-14255-01.patch, 
> HDFS-14255-02.patch
>
>
> As of now tail -f follows every 5 seconds. We should allow a parameter to 
> specify this sleep interval. Linux has this configurable in the form of the -s 
> parameter.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-16108) Tail Follow Interval Should Allow To Specify The Sleep Interval To Save Unnecessary RPC's

2019-02-13 Thread Vinayakumar B (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HADOOP-16108:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.1.3
   3.2.1
   3.3.0
   Status: Resolved  (was: Patch Available)

Committed to trunk, branch-3.2 and branch-3.1

Thanks [~ayushtkn] and [~Harsha1206]

> Tail Follow Interval Should Allow To Specify The Sleep Interval To Save 
> Unnecessary RPC's 
> --
>
> Key: HADOOP-16108
> URL: https://issues.apache.org/jira/browse/HADOOP-16108
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Harshakiran Reddy
>Assignee: Ayush Saxena
>Priority: Major
> Fix For: 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HADOOP-16108-01.patch, HDFS-14255-01.patch, 
> HDFS-14255-02.patch
>
>
> As of now tail -f follows every 5 seconds. We should allow a parameter to 
> specify this sleep interval. Linux has this configurable in the form of the -s 
> parameter.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15625) S3A input stream to use etags to detect changed source files

2019-02-13 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767032#comment-16767032
 ] 

Steve Loughran commented on HADOOP-15625:
-

Ben: you've got the permissions to attach a patch now. Have a look at the 
testing.md doc for hadoop-aws to see the homework you get to do.

At a quick look at the patch:

* you'll need to change that describe() statement
* I'd like to retain a test of larger -> shorter as well as shorter -> larger, 
because for openstack swift it was the first sequence which failed the most
* what exception gets raised? 

I'd like to have this fail with some special subclass of EOFException, e.g. 
RemoteFileChangedException or similar, because EOFExceptions are 
expected/processed specially in layers above. And we'll need to add some 
details to the troubleshooting.md file.
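
For the record, the shape of the before/after test I mean, as a rough sketch 
only: it assumes the contract-test helpers and uses plain EOFException as a 
stand-in for whatever the change-detection exception ends up being called.

{code:java}
import java.io.EOFException;

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.contract.ContractTestUtils;
import org.apache.hadoop.test.LambdaTestUtils;

/** Sketch: overwrite a file that is mid-read and expect the change to be detected. */
final class ChangedSourceReadCheck {
  static void expectChangeDetected(FileSystem fs, Path path,
      int firstLength, int secondLength) throws Exception {
    // write the first version and start reading it
    ContractTestUtils.writeDataset(fs, path,
        ContractTestUtils.dataset(firstLength, 'a', 26), firstLength, 1024, true);
    try (FSDataInputStream in = fs.open(path)) {
      in.read();
      // overwrite with a version of a different length (longer or shorter)
      ContractTestUtils.writeDataset(fs, path,
          ContractTestUtils.dataset(secondLength, 'A', 26), secondLength, 1024, true);
      // seek backwards so the next read has to reopen the object
      in.seek(0);
      // would be the dedicated change-detection exception in the real test
      LambdaTestUtils.intercept(EOFException.class, () -> in.read());
    }
  }
}
{code}

Run once with firstLength > secondLength and once the other way round to 
cover both sequences.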


> S3A input stream to use etags to detect changed source files
> 
>
> Key: HADOOP-15625
> URL: https://issues.apache.org/jira/browse/HADOOP-15625
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>Priority: Major
> Attachments: HADOOP-15625-001.patch, HADOOP-15625-002.patch, 
> HADOOP-15625-003.patch
>
>
> S3A input stream doesn't handle changing source files any better than the 
> other cloud store connectors. Specifically: it doesn't notice it has 
> changed, caches the length from startup, and whenever a seek triggers a new 
> GET, you may get one of: old data, new data, and even perhaps go from new 
> data to old data due to eventual consistency.
> We can't do anything to stop this, but we could detect changes by
> # caching the etag of the first HEAD/GET (we don't get that HEAD on open with 
> S3Guard, BTW)
> # on future GET requests, verify the etag of the response
> # raise an IOE if the remote file changed during the read.
> It's a more dramatic failure, but it stops changes silently corrupting things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16104) Wasb tests to downgrade to skip when test a/c is namespace enabled

2019-02-13 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767023#comment-16767023
 ] 

Steve Loughran commented on HADOOP-16104:
-

thanks. 

> Wasb tests to downgrade to skip when test a/c is namespace enabled
> --
>
> Key: HADOOP-16104
> URL: https://issues.apache.org/jira/browse/HADOOP-16104
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure, test
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Assignee: Masatake Iwasaki
>Priority: Major
>
> When you run the abfs tests with a namespace-enabled account, all the wasb 
> tests fail with "don't yet work with namespace-enabled accounts". This should be 
> downgraded to a test skip, somehow.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16107) LocalFileSystem doesn't wrap all create() or new builder calls; may skip CRC logic

2019-02-13 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767003#comment-16767003
 ] 

Steve Loughran commented on HADOOP-16107:
-

The checkstyle warning is a line-length issue in a test:
{code}
./hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestLocalFileSystem.java:924:
final Path file = new Path(TEST_ROOT_DIR, 
"testByteCountersThroughBuilders");: Line is longer than 80 characters (found 
81). [LineLength]
{code}
The javac warning is an unavoidable use of a deprecated API:
{code}
[WARNING] 
/testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestLocalFileSystem.java:[789,43]
 [deprecation] getAllStatistics() in FileSystem has been deprecated
{code}

> LocalFileSystem doesn't wrap all create() or new builder calls; may skip CRC 
> logic
> --
>
> Key: HADOOP-16107
> URL: https://issues.apache.org/jira/browse/HADOOP-16107
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.0.3, 3.3.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
> Attachments: HADOOP-16107-001.patch
>
>
> LocalFS is a subclass of filterFS, but overrides create and open so that 
> checksums are created and read. 
> MAPREDUCE-7184 has thrown up that the new builder openFile() call is being 
> forwarded to the innerFS without CRC checking. Reviewing/fixing that has 
> shown that some of the create methods aren't being correctly wrapped, so not 
> generating CRCs
> * createFile() builder
> The following create calls
> {code}
>   public FSDataOutputStream createNonRecursive(final Path f,
>   final FsPermission permission,
>   final EnumSet<CreateFlag> flags,
>   final int bufferSize,
>   final short replication,
>   final long blockSize,
>   final Progressable progress) throws IOException;
>   public FSDataOutputStream create(final Path f,
>   final FsPermission permission,
>   final EnumSet<CreateFlag> flags,
>   final int bufferSize,
>   final short replication,
>   final long blockSize,
>   final Progressable progress,
>   final Options.ChecksumOpt checksumOpt) throws IOException {
> return super.create(f, permission, flags, bufferSize, replication,
> blockSize, progress, checksumOpt);
>   }
> {code}
> This means that applications using these methods, directly or indirectly to 
> create files aren't actually generating checksums.
> Fix: implement these methods & relay to local create calls, not to the inner 
> FS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16107) LocalFileSystem doesn't wrap all create() or new builder calls; may skip CRC logic

2019-02-13 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767001#comment-16767001
 ] 

Steve Loughran commented on HADOOP-16107:
-

It's the JUnit fork timeout on MR again. I'll have to submit a patch which skips 
that MR test, but I'll run the test locally to verify it is happy.

> LocalFileSystem doesn't wrap all create() or new builder calls; may skip CRC 
> logic
> --
>
> Key: HADOOP-16107
> URL: https://issues.apache.org/jira/browse/HADOOP-16107
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.0.3, 3.3.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
> Attachments: HADOOP-16107-001.patch
>
>
> LocalFS is a subclass of filterFS, but overrides create and open so that 
> checksums are created and read. 
> MAPREDUCE-7184 has thrown up that the new builder openFile() call is being 
> forwarded to the innerFS without CRC checking. Reviewing/fixing that has 
> shown that some of the create methods aren't being correctly wrapped, so not 
> generating CRCs
> * createFile() builder
> The following create calls
> {code}
>   public FSDataOutputStream createNonRecursive(final Path f,
>   final FsPermission permission,
>   final EnumSet<CreateFlag> flags,
>   final int bufferSize,
>   final short replication,
>   final long blockSize,
>   final Progressable progress) throws IOException;
>   public FSDataOutputStream create(final Path f,
>   final FsPermission permission,
>   final EnumSet<CreateFlag> flags,
>   final int bufferSize,
>   final short replication,
>   final long blockSize,
>   final Progressable progress,
>   final Options.ChecksumOpt checksumOpt) throws IOException {
> return super.create(f, permission, flags, bufferSize, replication,
> blockSize, progress, checksumOpt);
>   }
> {code}
> This means that applications using these methods, directly or indirectly to 
> create files aren't actually generating checksums.
> Fix: implement these methods & relay to local create calls, not to the inner 
> FS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16102) FilterFileSystem does not implement getScheme

2019-02-13 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766999#comment-16766999
 ] 

Steve Loughran commented on HADOOP-16102:
-

HADOOP-16107 relates to this, as it's about other non-overridden methods.



> FilterFileSystem does not implement getScheme
> -
>
> Key: HADOOP-16102
> URL: https://issues.apache.org/jira/browse/HADOOP-16102
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Reporter: Todd Owen
>Priority: Minor
>
> Calling {{getScheme}} on a {{FilterFileSystem}} throws 
> {{UnsupportedOperationException}}, which is the default provided by the base 
> class. Instead, it should return the scheme of the underlying ("filtered") 
> filesystem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org