[jira] [Commented] (HDFS-14267) Add test_libhdfs_ops to libhdfs tests, mark libhdfs_read/write.c as examples

2019-02-15 Thread Sean Mackrory (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769481#comment-16769481
 ] 

Sean Mackrory commented on HDFS-14267:
--

+1, LGTM. I need to run tests before I commit and there's some issue with my 
protobuf dependency I need to work through...

> Add test_libhdfs_ops to libhdfs tests, mark libhdfs_read/write.c as examples
> 
>
> Key: HDFS-14267
> URL: https://issues.apache.org/jira/browse/HDFS-14267
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: libhdfs, native, test
>Reporter: Sahil Takiar
>Priority: Major
> Attachments: HDFS-14267.001.patch, HDFS-14267.002.patch
>
>
> {{test_libhdfs_ops.c}} provides test coverage for basic operations against 
> libhdfs, but currently has to be run manually (e.g. {{mvn install}} does not 
> run these tests). The goal of this patch is to add {{test_libhdfs_ops.c}} to 
> the list of tests that are automatically run for libhdfs.
> It looks like {{test_libhdfs_ops.c}} was used in conjunction with 
> {{hadoop-hdfs-project/hadoop-hdfs/src/main/native/tests/test-libhdfs.sh}} to 
> run some tests against a mini DFS cluster. Now that the 
> {{NativeMiniDfsCluster}} exists, it makes more sense to use that rather than 
> rely on an external bash script to start a mini DFS cluster.
> The {{libhdfs-tests}} directory (which contains {{test_libhdfs_ops.c}}) 
> contains two other files: {{test_libhdfs_read.c}} and 
> {{test_libhdfs_write.c}}. At some point, these files might have been used in 
> conjunction with {{test-libhdfs.sh}} to run some tests manually. However, 
> they (1) largely overlap with the test coverage provided by 
> {{test_libhdfs_ops.c}} and (2) are not designed to be run as unit tests. Thus 
> I suggest we move these two files into a new folder called 
> {{libhdfs-examples}} and use them to further document how users of libhdfs 
> can use the API. We can move {{test-libhdfs.sh}} into the examples folder as 
> well given that example files probably require the script to actually work.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14101) Random failure of testListCorruptFilesCorruptedBlock

2018-12-10 Thread Sean Mackrory (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-14101:
-
   Resolution: Fixed
Fix Version/s: 3.3.0
   Status: Resolved  (was: Patch Available)

> Random failure of testListCorruptFilesCorruptedBlock
> 
>
> Key: HDFS-14101
> URL: https://issues.apache.org/jira/browse/HDFS-14101
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.2.0, 3.0.3, 2.8.5
>Reporter: Kihwal Lee
>Assignee: Zsolt Venczel
>Priority: Major
>  Labels: newbie
> Fix For: 3.3.0
>
> Attachments: HDFS-14101.01.patch, HDFS-14101.02.patch
>
>
> We've seen this occasionally.
> {noformat}
> java.lang.IllegalArgumentException: Negative position
>   at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:755)
>   at org.apache.hadoop.hdfs.server.namenode.
>  
> TestListCorruptFileBlocks.testListCorruptFilesCorruptedBlock(TestListCorruptFileBlocks.java:105)
> {noformat}
> The test has a flaw.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14101) Random failure of testListCorruptFilesCorruptedBlock

2018-12-10 Thread Sean Mackrory (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16714982#comment-16714982
 ] 

Sean Mackrory commented on HDFS-14101:
--

+1, thanks Zsolt. I think the shared variable is an improvement, but I still 
think it's worth a comment, so I'm gonna add the following at the declaration 
of corruptionLength if no one objects:

{code}
// Files are corrupted with 2 bytes before the end of the file,
// so that's the minimum length
{code}

Of course, as I understand it, there's still a 1:65,536 chance the corruption 
isn't detected because the random bytes we overwrite with are identical to 
those that were there originally, but this is a step in the right direction :)
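
For context, here is a minimal sketch of the kind of corruption the test performs (a hypothetical helper, not the actual TestListCorruptFileBlocks code): the last two bytes of a block file are overwritten with random data, which is why 2 is the minimum corruption length and why there is a 1-in-65,536 chance the replacement bytes happen to match the originals.

{code:java}
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Random;

public class CorruptBlockTail {
  /** Overwrite the last two bytes of a block file with random data. */
  static void corruptLastTwoBytes(File blockFile) throws IOException {
    byte[] junk = new byte[2];
    new Random().nextBytes(junk);  // 1 in 256^2 = 65,536 chance of matching the originals
    try (RandomAccessFile raf = new RandomAccessFile(blockFile, "rw")) {
      raf.seek(raf.length() - 2);  // a file shorter than 2 bytes makes this position negative
      raf.write(junk);
    }
  }
}
{code}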

> Random failure of testListCorruptFilesCorruptedBlock
> 
>
> Key: HDFS-14101
> URL: https://issues.apache.org/jira/browse/HDFS-14101
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.2.0, 3.0.3, 2.8.5
>Reporter: Kihwal Lee
>Assignee: Zsolt Venczel
>Priority: Major
>  Labels: newbie
> Attachments: HDFS-14101.01.patch, HDFS-14101.02.patch
>
>
> We've seen this occasionally.
> {noformat}
> java.lang.IllegalArgumentException: Negative position
>   at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:755)
>   at org.apache.hadoop.hdfs.server.namenode.
>  
> TestListCorruptFileBlocks.testListCorruptFilesCorruptedBlock(TestListCorruptFileBlocks.java:105)
> {noformat}
> The test has a flaw.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13744) OIV tool should better handle control characters present in file or directory names

2018-09-12 Thread Sean Mackrory (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-13744:
-
   Resolution: Fixed
Fix Version/s: 3.2.0
   Status: Resolved  (was: Patch Available)

> OIV tool should better handle control characters present in file or directory 
> names
> ---
>
> Key: HDFS-13744
> URL: https://issues.apache.org/jira/browse/HDFS-13744
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, tools
>Affects Versions: 2.6.5, 2.9.1, 2.8.4, 2.7.6, 3.0.3
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>Priority: Critical
> Fix For: 3.2.0
>
> Attachments: HDFS-13744.01.patch, HDFS-13744.02.patch, 
> HDFS-13744.03.patch
>
>
> In certain cases when control characters or white space is present in file or 
> directory names OIV tool processors can export data in a misleading format.
> In the below examples we have EXAMPLE_NAME as a file and a directory name 
> where the directory has a line feed character at the end (the actual 
> production case has multiple line feeds and multiple spaces)
>  * Delimited processor case:
>  ** misleading example:
> {code:java}
> /user/data/EXAMPLE_NAME
> ,0,2017-04-24 04:34,1969-12-31 16:00,0,0,0,-1,-1,drwxrwxr-x+,user,group
> /user/data/EXAMPLE_NAME,2016-08-26 03:00,2017-05-16 
> 10:05,134217728,1,520,0,0,-rw-rwxr--+,user,group
> {code}
>  * 
>  ** expected example as suggested by 
> [https://tools.ietf.org/html/rfc4180#section-2]:
> {code:java}
> "/user/data/EXAMPLE_NAME%x0A",0,2017-04-24 04:34,1969-12-31 
> 16:00,0,0,0,-1,-1,drwxrwxr-x+,user,group
> "/user/data/EXAMPLE_NAME",2016-08-26 03:00,2017-05-16 
> 10:05,134217728,1,520,0,0,-rw-rwxr--+,user,group
> {code}
>  * XML processor case:
>  ** misleading example:
> {code:java}
> 479867791DIRECTORYEXAMPLE_NAME
> 1493033668294user:group:0775
> 113632535FILEEXAMPLE_NAME314722056575041494954320141134217728user:group:0674
> {code}
>  * 
>  ** expected example as specified in 
> [https://www.w3.org/TR/REC-xml/#sec-line-ends]:
> {code:java}
> 479867791DIRECTORYEXAMPLE_NAME#xA1493033668294user:group:0775
> 113632535FILEEXAMPLE_NAME314722056575041494954320141134217728user:group:0674
> {code}
>  * JSON:
>  The OIV Web Processor behaves correctly and produces the following:
> {code:java}
> {
>   "FileStatuses": {
> "FileStatus": [
>   {
> "fileId": 113632535,
> "accessTime": 1494954320141,
> "replication": 3,
> "owner": "user",
> "length": 520,
> "permission": "674",
> "blockSize": 134217728,
> "modificationTime": 1472205657504,
> "type": "FILE",
> "group": "group",
> "childrenNum": 0,
> "pathSuffix": "EXAMPLE_NAME"
>   },
>   {
> "fileId": 479867791,
> "accessTime": 0,
> "replication": 0,
> "owner": "user",
> "length": 0,
> "permission": "775",
> "blockSize": 0,
> "modificationTime": 1493033668294,
> "type": "DIRECTORY",
> "group": "group",
> "childrenNum": 0,
> "pathSuffix": "EXAMPLE_NAME\n"
>   }
> ]
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13744) OIV tool should better handle control characters present in file or directory names

2018-09-07 Thread Sean Mackrory (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16607503#comment-16607503
 ] 

Sean Mackrory commented on HDFS-13744:
--

I concur. Committed.

> OIV tool should better handle control characters present in file or directory 
> names
> ---
>
> Key: HDFS-13744
> URL: https://issues.apache.org/jira/browse/HDFS-13744
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, tools
>Affects Versions: 2.6.5, 2.9.1, 2.8.4, 2.7.6, 3.0.3
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>Priority: Critical
> Attachments: HDFS-13744.01.patch, HDFS-13744.02.patch, 
> HDFS-13744.03.patch
>
>
> In certain cases when control characters or white space is present in file or 
> directory names OIV tool processors can export data in a misleading format.
> In the below examples we have EXAMPLE_NAME as a file and a directory name 
> where the directory has a line feed character at the end (the actual 
> production case has multiple line feeds and multiple spaces)
>  * Delimited processor case:
>  ** misleading example:
> {code:java}
> /user/data/EXAMPLE_NAME
> ,0,2017-04-24 04:34,1969-12-31 16:00,0,0,0,-1,-1,drwxrwxr-x+,user,group
> /user/data/EXAMPLE_NAME,2016-08-26 03:00,2017-05-16 
> 10:05,134217728,1,520,0,0,-rw-rwxr--+,user,group
> {code}
>  * 
>  ** expected example as suggested by 
> [https://tools.ietf.org/html/rfc4180#section-2]:
> {code:java}
> "/user/data/EXAMPLE_NAME%x0A",0,2017-04-24 04:34,1969-12-31 
> 16:00,0,0,0,-1,-1,drwxrwxr-x+,user,group
> "/user/data/EXAMPLE_NAME",2016-08-26 03:00,2017-05-16 
> 10:05,134217728,1,520,0,0,-rw-rwxr--+,user,group
> {code}
>  * XML processor case:
>  ** misleading example:
> {code:java}
> 479867791DIRECTORYEXAMPLE_NAME
> 1493033668294user:group:0775
> 113632535FILEEXAMPLE_NAME314722056575041494954320141134217728user:group:0674
> {code}
>  * 
>  ** expected example as specified in 
> [https://www.w3.org/TR/REC-xml/#sec-line-ends]:
> {code:java}
> 479867791DIRECTORYEXAMPLE_NAME#xA1493033668294user:group:0775
> 113632535FILEEXAMPLE_NAME314722056575041494954320141134217728user:group:0674
> {code}
>  * JSON:
>  The OIV Web Processor behaves correctly and produces the following:
> {code:java}
> {
>   "FileStatuses": {
> "FileStatus": [
>   {
> "fileId": 113632535,
> "accessTime": 1494954320141,
> "replication": 3,
> "owner": "user",
> "length": 520,
> "permission": "674",
> "blockSize": 134217728,
> "modificationTime": 1472205657504,
> "type": "FILE",
> "group": "group",
> "childrenNum": 0,
> "pathSuffix": "EXAMPLE_NAME"
>   },
>   {
> "fileId": 479867791,
> "accessTime": 0,
> "replication": 0,
> "owner": "user",
> "length": 0,
> "permission": "775",
> "blockSize": 0,
> "modificationTime": 1493033668294,
> "type": "DIRECTORY",
> "group": "group",
> "childrenNum": 0,
> "pathSuffix": "EXAMPLE_NAME\n"
>   }
> ]
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13744) OIV tool should better handle control characters present in file or directory names

2018-09-04 Thread Sean Mackrory (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603615#comment-16603615
 ] 

Sean Mackrory commented on HDFS-13744:
--

Looks good to me, except CR/LF was being escaped as LF, so I attached .003 
with a trivial change that escapes both characters. If it's cool with you, I'll 
commit that version.
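
As an illustration only (not the actual patch), escaping both control characters in the style the expected delimited output below uses for LF ({{%x0A}}) could look roughly like this, with a hypothetical helper applied to names before they are written out:

{code:java}
public class DelimitedNameEscaper {
  /**
   * Hypothetical helper: make CR and LF visible in delimited output instead of
   * letting them break the row, mirroring the "%x0A" form shown in the
   * expected example in the description.
   */
  static String escapeCrLf(String name) {
    return name.replace("\r", "%x0D").replace("\n", "%x0A");
  }
}
{code}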

> OIV tool should better handle control characters present in file or directory 
> names
> ---
>
> Key: HDFS-13744
> URL: https://issues.apache.org/jira/browse/HDFS-13744
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, tools
>Affects Versions: 2.6.5, 2.9.1, 2.8.4, 2.7.6, 3.0.3
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>Priority: Critical
> Attachments: HDFS-13744.01.patch, HDFS-13744.02.patch, 
> HDFS-13744.03.patch
>
>
> In certain cases when control characters or white space is present in file or 
> directory names OIV tool processors can export data in a misleading format.
> In the below examples we have EXAMPLE_NAME as a file and a directory name 
> where the directory has a line feed character at the end (the actual 
> production case has multiple line feeds and multiple spaces)
>  * Delimited processor case:
>  ** misleading example:
> {code:java}
> /user/data/EXAMPLE_NAME
> ,0,2017-04-24 04:34,1969-12-31 16:00,0,0,0,-1,-1,drwxrwxr-x+,user,group
> /user/data/EXAMPLE_NAME,2016-08-26 03:00,2017-05-16 
> 10:05,134217728,1,520,0,0,-rw-rwxr--+,user,group
> {code}
>  * 
>  ** expected example as suggested by 
> [https://tools.ietf.org/html/rfc4180#section-2]:
> {code:java}
> "/user/data/EXAMPLE_NAME%x0A",0,2017-04-24 04:34,1969-12-31 
> 16:00,0,0,0,-1,-1,drwxrwxr-x+,user,group
> "/user/data/EXAMPLE_NAME",2016-08-26 03:00,2017-05-16 
> 10:05,134217728,1,520,0,0,-rw-rwxr--+,user,group
> {code}
>  * XML processor case:
>  ** misleading example:
> {code:java}
> 479867791DIRECTORYEXAMPLE_NAME
> 1493033668294user:group:0775
> 113632535FILEEXAMPLE_NAME314722056575041494954320141134217728user:group:0674
> {code}
>  * 
>  ** expected example as specified in 
> [https://www.w3.org/TR/REC-xml/#sec-line-ends]:
> {code:java}
> 479867791DIRECTORYEXAMPLE_NAME#xA1493033668294user:group:0775
> 113632535FILEEXAMPLE_NAME314722056575041494954320141134217728user:group:0674
> {code}
>  * JSON:
>  The OIV Web Processor behaves correctly and produces the following:
> {code:java}
> {
>   "FileStatuses": {
> "FileStatus": [
>   {
> "fileId": 113632535,
> "accessTime": 1494954320141,
> "replication": 3,
> "owner": "user",
> "length": 520,
> "permission": "674",
> "blockSize": 134217728,
> "modificationTime": 1472205657504,
> "type": "FILE",
> "group": "group",
> "childrenNum": 0,
> "pathSuffix": "EXAMPLE_NAME"
>   },
>   {
> "fileId": 479867791,
> "accessTime": 0,
> "replication": 0,
> "owner": "user",
> "length": 0,
> "permission": "775",
> "blockSize": 0,
> "modificationTime": 1493033668294,
> "type": "DIRECTORY",
> "group": "group",
> "childrenNum": 0,
> "pathSuffix": "EXAMPLE_NAME\n"
>   }
> ]
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13744) OIV tool should better handle control characters present in file or directory names

2018-09-04 Thread Sean Mackrory (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-13744:
-
Attachment: HDFS-13744.03.patch

> OIV tool should better handle control characters present in file or directory 
> names
> ---
>
> Key: HDFS-13744
> URL: https://issues.apache.org/jira/browse/HDFS-13744
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, tools
>Affects Versions: 2.6.5, 2.9.1, 2.8.4, 2.7.6, 3.0.3
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>Priority: Critical
> Attachments: HDFS-13744.01.patch, HDFS-13744.02.patch, 
> HDFS-13744.03.patch
>
>
> In certain cases when control characters or white space is present in file or 
> directory names OIV tool processors can export data in a misleading format.
> In the below examples we have EXAMPLE_NAME as a file and a directory name 
> where the directory has a line feed character at the end (the actual 
> production case has multiple line feeds and multiple spaces)
>  * Delimited processor case:
>  ** misleading example:
> {code:java}
> /user/data/EXAMPLE_NAME
> ,0,2017-04-24 04:34,1969-12-31 16:00,0,0,0,-1,-1,drwxrwxr-x+,user,group
> /user/data/EXAMPLE_NAME,2016-08-26 03:00,2017-05-16 
> 10:05,134217728,1,520,0,0,-rw-rwxr--+,user,group
> {code}
>  * 
>  ** expected example as suggested by 
> [https://tools.ietf.org/html/rfc4180#section-2]:
> {code:java}
> "/user/data/EXAMPLE_NAME%x0A",0,2017-04-24 04:34,1969-12-31 
> 16:00,0,0,0,-1,-1,drwxrwxr-x+,user,group
> "/user/data/EXAMPLE_NAME",2016-08-26 03:00,2017-05-16 
> 10:05,134217728,1,520,0,0,-rw-rwxr--+,user,group
> {code}
>  * XML processor case:
>  ** misleading example:
> {code:java}
> 479867791DIRECTORYEXAMPLE_NAME
> 1493033668294user:group:0775
> 113632535FILEEXAMPLE_NAME314722056575041494954320141134217728user:group:0674
> {code}
>  * 
>  ** expected example as specified in 
> [https://www.w3.org/TR/REC-xml/#sec-line-ends]:
> {code:java}
> 479867791DIRECTORYEXAMPLE_NAME#xA1493033668294user:group:0775
> 113632535FILEEXAMPLE_NAME314722056575041494954320141134217728user:group:0674
> {code}
>  * JSON:
>  The OIV Web Processor behaves correctly and produces the following:
> {code:java}
> {
>   "FileStatuses": {
> "FileStatus": [
>   {
> "fileId": 113632535,
> "accessTime": 1494954320141,
> "replication": 3,
> "owner": "user",
> "length": 520,
> "permission": "674",
> "blockSize": 134217728,
> "modificationTime": 1472205657504,
> "type": "FILE",
> "group": "group",
> "childrenNum": 0,
> "pathSuffix": "EXAMPLE_NAME"
>   },
>   {
> "fileId": 479867791,
> "accessTime": 0,
> "replication": 0,
> "owner": "user",
> "length": 0,
> "permission": "775",
> "blockSize": 0,
> "modificationTime": 1493033668294,
> "type": "DIRECTORY",
> "group": "group",
> "childrenNum": 0,
> "pathSuffix": "EXAMPLE_NAME\n"
>   }
> ]
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13744) OIV tool should better handle control characters present in file or directory names

2018-08-22 Thread Sean Mackrory (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588981#comment-16588981
 ] 

Sean Mackrory commented on HDFS-13744:
--

We should probably also handle StringUtils.CR. Hadoop is sometimes used from 
Windows clients too.

I'm a little bit torn about not escaping the XML. If someone is embedding 
control characters in filenames, even if that is technically allowed and there 
are standards specifying how that is to be encoded / decoded, I think it's 
likely to cause problems, and I would want those characters to show up 
obviously in a report. I suspect there's a good chance that those characters 
are the reason someone is trying to inspect the image in the first place :) But 
I also don't want to cause practical problems in XML parsers. I can see an 
argument either way - like I said I'm a bit torn and want to think about it...
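
For the XML side of the question, this is roughly what escaping the characters as numeric character references (per the W3C link in the description) would look like; it is illustrative only, not a position on whether the XML processor should do it:

{code:java}
public class XmlControlCharEscaper {
  /**
   * Illustrative only: encode CR and LF as XML numeric character references
   * so parsers accept the value while the characters stay visible in a report.
   */
  static String escapeForXml(String value) {
    return value.replace("\r", "&#xD;").replace("\n", "&#xA;");
  }

  public static void main(String[] args) {
    System.out.println(escapeForXml("EXAMPLE_NAME\n"));  // prints EXAMPLE_NAME&#xA;
  }
}
{code}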

> OIV tool should better handle control characters present in file or directory 
> names
> ---
>
> Key: HDFS-13744
> URL: https://issues.apache.org/jira/browse/HDFS-13744
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, tools
>Affects Versions: 2.6.5, 2.9.1, 2.8.4, 2.7.6, 3.0.3
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>Priority: Critical
> Attachments: HDFS-13744.01.patch
>
>
> In certain cases when control characters or white space is present in file or 
> directory names OIV tool processors can export data in a misleading format.
> In the below examples we have EXAMPLE_NAME as a file and a directory name 
> where the directory has a line feed character at the end (the actual 
> production case has multiple line feeds and multiple spaces)
>  * Delimited processor case:
>  ** misleading example:
> {code:java}
> /user/data/EXAMPLE_NAME
> ,0,2017-04-24 04:34,1969-12-31 16:00,0,0,0,-1,-1,drwxrwxr-x+,user,group
> /user/data/EXAMPLE_NAME,2016-08-26 03:00,2017-05-16 
> 10:05,134217728,1,520,0,0,-rw-rwxr--+,user,group
> {code}
>  * 
>  ** expected example as suggested by 
> [https://tools.ietf.org/html/rfc4180#section-2]:
> {code:java}
> "/user/data/EXAMPLE_NAME%x0A",0,2017-04-24 04:34,1969-12-31 
> 16:00,0,0,0,-1,-1,drwxrwxr-x+,user,group
> "/user/data/EXAMPLE_NAME",2016-08-26 03:00,2017-05-16 
> 10:05,134217728,1,520,0,0,-rw-rwxr--+,user,group
> {code}
>  * XML processor case:
>  ** misleading example:
> {code:java}
> 479867791DIRECTORYEXAMPLE_NAME
> 1493033668294user:group:0775
> 113632535FILEEXAMPLE_NAME314722056575041494954320141134217728user:group:0674
> {code}
>  * 
>  ** expected example as specified in 
> [https://www.w3.org/TR/REC-xml/#sec-line-ends]:
> {code:java}
> 479867791DIRECTORYEXAMPLE_NAME#xA1493033668294user:group:0775
> 113632535FILEEXAMPLE_NAME314722056575041494954320141134217728user:group:0674
> {code}
>  * JSON:
>  The OIV Web Processor behaves correctly and produces the following:
> {code:java}
> {
>   "FileStatuses": {
> "FileStatus": [
>   {
> "fileId": 113632535,
> "accessTime": 1494954320141,
> "replication": 3,
> "owner": "user",
> "length": 520,
> "permission": "674",
> "blockSize": 134217728,
> "modificationTime": 1472205657504,
> "type": "FILE",
> "group": "group",
> "childrenNum": 0,
> "pathSuffix": "EXAMPLE_NAME"
>   },
>   {
> "fileId": 479867791,
> "accessTime": 0,
> "replication": 0,
> "owner": "user",
> "length": 0,
> "permission": "775",
> "blockSize": 0,
> "modificationTime": 1493033668294,
> "type": "DIRECTORY",
> "group": "group",
> "childrenNum": 0,
> "pathSuffix": "EXAMPLE_NAME\n"
>   }
> ]
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13486) Backport HDFS-11817 (A faulty node can cause a lease leak and NPE on accessing data) to branch-2.7

2018-07-20 Thread Sean Mackrory (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16551376#comment-16551376
 ] 

Sean Mackrory commented on HDFS-13486:
--

This can cause a similar failure to HDFS-7524 if you backport this without 
backporting HDFS-12299.

> Backport HDFS-11817 (A faulty node can cause a lease leak and NPE on 
> accessing data) to branch-2.7
> --
>
> Key: HDFS-13486
> URL: https://issues.apache.org/jira/browse/HDFS-13486
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Fix For: 2.7.7
>
> Attachments: HDFS-11817.branch-2.7.001.patch, 
> HDFS-11817.branch-2.7.002.patch
>
>
> HDFS-11817 is a good fix to have in branch-2.7.
> I'm taking a stab at it now.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13582) Improve backward compatibility for HDFS-13176 (WebHdfs file path gets truncated when having semicolon (;) inside)

2018-06-15 Thread Sean Mackrory (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-13582:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed your branch-2 patch. Thanks, [~zvenczel]!

> Improve backward compatibility for HDFS-13176 (WebHdfs file path gets 
> truncated when having semicolon (;) inside)
> -
>
> Key: HDFS-13582
> URL: https://issues.apache.org/jira/browse/HDFS-13582
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: HDFS-13582-branch-2.01.patch, HDFS-13582.01.patch, 
> HDFS-13582.02.patch
>
>
> Encode special characters only if necessary in order to improve backward 
> compatibility in the following scenario:
> new (having HDFS-13176) WebHdfs client -> old (not having HDFS-13176) 
> WebHdfs server 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13582) Improve backward compatibility for HDFS-13176 (WebHdfs file path gets truncated when having semicolon (;) inside)

2018-05-31 Thread Sean Mackrory (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16497543#comment-16497543
 ] 

Sean Mackrory commented on HDFS-13582:
--

Integration failures appear to be protoc version related again. I've pushed 
this fix to branch-3.1 since the original fix that introduced the 
incompatibility was also there. We also put that original commit in branch-2. 
[~zvenczel] - do you wanna post a version of this patch for branch-2 as well?

> Improve backward compatibility for HDFS-13176 (WebHdfs file path gets 
> truncated when having semicolon (;) inside)
> -
>
> Key: HDFS-13582
> URL: https://issues.apache.org/jira/browse/HDFS-13582
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: HDFS-13582.01.patch, HDFS-13582.02.patch
>
>
> Encode special characters only if necessary in order to improve backward 
> compatibility in the following scenario:
> new (having HDFS-13176) WebHdfs client -> old (not having HDFS-13176) 
> WebHdfs server 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13176) WebHdfs file path gets truncated when having semicolon (;) inside

2018-05-31 Thread Sean Mackrory (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16497542#comment-16497542
 ] 

Sean Mackrory commented on HDFS-13176:
--

Ok - I've gone ahead and pushed the followup fix to branch-3.1.

> WebHdfs file path gets truncated when having semicolon (;) inside
> -
>
> Key: HDFS-13176
> URL: https://issues.apache.org/jira/browse/HDFS-13176
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.0.0
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>Priority: Major
> Fix For: 2.10.0, 3.2.0
>
> Attachments: HDFS-13176-branch-2.01.patch, 
> HDFS-13176-branch-2.03.patch, HDFS-13176-branch-2.03.patch, 
> HDFS-13176-branch-2.03.patch, HDFS-13176-branch-2.03.patch, 
> HDFS-13176-branch-2.03.patch, HDFS-13176-branch-2.04.patch, 
> HDFS-13176-branch-2_yetus.log, HDFS-13176.01.patch, HDFS-13176.02.patch, 
> TestWebHdfsUrl.testWebHdfsSpecialCharacterFile.patch
>
>
> Find attached a patch having a test case that tries to reproduce the problem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13176) WebHdfs file path gets truncated when having semicolon (;) inside

2018-05-31 Thread Sean Mackrory (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16496640#comment-16496640
 ] 

Sean Mackrory commented on HDFS-13176:
--

Note that the above failure is actually in reference to HDFS-13582 and not 
HDFS-13176 (this one). The failure appears to be because our protoc versions 
are messed up again.

[~wangda] - I owe you an apology. I looked in the git-log to confirm where else 
I needed to put the HDFS-13582 patch that addresses this incompatibility. It 
appears that I committed this to branch-3.1 after code freeze (and I don't know 
why I did - but it definitely looks like I was the one that did it) before the 
3.1.0 release. Would you prefer me to commit the HDFS-13582 fix to branch-3.1 
for 3.1.0, or simply revert this HDFS-13176 patch that should never have been 
included in the first place? Let me know and I'll gladly take care of it. Again 
- my apologies!

> WebHdfs file path gets truncated when having semicolon (;) inside
> -
>
> Key: HDFS-13176
> URL: https://issues.apache.org/jira/browse/HDFS-13176
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.0.0
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>Priority: Major
> Fix For: 2.10.0, 3.2.0
>
> Attachments: HDFS-13176-branch-2.01.patch, 
> HDFS-13176-branch-2.03.patch, HDFS-13176-branch-2.03.patch, 
> HDFS-13176-branch-2.03.patch, HDFS-13176-branch-2.03.patch, 
> HDFS-13176-branch-2.03.patch, HDFS-13176-branch-2.04.patch, 
> HDFS-13176-branch-2_yetus.log, HDFS-13176.01.patch, HDFS-13176.02.patch, 
> TestWebHdfsUrl.testWebHdfsSpecialCharacterFile.patch
>
>
> Find attached a patch having a test case that tries to reproduce the problem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13121) NPE when request file descriptors when SC read

2018-05-25 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16490768#comment-16490768
 ] 

Sean Mackrory commented on HDFS-13121:
--

I had an offline conversation with [~zvenczel] about the lack of tests - and 
testing this requires a pretty unreasonable level of refactoring and / or 
introducing new dependencies to do the mocking. One piece of feedback, though, 
is that I'd like to see a more helpful error message. The stack trace can show 
that one of those fields was null, so it would be good if the text we pass to 
the exception included something like "This is often because Hadoop has 
exceeded the allowed number of open file descriptors", hinting at the likely 
root cause and a possible solution.
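
A minimal sketch of the kind of check and message being suggested (hypothetical wording and placement, not the actual patch):

{code:java}
import java.io.FileInputStream;
import java.io.IOException;

public class FdSanityCheck {
  /**
   * Hypothetical check: fail fast with a hint at the likely root cause instead
   * of letting a null file descriptor surface later as a NullPointerException.
   */
  static void checkReceivedFds(FileInputStream[] fis, String blockName)
      throws IOException {
    if (fis == null || fis[0] == null || fis[1] == null) {
      throw new IOException("Failed to receive file descriptors for " + blockName
          + ". This is often because Hadoop has exceeded the allowed number"
          + " of open file descriptors.");
    }
  }
}
{code}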

> NPE when request file descriptors when SC read
> --
>
> Key: HDFS-13121
> URL: https://issues.apache.org/jira/browse/HDFS-13121
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 3.0.0
>Reporter: Gang Xie
>Assignee: Zsolt Venczel
>Priority: Minor
> Attachments: HDFS-13121.01.patch
>
>
> Recently, we hit an issue where the DFSClient throws an NPE. The case is that 
> the app process exceeds the limit on the maximum number of open files. In 
> that case, libhadoop never throws an exception but returns null for the 
> requested fds. But requestFileDescriptors uses the returned fds directly 
> without any check and then hits an NPE. 
>  
> We need to add a null check here.
>  
> private ShortCircuitReplicaInfo requestFileDescriptors(DomainPeer peer,
>     Slot slot) throws IOException {
>   ShortCircuitCache cache = clientContext.getShortCircuitCache();
>   final DataOutputStream out =
>       new DataOutputStream(new BufferedOutputStream(peer.getOutputStream()));
>   SlotId slotId = slot == null ? null : slot.getSlotId();
>   new Sender(out).requestShortCircuitFds(block, token, slotId, 1,
>       failureInjector.getSupportsReceiptVerification());
>   DataInputStream in = new DataInputStream(peer.getInputStream());
>   BlockOpResponseProto resp = BlockOpResponseProto.parseFrom(
>       PBHelperClient.vintPrefixed(in));
>   DomainSocket sock = peer.getDomainSocket();
>   failureInjector.injectRequestFileDescriptorsFailure();
>   switch (resp.getStatus()) {
>   case SUCCESS:
>     byte buf[] = new byte[1];
>     FileInputStream[] fis = new FileInputStream[2];
>     // Highlighted in the original report: the fis entries can come back null
>     // when the process has run out of file descriptors.
>     sock.recvFileInputStreams(fis, buf, 0, buf.length);
>     ShortCircuitReplica replica = null;
>     try {
>       ExtendedBlockId key =
>           new ExtendedBlockId(block.getBlockId(), block.getBlockPoolId());
>       if (buf[0] == USE_RECEIPT_VERIFICATION.getNumber()) {
>         LOG.trace("Sending receipt verification byte for slot {}", slot);
>         sock.getOutputStream().write(0);
>       }
>       // Highlighted in the original report: the returned fds are used here
>       // without any check, which leads to the NullPointerException.
>       replica = new ShortCircuitReplica(key, fis[0], fis[1], cache,
>           Time.monotonicNow(), slot);



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13582) Improve backward compatibility for HDFS-13176 (WebHdfs file path gets truncated when having semicolon (;) inside)

2018-05-25 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16490717#comment-16490717
 ] 

Sean Mackrory commented on HDFS-13582:
--

Thanks for fixing that checkstyle issue. I think this is the right approach for 
what is kind of a messy situation: essentially we only modify paths that would 
have failed before, and use the previous behavior for all other paths, so any 
incompatibility between versions is only modifying how an existing problem 
manifests.

 

+1 - will commit shortly.
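
A rough sketch of how I read the approach (a hypothetical helper, not the actual patch): paths that older servers already handled are left untouched, and only paths containing a character that would previously have been truncated get encoded.

{code:java}
public class WebHdfsPathEncoding {
  /**
   * Hypothetical sketch: only paths containing ';' were broken before, so only
   * those are changed; every other path keeps the previous, unencoded form and
   * stays fully compatible with old servers.
   */
  static String maybeEncode(String path) {
    if (path.indexOf(';') < 0) {
      return path;                     // previous behavior, unchanged
    }
    return path.replace(";", "%3B");   // percent-encode just the problematic character
  }
}
{code}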

> Improve backward compatibility for HDFS-13176 (WebHdfs file path gets 
> truncated when having semicolon (;) inside)
> -
>
> Key: HDFS-13582
> URL: https://issues.apache.org/jira/browse/HDFS-13582
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: HDFS-13582.01.patch, HDFS-13582.02.patch
>
>
> Encode special characters only if necessary in order to improve backward 
> compatibility in the following scenario:
> new (having HDFS-13176) WebHdfs client -> old (not having HDFS-13176) 
> WebHdfs server 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13176) WebHdfs file path gets truncated when having semicolon (;) inside

2018-04-06 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-13176:
-
   Resolution: Fixed
Fix Version/s: 3.2.0
   2.10.0
   Status: Resolved  (was: Patch Available)

> WebHdfs file path gets truncated when having semicolon (;) inside
> -
>
> Key: HDFS-13176
> URL: https://issues.apache.org/jira/browse/HDFS-13176
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.0.0
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>Priority: Major
> Fix For: 2.10.0, 3.2.0
>
> Attachments: HDFS-13176-branch-2.01.patch, 
> HDFS-13176-branch-2.03.patch, HDFS-13176-branch-2.03.patch, 
> HDFS-13176-branch-2.03.patch, HDFS-13176-branch-2.03.patch, 
> HDFS-13176-branch-2.03.patch, HDFS-13176-branch-2.04.patch, 
> HDFS-13176-branch-2_yetus.log, HDFS-13176.01.patch, HDFS-13176.02.patch, 
> TestWebHdfsUrl.testWebHdfsSpecialCharacterFile.patch
>
>
> Find attached a patch having a test case that tries to reproduce the problem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13176) WebHdfs file path gets truncated when having semicolon (;) inside

2018-04-06 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428344#comment-16428344
 ] 

Sean Mackrory commented on HDFS-13176:
--

+1 and committed to branch-2. Also ran Yetus, etc. myself.

> WebHdfs file path gets truncated when having semicolon (;) inside
> -
>
> Key: HDFS-13176
> URL: https://issues.apache.org/jira/browse/HDFS-13176
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.0.0
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>Priority: Major
> Attachments: HDFS-13176-branch-2.01.patch, 
> HDFS-13176-branch-2.03.patch, HDFS-13176-branch-2.03.patch, 
> HDFS-13176-branch-2.03.patch, HDFS-13176-branch-2.03.patch, 
> HDFS-13176-branch-2.03.patch, HDFS-13176-branch-2.04.patch, 
> HDFS-13176-branch-2_yetus.log, HDFS-13176.01.patch, HDFS-13176.02.patch, 
> TestWebHdfsUrl.testWebHdfsSpecialCharacterFile.patch
>
>
> Find attached a patch having a test case that tries to reproduce the problem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13176) WebHdfs file path gets truncated when having semicolon (;) inside

2018-03-07 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390173#comment-16390173
 ] 

Sean Mackrory commented on HDFS-13176:
--

I've pushed this to trunk. This would be good in branch-2, as well. It has a 
fairly trivial conflict though; if you want to post a patch applied to 
branch-2 I can commit it - I'm about to run out for an errand.

> WebHdfs file path gets truncated when having semicolon (;) inside
> -
>
> Key: HDFS-13176
> URL: https://issues.apache.org/jira/browse/HDFS-13176
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.0.0
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>Priority: Major
> Attachments: HDFS-13176.01.patch, HDFS-13176.02.patch, 
> TestWebHdfsUrl.testWebHdfsSpecialCharacterFile.patch
>
>
> Find attached a patch having a test case that tries to reproduce the problem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13176) WebHdfs file path gets truncated when having semicolon (;) inside

2018-03-06 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16388560#comment-16388560
 ] 

Sean Mackrory commented on HDFS-13176:
--

In light of other ASCII / Unicode characters being legal, I added everything I 
could from a standard keyboard to the test, and it still passes. Sound good to 
you, [~zvenczel]?
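
A sketch of the sort of check being described, assuming a running cluster with WebHDFS enabled at a placeholder URI (the endpoint, path, and character set here are illustrative, not the committed test):

{code:java}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WebHdfsSpecialCharSmokeTest {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Placeholder endpoint; point this at a real cluster with WebHDFS enabled.
    FileSystem fs = FileSystem.get(new URI("webhdfs://localhost:9870"), conf);

    // A sample of printable characters a path should survive round-tripping.
    Path file = new Path("/tmp/special;name+with()and[]_-=&");
    fs.create(file).close();
    if (!fs.exists(file)) {
      throw new AssertionError("Path with special characters was not preserved: " + file);
    }
    fs.delete(file, false);
    fs.close();
  }
}
{code}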

> WebHdfs file path gets truncated when having semicolon (;) inside
> -
>
> Key: HDFS-13176
> URL: https://issues.apache.org/jira/browse/HDFS-13176
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.0.0
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>Priority: Major
> Attachments: HDFS-13176.01.patch, HDFS-13176.02.patch, 
> TestWebHdfsUrl.testWebHdfsSpecialCharacterFile.patch
>
>
> Find attached a patch having a test case that tries to reproduce the problem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13176) WebHdfs file path gets truncated when having semicolon (;) inside

2018-03-06 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-13176:
-
Attachment: HDFS-13176.02.patch

> WebHdfs file path gets truncated when having semicolon (;) inside
> -
>
> Key: HDFS-13176
> URL: https://issues.apache.org/jira/browse/HDFS-13176
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.0.0
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>Priority: Major
> Attachments: HDFS-13176.01.patch, HDFS-13176.02.patch, 
> TestWebHdfsUrl.testWebHdfsSpecialCharacterFile.patch
>
>
> Find attached a patch having a test case that tries to reproduce the problem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-13176) WebHdfs file path gets truncated when having semicolon (;) inside

2018-03-06 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16388339#comment-16388339
 ] 

Sean Mackrory edited comment on HDFS-13176 at 3/6/18 9:13 PM:
--

Thanks to [~anu] for digging up the appropriate documentation of legal paths - 
I had missed it looking at HDFS-specific stuff, but the documentation is 
Common-wide, which makes sense. The gist is that ':' is indeed illegal, but all 
of the other characters you're testing and the others I mentioned (that I 
didn't want to support) are supposed to work. +1 to your fix - I'll commit it 
but will give another day in case the folks you tagged want to chime in. We may 
as well add other characters that are supposed to work to verify they do and 
help keep it that way, too.

 

edit: for context, this is what Anu linked to on the mailing list thread: 
http://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-common/filesystem/introduction.html#Path_Names


was (Author: mackrorysd):
Thanks to [~anu] for digging up the appropriate documentation of legal paths - 
I had missed it looking at HDFS-specific stuff, but the documentation is 
Common-wide, which makes sense. The gist is that ':' is indeed illegal, but all 
of the other characters you're testing and the others I mentioned (that I 
didn't want to support) are supposed to work. +1 to your fix - I'll commit it 
but will give another day in case the folks you tagged want to chime in. We may 
as well add other characters that are supposed to work to verify they do and 
help keep it that way, too.

> WebHdfs file path gets truncated when having semicolon (;) inside
> -
>
> Key: HDFS-13176
> URL: https://issues.apache.org/jira/browse/HDFS-13176
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.0.0
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>Priority: Major
> Attachments: HDFS-13176.01.patch, 
> TestWebHdfsUrl.testWebHdfsSpecialCharacterFile.patch
>
>
> Find attached a patch having a test case that tries to reproduce the problem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13176) WebHdfs file path gets truncated when having semicolon (;) inside

2018-03-06 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16388339#comment-16388339
 ] 

Sean Mackrory commented on HDFS-13176:
--

Thanks to [~anu] for digging up the appropriate documentation of legal paths - 
I had missed it looking at HDFS-specific stuff, but the documentation is 
Common-wide, which makes sense. The gist is that ':' is indeed illegal, but all 
of the other characters you're testing and the others I mentioned (that I 
didn't want to support) are supposed to work. +1 to your fix - I'll commit it 
but will give another day in case the folks you tagged want to chime in. We may 
as well add other characters that are supposed to work to verify they do and 
help keep it that way, too.

> WebHdfs file path gets truncated when having semicolon (;) inside
> -
>
> Key: HDFS-13176
> URL: https://issues.apache.org/jira/browse/HDFS-13176
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.0.0
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>Priority: Major
> Attachments: HDFS-13176.01.patch, 
> TestWebHdfsUrl.testWebHdfsSpecialCharacterFile.patch
>
>
> Find attached a patch having a test case that tries to reproduce the problem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-13176) WebHdfs file path gets truncated when having semicolon (;) inside

2018-03-05 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386917#comment-16386917
 ] 

Sean Mackrory edited comment on HDFS-13176 at 3/5/18 11:15 PM:
---

At first glance, I love that this is addressing URL-specific weirdness in 
WebHDFS specifically and isn't modifying Path for the whole world. I'll do a 
deeper code review and make sure I do agree with everything, but I strongly 
suspect this is indeed the right way to solve this.

One of the outcomes of this Jira I'd love to see is that we do finally clarify 
which paths are legal paths. ?, %, and \ make me nervous for the potential for 
them to be misinterpreted by related tools or future classes because of their 
role in URLs and Windows paths (which is exactly why : wouldn't work). There 
are a few other characters that I would personally avoid because of their role 
in scripts (I would have similar concerns about ~, *, `, @, !, $), etc, but I'd 
feel more comfortable agreeing to support the following subset of what you're 
currently testing: \"()[]_-=&+;{}#. Anyone else have thoughts along these lines?


was (Author: mackrorysd):
At first glance, I love that this is addressing URL-specific weirdness in 
WebHDFS specifically and isn't modifying Path for the whole world. I'll do a 
deeper code review and make sure I do agree with everything, but I strongly 
suspect this is indeed the right way to solve this.

One of the outcomes of this Jira I'd love to see is that we do finally clarify 
which paths are legal paths. ?, %, and \\ make me nervous for the potential for 
them to be misinterpreted by related tools or future classes because of their 
role in URLs and Windows paths (which is exactly why : wouldn't work). There 
are a few other characters that I would personally avoid because of their role 
in scripts (I would have similar concerns about ~, *, `, @, !, $), etc, but I'd 
feel more comfortable agreeing to support the following subset of what you're 
currently testing: \"()[]_-=&+;{}#. Anyone else have thoughts along these lines?

> WebHdfs file path gets truncated when having semicolon (;) inside
> -
>
> Key: HDFS-13176
> URL: https://issues.apache.org/jira/browse/HDFS-13176
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.0.0
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>Priority: Major
> Attachments: HDFS-13176.01.patch, 
> TestWebHdfsUrl.testWebHdfsSpecialCharacterFile.patch
>
>
> Find attached a patch having a test case that tries to reproduce the problem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13176) WebHdfs file path gets truncated when having semicolon (;) inside

2018-03-05 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386917#comment-16386917
 ] 

Sean Mackrory commented on HDFS-13176:
--

At first glance, I love that this is addressing URL-specific weirdness in 
WebHDFS specifically and isn't modifying Path for the whole world. I'll do a 
deeper code review and make sure I do agree with everything, but I strongly 
suspect this is indeed the right way to solve this.

One of the outcomes of this Jira I'd love to see is that we do finally clarify 
which paths are legal paths. ?, %, and \\ make me nervous for the potential for 
them to be misinterpreted by related tools or future classes because of their 
role in URLs and Windows paths (which is exactly why : wouldn't work). There 
are a few other characters that I would personally avoid because of their role 
in scripts (I would have similar concerns about ~, *, `, @, !, $), etc, but I'd 
feel more comfortable agreeing to support the following subset of what you're 
currently testing: \"()[]_-=&+;{}#. Anyone else have thoughts along these lines?

> WebHdfs file path gets truncated when having semicolon (;) inside
> -
>
> Key: HDFS-13176
> URL: https://issues.apache.org/jira/browse/HDFS-13176
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.0.0
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>Priority: Major
> Attachments: HDFS-13176.01.patch, 
> TestWebHdfsUrl.testWebHdfsSpecialCharacterFile.patch
>
>
> Find attached a patch having a test case that tries to reproduce the problem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13176) WebHdfs file path gets truncated when having semicolon (;) inside

2018-02-21 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371738#comment-16371738
 ] 

Sean Mackrory commented on HDFS-13176:
--

If we can fix this without breaking anything else, great, because we know there 
are people doing this. Personally, I would avoid using characters like this at 
all costs because of how many parsing problems they might cause in scripts, 
encodings, whatever. But the fact is I don't think we've ever been clear about 
what makes a legal path. We often compare it to POSIX, which I think allows 
anything printable except that / has a special meaning, but paths also need to 
work in browsers as URLs, which may make for some interesting rules. I've 
had a look through a few sources and I don't see any definition of what should 
work - and we should probably come up with a set definition of what paths are 
legal and supported.

> WebHdfs file path gets truncated when having semicolon (;) inside
> -
>
> Key: HDFS-13176
> URL: https://issues.apache.org/jira/browse/HDFS-13176
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.0.0
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>Priority: Major
> Attachments: TestWebHdfsUrl.testWebHdfsSpecialCharacterFile.patch
>
>
> Find attached a patch having a test case that tries to reproduce the problem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13135) Lease not deleted when deleting INodeReference

2018-02-12 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361465#comment-16361465
 ] 

Sean Mackrory commented on HDFS-13135:
--

So this test more or less reproduces what I was seeing. I'm still trying to get 
more info about the workload that did this, because it seems insane, but the 
same RPC client ID was appending to a file, and then deleting it, and upon 
starting back up we got a NullPointerException because there was a lease for an 
inode that didn't exist anymore.

I'm uncertain about whether or not the fix is correct here: a lot of the code 
this is calling is completely new to me, so it's entirely possible there are 
side effects I haven't considered (like whether this causes data that is still 
needed for the s0 snapshot, and therefore should not be cleaned up, to get 
deleted).
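
For readers following along, here is a rough repro sketch against a MiniDFSCluster (the snapshot name comes from the discussion above; the exact customer workload is still unclear, so treat this as an approximation rather than the committed test):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.MiniDFSCluster;

public class LeaseOnDeletedReferenceRepro {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
    try {
      DistributedFileSystem fs = cluster.getFileSystem();
      Path dir = new Path("/dir");
      Path file = new Path(dir, "file");
      fs.mkdirs(dir);
      fs.allowSnapshot(dir);
      fs.create(file).close();

      // Snapshot the directory, then re-open the file so a lease is held.
      fs.createSnapshot(dir, "s0");
      FSDataOutputStream out = fs.append(file);
      out.write(1);
      out.hflush();

      // Deleting the file while it is referenced by the snapshot leaves an
      // INodeReference behind; the lease on it was not being reclaimed.
      fs.delete(file, false);

      // Restarting the NameNode is where the NullPointerException showed up.
      cluster.restartNameNode(true);
    } finally {
      cluster.shutdown();
    }
  }
}
{code}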

> Lease not deleted when deleting INodeReference
> --
>
> Key: HDFS-13135
> URL: https://issues.apache.org/jira/browse/HDFS-13135
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
>Priority: Major
> Attachments: HDFS-13135.001.patch
>
>
> In troubleshooting an occurrence of HDFS-13115, it seemed that there was 
> another underlying root cause that should also be addressed. There was an 
> INodeReference that was deleted and the lease on it was not subsequently 
> deleted because it was never added to the reclaim context.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13135) Lease not deleted when deleting INodeReference

2018-02-12 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-13135:
-
Attachment: HDFS-13135.001.patch

> Lease not deleted when deleting INodeReference
> --
>
> Key: HDFS-13135
> URL: https://issues.apache.org/jira/browse/HDFS-13135
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
>Priority: Major
> Attachments: HDFS-13135.001.patch
>
>
> In troubleshooting an occurrence of HDFS-13115, it seemed that there was 
> another underlying root cause that should also be addressed. There was an 
> INodeReference that was deleted and the lease on it was not subsequently 
> deleted because it was never added to the reclaim context.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-13135) Lease not deleted when deleting INodeReference

2018-02-12 Thread Sean Mackrory (JIRA)
Sean Mackrory created HDFS-13135:


 Summary: Lease not deleted when deleting INodeReference
 Key: HDFS-13135
 URL: https://issues.apache.org/jira/browse/HDFS-13135
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Sean Mackrory
Assignee: Sean Mackrory


In troubleshooting an occurrence of HDFS-13115, it seemed that there was 
another underlying root cause that should also be addressed. There was an 
INodeReference that was deleted and the lease on it was not subsequently 
deleted because it was never added to the reclaim context.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13106) Need to exercise all HDFS APIs for EC

2018-02-06 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-13106:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Need to exercise all HDFS APIs for EC
> -
>
> Key: HDFS-13106
> URL: https://issues.apache.org/jira/browse/HDFS-13106
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: hdfs
>Affects Versions: 3.0.0
>Reporter: Haibo Yan
>Assignee: Haibo Yan
>Priority: Major
> Attachments: HDFS-13106.001.patch, HDFS-13106.002.patch, 
> HDFS-13106.003.patch
>
>
> Exercise the FileSystem API to make sure all APIs work as expected with the 
> Erasure Coding feature enabled



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13106) Need to exercise all HDFS APIs for EC

2018-02-06 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16354767#comment-16354767
 ] 

Sean Mackrory commented on HDFS-13106:
--

Committed to trunk.

> Need to exercise all HDFS APIs for EC
> -
>
> Key: HDFS-13106
> URL: https://issues.apache.org/jira/browse/HDFS-13106
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: hdfs
>Affects Versions: 3.0.0
>Reporter: Haibo Yan
>Assignee: Haibo Yan
>Priority: Major
> Attachments: HDFS-13106.001.patch, HDFS-13106.002.patch, 
> HDFS-13106.003.patch
>
>
> Exercise the FileSystem API to make sure all APIs work as expected with the 
> Erasure Coding feature enabled



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13106) Need to exercise all HDFS APIs for EC

2018-02-06 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16354728#comment-16354728
 ] 

Sean Mackrory commented on HDFS-13106:
--

+1 from me too. Unit test failures are unrelated, new tests work for me locally 
too. Can commit shortly.

> Need to exercise all HDFS APIs for EC
> -
>
> Key: HDFS-13106
> URL: https://issues.apache.org/jira/browse/HDFS-13106
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: hdfs
>Affects Versions: 3.0.0
>Reporter: Haibo Yan
>Assignee: Haibo Yan
>Priority: Major
> Attachments: HDFS-13106.001.patch, HDFS-13106.002.patch, 
> HDFS-13106.003.patch
>
>
> Exercise the FileSystem API to make sure all APIs work as expected with the 
> Erasure Coding feature enabled



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13106) Need to exercise all HDFS APIs for EC

2018-02-05 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16353187#comment-16353187
 ] 

Sean Mackrory commented on HDFS-13106:
--

The HDFS test failures are unrelated: your patch only adds tests, and those tests 
aren't the ones failing. We should address the checkstyle issues, though.

> Need to exercise all HDFS APIs for EC
> -
>
> Key: HDFS-13106
> URL: https://issues.apache.org/jira/browse/HDFS-13106
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: hdfs
>Affects Versions: 3.0.0
>Reporter: Haibo Yan
>Assignee: Haibo Yan
>Priority: Major
> Attachments: HDFS-13106.001.patch
>
>
> Exercise the FileSystem API to make sure all APIs work as expected with the 
> Erasure Coding feature enabled



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13106) Need to exercise all HDFS APIs for EC

2018-02-05 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16353174#comment-16353174
 ] 

Sean Mackrory commented on HDFS-13106:
--

Looks good to me. We need a clean Yetus run, though - I suspect it needs your 
patch to be named HDFS-13106*.001*.patch. I'm also not much of an EC expert, so 
I wouldn't notice if there were edge cases that deserved special attention here, 
but I think this is safe enough to commit once we have a clean Yetus run.

> Need to exercise all HDFS APIs for EC
> -
>
> Key: HDFS-13106
> URL: https://issues.apache.org/jira/browse/HDFS-13106
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: hdfs
>Affects Versions: 3.0.0
>Reporter: Haibo Yan
>Assignee: Haibo Yan
>Priority: Major
> Attachments: HDFS-13106.001.patch
>
>
> Exercise the FileSystem API to make sure all APIs work as expected with the 
> Erasure Coding feature enabled



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12913) TestDNFencingWithReplication.testFencingStress fix mini cluster not yet active issue

2018-01-04 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-12913:
-
   Resolution: Fixed
Fix Version/s: 3.0.1
   3.1.0
   Status: Resolved  (was: Patch Available)

> TestDNFencingWithReplication.testFencingStress fix mini cluster not yet 
> active issue
> 
>
> Key: HDFS-12913
> URL: https://issues.apache.org/jira/browse/HDFS-12913
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>  Labels: flaky-test
> Fix For: 3.1.0, 3.0.1
>
> Attachments: HDFS-12913.01.patch, HDFS-12913.02.patch
>
>
> Once in every 5000 test runs, the following issue happens:
> {code}
> 2017-12-11 10:33:09 [INFO] 
> 2017-12-11 10:33:09 [INFO] 
> ---
> 2017-12-11 10:33:09 [INFO]  T E S T S
> 2017-12-11 10:33:09 [INFO] 
> ---
> 2017-12-11 10:33:09 [INFO] Running 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication
> 2017-12-11 10:37:32 [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, 
> Time elapsed: 262.641 s <<< FAILURE! - in 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication
> 2017-12-11 10:37:32 [ERROR] 
> testFencingStress(org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication)
>   Time elapsed: 262.477 s  <<< ERROR!
> 2017-12-11 10:37:32 java.lang.RuntimeException: Deferred
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.test.MultithreadedTestUtil$TestContext.checkException(MultithreadedTestUtil.java:130)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.test.MultithreadedTestUtil$TestContext.stop(MultithreadedTestUtil.java:166)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication.testFencingStress(TestDNFencingWithReplication.java:137)
> 2017-12-11 10:37:32   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> 2017-12-11 10:37:32   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2017-12-11 10:37:32   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2017-12-11 10:37:32   at java.lang.reflect.Method.invoke(Method.java:498)
> 2017-12-11 10:37:32   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> 2017-12-11 10:37:32   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 2017-12-11 10:37:32   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> 2017-12-11 10:37:32   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 2017-12-11 10:37:32   at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
> 2017-12-11 10:37:32   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> 2017-12-11 10:37:32   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
> 2017-12-11 10:37:32   at 
> org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> 2017-12-11 10:37:32   at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> 2017-12-11 10:37:32   at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> 2017-12-11 10:37:32   at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> 2017-12-11 10:37:32   at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> 2017-12-11 10:37:32   at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:369)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:275)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:239)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:160)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:373)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:334)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:119)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:407)
> 2017-12-11 10:37:32 Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported 

[jira] [Commented] (HDFS-12913) TestDNFencingWithReplication.testFencingStress fix mini cluster not yet active issue

2018-01-04 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16311468#comment-16311468
 ] 

Sean Mackrory commented on HDFS-12913:
--

+1 - will merge today.

> TestDNFencingWithReplication.testFencingStress fix mini cluster not yet 
> active issue
> 
>
> Key: HDFS-12913
> URL: https://issues.apache.org/jira/browse/HDFS-12913
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>  Labels: flaky-test
> Attachments: HDFS-12913.01.patch, HDFS-12913.02.patch
>
>
> Once in every 5000 test runs, the following issue happens:
> {code}
> 2017-12-11 10:33:09 [INFO] 
> 2017-12-11 10:33:09 [INFO] 
> ---
> 2017-12-11 10:33:09 [INFO]  T E S T S
> 2017-12-11 10:33:09 [INFO] 
> ---
> 2017-12-11 10:33:09 [INFO] Running 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication
> 2017-12-11 10:37:32 [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, 
> Time elapsed: 262.641 s <<< FAILURE! - in 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication
> 2017-12-11 10:37:32 [ERROR] 
> testFencingStress(org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication)
>   Time elapsed: 262.477 s  <<< ERROR!
> 2017-12-11 10:37:32 java.lang.RuntimeException: Deferred
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.test.MultithreadedTestUtil$TestContext.checkException(MultithreadedTestUtil.java:130)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.test.MultithreadedTestUtil$TestContext.stop(MultithreadedTestUtil.java:166)
> 2017-12-11 10:37:32   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication.testFencingStress(TestDNFencingWithReplication.java:137)
> 2017-12-11 10:37:32   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> 2017-12-11 10:37:32   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2017-12-11 10:37:32   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2017-12-11 10:37:32   at java.lang.reflect.Method.invoke(Method.java:498)
> 2017-12-11 10:37:32   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> 2017-12-11 10:37:32   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 2017-12-11 10:37:32   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> 2017-12-11 10:37:32   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 2017-12-11 10:37:32   at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
> 2017-12-11 10:37:32   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> 2017-12-11 10:37:32   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
> 2017-12-11 10:37:32   at 
> org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> 2017-12-11 10:37:32   at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> 2017-12-11 10:37:32   at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> 2017-12-11 10:37:32   at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> 2017-12-11 10:37:32   at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> 2017-12-11 10:37:32   at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:369)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:275)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:239)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:160)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:373)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:334)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:119)
> 2017-12-11 10:37:32   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:407)
> 2017-12-11 10:37:32 Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
> 2017-12-11 10:37:32   at 
> 

[jira] [Commented] (HDFS-12891) Do not invalidate blocks if toInvalidate is empty

2017-12-12 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16287773#comment-16287773
 ] 

Sean Mackrory commented on HDFS-12891:
--

I see I'm late to the party, but I started testing this the other day and can 
confirm I was seeing approximately 1 in 30 runs fail and that this fixes it. +1 
to the patch as committed.

> Do not invalidate blocks if toInvalidate is empty
> -
>
> Key: HDFS-12891
> URL: https://issues.apache.org/jira/browse/HDFS-12891
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>  Labels: flaky-test
> Fix For: 3.1.0, 3.0.1
>
> Attachments: HDFS-12891.01.patch, HDFS-12891.02.patch
>
>
> {code:java}
> java.lang.AssertionError: Test resulted in an unexpected exit
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication.testFencingStress(TestDNFencingWithReplication.java:147)
> :
> :
> 2017-10-19 21:39:40,068 [main] INFO  hdfs.MiniDFSCluster 
> (MiniDFSCluster.java:shutdown(1965)) - Shutting down the Mini HDFS Cluster
> 2017-10-19 21:39:40,068 [main] FATAL hdfs.MiniDFSCluster 
> (MiniDFSCluster.java:shutdown(1968)) - Test resulted in an unexpected exit
> 1: java.lang.AssertionError
>   at org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:265)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$RedundancyMonitor.run(BlockManager.java:4437)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.AssertionError
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor.addBlocksToBeInvalidated(DatanodeDescriptor.java:641)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.InvalidateBlocks.invalidateWork(InvalidateBlocks.java:299)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.invalidateWorkForOneNode(BlockManager.java:4246)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeInvalidateWork(BlockManager.java:1736)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:4561)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$RedundancyMonitor.run(BlockManager.java:4418)
>   ... 1 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-12892) TestClusterTopology#testChooseRandom fails intermittently

2017-12-05 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory reassigned HDFS-12892:


Assignee: Zsolt Venczel

> TestClusterTopology#testChooseRandom fails intermittently
> -
>
> Key: HDFS-12892
> URL: https://issues.apache.org/jira/browse/HDFS-12892
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>  Labels: flaky-test
>
> Flaky test failure:
> {code:java}
> java.lang.AssertionError
> Error
> Not choosing nodes randomly
> Stack Trace
> java.lang.AssertionError: Not choosing nodes randomly
> at 
> org.apache.hadoop.net.TestClusterTopology.testChooseRandom(TestClusterTopology.java:170)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10702) Add a Client API and Proxy Provider to enable stale read from Standby

2017-12-05 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16278900#comment-16278900
 ] 

Sean Mackrory commented on HDFS-10702:
--

Any other comments on the consistency issue, [~mingma] / [~zhz]? I'd like to 
finish this up soon if there aren't any other concerns. The configuration idea 
in my last comment can be addressed later without causing any problems - I'd 
rather just keep this patch small until we can come to consensus on what's 
already in it.

> Add a Client API and Proxy Provider to enable stale read from Standby
> -
>
> Key: HDFS-10702
> URL: https://issues.apache.org/jira/browse/HDFS-10702
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jiayi Zhou
>Assignee: Sean Mackrory
>Priority: Minor
> Attachments: HDFS-10702.001.patch, HDFS-10702.002.patch, 
> HDFS-10702.003.patch, HDFS-10702.004.patch, HDFS-10702.005.patch, 
> HDFS-10702.006.patch, HDFS-10702.007.patch, HDFS-10702.008.patch, 
> StaleReadfromStandbyNN.pdf
>
>
> Currently, clients must always talk to the active NameNode when performing 
> any metadata operation, which means the active NameNode could be a bottleneck 
> for scalability. One way to solve this problem is to send read-only operations 
> to the Standby NameNode. The disadvantage is that it might be a stale read. 
> Here, I'm thinking of adding a Client API to enable/disable stale reads from 
> the Standby, which gives the client the power to set the staleness restriction.
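
A purely illustrative sketch of the idea in the description above - the 
StaleReadClient type and both of its methods are hypothetical names made up for 
this example, not the API proposed in the attached patches:

{code:java}
// Everything here is hypothetical: StaleReadClient and its methods are names
// invented for this example, not the API proposed in the attached patches.
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

interface StaleReadClient {
  void enableStaleRead(long maxStalenessMs); // allow read-only calls to go to the Standby
  void disableStaleRead();                   // route everything back to the Active
}

class StaleReadSketch {
  static void example(StaleReadClient client, FileSystem fs, Path path) throws IOException {
    client.enableStaleRead(30_000L);  // tolerate reads that are up to 30s stale
    fs.getFileStatus(path);           // read-only op, may now be served by the Standby
    client.disableStaleRead();        // subsequent reads are strongly consistent again
  }
}
{code}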



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11096) Support rolling upgrade between 2.x and 3.x

2017-11-09 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246667#comment-16246667
 ] 

Sean Mackrory commented on HDFS-11096:
--

Note that the fix to the sbin/start-yarn.sh shell script has now been committed 
separately as YARN-7465. I'm working with [~rchiang] to figure out why YARN is 
failing after the rolling upgrade - it works in both Hadoop 2 and Hadoop 3 in 
the clusters used for the distcp test - it's only after the upgrade...

> Support rolling upgrade between 2.x and 3.x
> ---
>
> Key: HDFS-11096
> URL: https://issues.apache.org/jira/browse/HDFS-11096
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Sean Mackrory
>Priority: Blocker
> Attachments: HDFS-11096.001.patch, HDFS-11096.002.patch, 
> HDFS-11096.003.patch, HDFS-11096.004.patch, HDFS-11096.005.patch, 
> HDFS-11096.006.patch, HDFS-11096.007.patch
>
>
> trunk has a minimum software version of 3.0.0-alpha1. This means we can't 
> rolling upgrade between branch-2 and trunk.
> This is a showstopper for large deployments. Unless there are very compelling 
> reasons to break compatibility, let's restore the ability to rolling upgrade 
> to 3.x releases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11096) Support rolling upgrade between 2.x and 3.x

2017-11-01 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234197#comment-16234197
 ] 

Sean Mackrory commented on HDFS-11096:
--

From an HDFS standpoint, definitely - I've run many successful rolling upgrade 
and distcp-over-webhdfs tests this week and updated the patch. The only thing 
remaining is to get automation itself in place after this is committed.

I looked into the YARN issues. I'm still seeing very similar symptoms to the 
YARN-6457 issue mentioned above in both branch-3.0 and trunk. In trunk I'm also 
seeing this:

{quote}
17/10/31 23:05:49 INFO security.AMRMTokenSecretManager: Creating password for 
appattempt_1509490231144_0628_02
17/10/31 23:05:49 INFO amlauncher.AMLauncher: Error launching 
appattempt_1509490231144_0628_02. Got exception: 
org.apache.hadoop.security.token.SecretManager$InvalidToken: Invalid container 
token used for starting container on : container-5.docker:35151
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.verifyAndGetContainerTokenIdentifier(ContainerManagerImpl.java:974)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.startContainers(ContainerManagerImpl.java:789)
at 
org.apache.hadoop.yarn.api.impl.pb.service.ContainerManagementProtocolPBServiceImpl.startContainers(ContainerManagementProtocolPBServiceImpl.java:70)
at 
org.apache.hadoop.yarn.proto.ContainerManagementProtocol$ContainerManagementProtocolService$2.callBlockingMethod(ContainerManagementProtocol.java:127)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:447)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:845)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:788)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2455)

at sun.reflect.GeneratedConstructorAccessor70.newInstance(Unknown 
Source)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at 
org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
at 
org.apache.hadoop.yarn.ipc.RPCUtil.instantiateIOException(RPCUtil.java:80)
at 
org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:119)
at 
org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:131)
at sun.reflect.GeneratedMethodAccessor85.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy89.startContainers(Unknown Source)
at 
org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:123)
at 
org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:304)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
 Invalid container token used for starting container on : 
container-5.docker:35151
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.verifyAndGetContainerTokenIdentifier(ContainerManagerImpl.java:974)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.startContainers(ContainerManagerImpl.java:789)
at 
org.apache.hadoop.yarn.api.impl.pb.service.ContainerManagementProtocolPBServiceImpl.startContainers(ContainerManagementProtocolPBServiceImpl.java:70)
at 

[jira] [Comment Edited] (HDFS-11096) Support rolling upgrade between 2.x and 3.x

2017-11-01 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234197#comment-16234197
 ] 

Sean Mackrory edited comment on HDFS-11096 at 11/1/17 3:16 PM:
---

From an HDFS standpoint, definitely - I've run many successful rolling upgrade 
and distcp-over-webhdfs tests this week and updated the patch. The only thing 
remaining is to get automation itself in place after this is committed.

I looked into the YARN issues. I'm still seeing very similar symptoms to the 
YARN-6457 issue mentioned above in both branch-3.0 and trunk. In trunk I'm also 
seeing this:

{code}
17/10/31 23:05:49 INFO security.AMRMTokenSecretManager: Creating password for 
appattempt_1509490231144_0628_02
17/10/31 23:05:49 INFO amlauncher.AMLauncher: Error launching 
appattempt_1509490231144_0628_02. Got exception: 
org.apache.hadoop.security.token.SecretManager$InvalidToken: Invalid container 
token used for starting container on : container-5.docker:35151
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.verifyAndGetContainerTokenIdentifier(ContainerManagerImpl.java:974)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.startContainers(ContainerManagerImpl.java:789)
at 
org.apache.hadoop.yarn.api.impl.pb.service.ContainerManagementProtocolPBServiceImpl.startContainers(ContainerManagementProtocolPBServiceImpl.java:70)
at 
org.apache.hadoop.yarn.proto.ContainerManagementProtocol$ContainerManagementProtocolService$2.callBlockingMethod(ContainerManagementProtocol.java:127)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:447)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:845)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:788)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2455)

at sun.reflect.GeneratedConstructorAccessor70.newInstance(Unknown 
Source)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at 
org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
at 
org.apache.hadoop.yarn.ipc.RPCUtil.instantiateIOException(RPCUtil.java:80)
at 
org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:119)
at 
org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:131)
at sun.reflect.GeneratedMethodAccessor85.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy89.startContainers(Unknown Source)
at 
org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:123)
at 
org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:304)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
 Invalid container token used for starting container on : 
container-5.docker:35151
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.verifyAndGetContainerTokenIdentifier(ContainerManagerImpl.java:974)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.startContainers(ContainerManagerImpl.java:789)
at 
org.apache.hadoop.yarn.api.impl.pb.service.ContainerManagementProtocolPBServiceImpl.startContainers(ContainerManagementProtocolPBServiceImpl.java:70)
at 

[jira] [Commented] (HDFS-206) Support for head in FSShell

2017-10-31 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226903#comment-16226903
 ] 

Sean Mackrory commented on HDFS-206:


Committed to trunk. Did you want to get this into any other versions too?
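
With the command in trunk, it can be exercised from Java through FsShell like any 
other shell command - a small, hedged sketch (the path is illustrative):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FsShell;
import org.apache.hadoop.util.ToolRunner;

public class HeadSketch {
  public static void main(String[] args) throws Exception {
    // Equivalent to "hadoop fs -head /tmp/example.txt": prints the beginning of the file.
    int rc = ToolRunner.run(new FsShell(new Configuration()),
        new String[] {"-head", "/tmp/example.txt"});
    System.exit(rc);
  }
}
{code}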

> Support for head in FSShell
> ---
>
> Key: HDFS-206
> URL: https://issues.apache.org/jira/browse/HDFS-206
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Olga Natkovich
>Assignee: Gabor Bota
>Priority: Minor
>  Labels: newbie
> Fix For: 3.1.0
>
> Attachments: HDFS-206.001.patch, HDFS-206.002.patch, 
> HDFS-206.003.patch
>
>
> For the Pig project, we would like to integrate head and tail commands into 
> our shell (Grunt). I could find tail but not a head command



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-206) Support for head in FSShell

2017-10-31 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-206:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Support for head in FSShell
> ---
>
> Key: HDFS-206
> URL: https://issues.apache.org/jira/browse/HDFS-206
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Olga Natkovich
>Assignee: Gabor Bota
>Priority: Minor
>  Labels: newbie
> Fix For: 3.1.0
>
> Attachments: HDFS-206.001.patch, HDFS-206.002.patch, 
> HDFS-206.003.patch
>
>
> For the Pig project, we would like to integrate head and tail commands into 
> our shell (Grunt). I could find tail but not a head command



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-206) Support for head in FSShell

2017-10-31 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-206:
---
Fix Version/s: 3.1.0

> Support for head in FSShell
> ---
>
> Key: HDFS-206
> URL: https://issues.apache.org/jira/browse/HDFS-206
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Olga Natkovich
>Assignee: Gabor Bota
>Priority: Minor
>  Labels: newbie
> Fix For: 3.1.0
>
> Attachments: HDFS-206.001.patch, HDFS-206.002.patch, 
> HDFS-206.003.patch
>
>
> For the Pig project, we would like to integrate head and tail commands into 
> our shell (Grunt). I could find tail but not a head command



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-206) Support for head in FSShell

2017-10-31 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226885#comment-16226885
 ] 

Sean Mackrory commented on HDFS-206:


Thanks for resubmitting - I tried manually running the Jenkins job last night 
and it failed because some processes were killed...

The test failures do look very unrelated - +1, committing...

> Support for head in FSShell
> ---
>
> Key: HDFS-206
> URL: https://issues.apache.org/jira/browse/HDFS-206
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Olga Natkovich
>Assignee: Gabor Bota
>Priority: Minor
>  Labels: newbie
> Attachments: HDFS-206.001.patch, HDFS-206.002.patch, 
> HDFS-206.003.patch
>
>
> For the Pig project, we would like to integrate head and tail commands into 
> our shell (Grunt). I could find tail but not a head command



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-7878) API - expose an unique file identifier

2017-10-30 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225910#comment-16225910
 ] 

Sean Mackrory commented on HDFS-7878:
-

No other feedback from me - +1 on the bits I understand, like Steve :)

> API - expose an unique file identifier
> --
>
> Key: HDFS-7878
> URL: https://issues.apache.org/jira/browse/HDFS-7878
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: BB2015-05-TBR
> Attachments: HDFS-7878.01.patch, HDFS-7878.02.patch, 
> HDFS-7878.03.patch, HDFS-7878.04.patch, HDFS-7878.05.patch, 
> HDFS-7878.06.patch, HDFS-7878.07.patch, HDFS-7878.08.patch, 
> HDFS-7878.09.patch, HDFS-7878.10.patch, HDFS-7878.11.patch, 
> HDFS-7878.12.patch, HDFS-7878.13.patch, HDFS-7878.14.patch, 
> HDFS-7878.15.patch, HDFS-7878.16.patch, HDFS-7878.17.patch, 
> HDFS-7878.18.patch, HDFS-7878.19.patch, HDFS-7878.20.patch, 
> HDFS-7878.21.patch, HDFS-7878.patch
>
>
> See HDFS-487.
> Even though that is resolved as duplicate, the ID is actually not exposed by 
> the JIRA it supposedly duplicates.
> INode ID for the file should be easy to expose; alternatively ID could be 
> derived from block IDs, to account for appends...
> This is useful e.g. for cache key by file, to make sure cache stays correct 
> when file is overwritten.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-206) Support for head in FSShell

2017-10-30 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225886#comment-16225886
 ] 

Sean Mackrory commented on HDFS-206:


Yetus never reviewed your 3rd patch, and Jenkins now appears to be getting 
restarted. I'll check back tomorrow and follow-up if Yetus doesn't review it 
overnight.

> Support for head in FSShell
> ---
>
> Key: HDFS-206
> URL: https://issues.apache.org/jira/browse/HDFS-206
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Olga Natkovich
>Assignee: Gabor Bota
>Priority: Minor
>  Labels: newbie
> Attachments: HDFS-206.001.patch, HDFS-206.002.patch, 
> HDFS-206.003.patch
>
>
> For the Pig project, we would like to integrate head and tail commands into 
> our shell (Grunt). I could find tail but not a head command



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-206) Support for head in FSShell

2017-10-30 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225278#comment-16225278
 ] 

Sean Mackrory commented on HDFS-206:


Ah - you know what? I didn't see that the last argument passed to copyBytes was 
to close the file after the copy. Still, I like the idea of having a resource 
closed at the end of the same function it was created in, so that its lifecycle 
is easy to see - which is why I like the .003 patch. +1 - I'll commit your .003 
patch later today unless anyone else weighs in.
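
For anyone skimming the thread, a minimal sketch of the two patterns being 
weighed here - illustrative only, not the patch code itself:

{code:java}
import java.io.InputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class CopyBytesSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path p = new Path("/tmp/example.txt"); // illustrative path

    // Pattern 1: the last argument to copyBytes (close=true) closes the stream for us.
    InputStream in1 = fs.open(p);
    IOUtils.copyBytes(in1, System.out, 4096, true);

    // Pattern 2 (roughly what the .003 patch goes for): pass close=false and close
    // the stream in the same method that opened it, so its lifecycle stays visible.
    FSDataInputStream in2 = fs.open(p);
    try {
      IOUtils.copyBytes(in2, System.out, 4096, false);
    } finally {
      IOUtils.closeStream(in2);
    }
  }
}
{code}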

> Support for head in FSShell
> ---
>
> Key: HDFS-206
> URL: https://issues.apache.org/jira/browse/HDFS-206
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Olga Natkovich
>Assignee: Gabor Bota
>Priority: Minor
>  Labels: newbie
> Attachments: HDFS-206.001.patch, HDFS-206.002.patch, 
> HDFS-206.003.patch
>
>
> For the Pig project, we would like to integrate head and tail commands into 
> our shell (Grunt). I could find tail but not a head command



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11096) Support rolling upgrade between 2.x and 3.x

2017-10-30 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225110#comment-16225110
 ] 

Sean Mackrory commented on HDFS-11096:
--

Re: test4tests, the whole patch is tests. We should figure out how to tell 
Yetus that these count as tests at some point.

Re: asflicense, those files are not involved with my patch.

> Support rolling upgrade between 2.x and 3.x
> ---
>
> Key: HDFS-11096
> URL: https://issues.apache.org/jira/browse/HDFS-11096
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Sean Mackrory
>Priority: Blocker
> Attachments: HDFS-11096.001.patch, HDFS-11096.002.patch, 
> HDFS-11096.003.patch, HDFS-11096.004.patch, HDFS-11096.005.patch, 
> HDFS-11096.006.patch, HDFS-11096.007.patch
>
>
> trunk has a minimum software version of 3.0.0-alpha1. This means we can't 
> rolling upgrade between branch-2 and trunk.
> This is a showstopper for large deployments. Unless there are very compelling 
> reasons to break compatibility, let's restore the ability to rolling upgrade 
> to 3.x releases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-206) Support for head in FSShell

2017-10-30 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225096#comment-16225096
 ] 

Sean Mackrory edited comment on HDFS-206 at 10/30/17 3:11 PM:
--

This looks pretty good to me. A couple of nitpicks:
* You're still documenting the -f option for head in USAGE, although none 
exists.
* We should have a finally \{\} to close the file. Unlikely to ever cause a 
problem in practice here, but good practice and easy enough to fix right now.

It's been too long for me to see the last test results, but we'll check again 
when you upload the next patch. Unless it was TestDFSShell failing I think it's 
extremely unlikely to be your patch that broke something.


was (Author: mackrorysd):
This looks pretty good to me. A couple of nitpicks:
* You're still documenting the -f option for head in USAGE, although none 
exists.
* We should have a finally \{\} to close the file. Unlikely to ever cause a 
problem in practice here, but good practice and easy enough to fix right now.
It's been too long for me to see the last test results, but we'll check again 
when you upload the next patch. Unless it was TestDFSShell failing I think it's 
extremely unlikely to be your patch that broke something.

> Support for head in FSShell
> ---
>
> Key: HDFS-206
> URL: https://issues.apache.org/jira/browse/HDFS-206
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Olga Natkovich
>Assignee: Gabor Bota
>Priority: Minor
>  Labels: newbie
> Attachments: HDFS-206.001.patch, HDFS-206.002.patch
>
>
> For the Pig project, we would like to integrate head and tail commands into 
> our shell (Grunt). I could find tail but not a head command



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-206) Support for head in FSShell

2017-10-30 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225096#comment-16225096
 ] 

Sean Mackrory commented on HDFS-206:


This looks pretty good to me. A couple of nitpicks:
* You're still documenting the -f option for head in USAGE, although none 
exists.
* We should have a finally \{\} to close the file. Unlikely to ever cause a 
problem in practice here, but good practice and easy enough to fix right now.
It's been too long for me to see the last test results, but we'll check again 
when you upload the next patch. Unless it was TestDFSShell failing I think it's 
extremely unlikely to be your patch that broke something.

> Support for head in FSShell
> ---
>
> Key: HDFS-206
> URL: https://issues.apache.org/jira/browse/HDFS-206
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Olga Natkovich
>Assignee: Gabor Bota
>Priority: Minor
>  Labels: newbie
> Attachments: HDFS-206.001.patch, HDFS-206.002.patch
>
>
> For the Pig project, we would like to integrate head and tail commands into 
> our shell (Grunt). I could find tail but not a head command



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11096) Support rolling upgrade between 2.x and 3.x

2017-10-27 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-11096:
-
Attachment: HDFS-11096.007.patch

Okay - I'm now pretty happy with how this is working. I saw the last shellcheck 
problems locally, and have fixed those, too. I've had several successful test 
runs of both of the Docker tests in the last few days, and this is looking 
pretty reliable to me:

* Versions to test are now specified via CLI args to the Docker scripts. That 
way this only has to change in code when there's a bug to fix or other 
improvement to make: Jenkins jobs can be updated for various version 
combinations independently.
* Fixing more ZK timeouts, this time in YARN. I've disabled the YARN rolling 
upgrade as that appears to be troublesome again. But the HDFS upgrade is 
working and YARN / MR is working well during and after that upgrade. I'll keep 
troubleshooting the YARN side, but that can be a separate JIRA.
* Logs are now saved to ./logs/ back on the host to facilitate more debugging 
after the Docker images have been destroyed in the event of a failure.

Although I've made a number of fixes as documented in the comments, not much 
has changed that would invalidate the value of previous code reviews, IMO. 
[~aw] - have I addressed the issues you pointed out to your satisfaction?

> Support rolling upgrade between 2.x and 3.x
> ---
>
> Key: HDFS-11096
> URL: https://issues.apache.org/jira/browse/HDFS-11096
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Sean Mackrory
>Priority: Blocker
> Attachments: HDFS-11096.001.patch, HDFS-11096.002.patch, 
> HDFS-11096.003.patch, HDFS-11096.004.patch, HDFS-11096.005.patch, 
> HDFS-11096.006.patch, HDFS-11096.007.patch
>
>
> trunk has a minimum software version of 3.0.0-alpha1. This means we can't 
> rolling upgrade between branch-2 and trunk.
> This is a showstopper for large deployments. Unless there are very compelling 
> reasons to break compatibility, let's restore the ability to rolling upgrade 
> to 3.x releases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-7878) API - expose an unique file identifier

2017-10-26 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16220898#comment-16220898
 ] 

Sean Mackrory commented on HDFS-7878:
-

{quote}Wire compatibility with HDFS{quote}
How do commented-out lines ensure wire compatibility? It would make sense if 
these were obsolete fields and we didn't want to reuse an obsolete number in 
case older messages get misinterpreted, but then we should be reusing. 
Nevertheless, it appears we're not doing that in the latest patch anymore.

I do think this resolves my previous concerns with the patch. In 
testCrossSerializationProto and testJavaSerialization we're removing assertions 
that PathHandles obtained for what should be the same file are identical. Isn't 
that still true, and shouldn't it be?

> API - expose an unique file identifier
> --
>
> Key: HDFS-7878
> URL: https://issues.apache.org/jira/browse/HDFS-7878
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: BB2015-05-TBR
> Attachments: HDFS-7878.01.patch, HDFS-7878.02.patch, 
> HDFS-7878.03.patch, HDFS-7878.04.patch, HDFS-7878.05.patch, 
> HDFS-7878.06.patch, HDFS-7878.07.patch, HDFS-7878.08.patch, 
> HDFS-7878.09.patch, HDFS-7878.10.patch, HDFS-7878.11.patch, 
> HDFS-7878.12.patch, HDFS-7878.13.patch, HDFS-7878.14.patch, 
> HDFS-7878.15.patch, HDFS-7878.16.patch, HDFS-7878.17.patch, 
> HDFS-7878.18.patch, HDFS-7878.patch
>
>
> See HDFS-487.
> Even though that is resolved as duplicate, the ID is actually not exposed by 
> the JIRA it supposedly duplicates.
> INode ID for the file should be easy to expose; alternatively ID could be 
> derived from block IDs, to account for appends...
> This is useful e.g. for cache key by file, to make sure cache stays correct 
> when file is overwritten.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11096) Support rolling upgrade between 2.x and 3.x

2017-10-20 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-11096:
-
Attachment: HDFS-11096.006.patch

It's possible, but will be tough.

I worked with [~rchiang] to get past the YARN issues I was having. By 
specifying both the hostname (required by the shell scripts) and the address 
(hostname + port) for all of the YARN ports, I was able to get it to work. I 
feel this is possibly an incompatible change in YARN, given that YARN works 
fine in Hadoop 2.x by just specifying the hostname (as long as everything is 
going to use the default ports), but I'll leave it to [~rchiang]'s judgement 
whether there's a good enough reason and we can put some documentation in 
place. Specifying the ports in a Hadoop 2.x cluster prior to upgrade wouldn't 
be too bad.

I then repeatedly encountered a lot of failures due to timeouts with both 
ZooKeeper and JournalNodes. I increased a couple of timeouts and was able to 
get it working reliably again. Other changes in the revision I'm posting (.006) 
right now:

* where it applies to both YARN and HDFS, I've stopped using NAMENODES and 
DATANODES in favor of MASTERS and WORKERS
* I fixed the sole shellcheck issue above. It was not raised locally, so my 
version must be out of sync. I can't confirm that I've eliminated the others 
until Yetus runs.
* I've added more distcp-over-webhdfs tests: to, from, and on both old and new 
clusters. They're all working perfectly.
 
Currently the only issue I see is that the ResourceManager port 8032 stops 
listening towards the end of the rolling upgrade test. ResourceManager does not 
log any problems, and I don't see any other issues. But after we stop all the 
loops of MapReduce jobs that were running during the rolling upgrade, we can't 
query the job history to confirm they were all successful, because it can't 
connect to :8032 on either node. Other ResourceManager services are still 
listening. This happens even if I comment out the YARN rolling upgrade step.

I may need to get some more help from [~rchiang] debugging that again. I'm also 
going to try running this against branch-3.0 instead of trunk, to eliminate 
some instability I may be seeing.

> Support rolling upgrade between 2.x and 3.x
> ---
>
> Key: HDFS-11096
> URL: https://issues.apache.org/jira/browse/HDFS-11096
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Sean Mackrory
>Priority: Blocker
> Attachments: HDFS-11096.001.patch, HDFS-11096.002.patch, 
> HDFS-11096.003.patch, HDFS-11096.004.patch, HDFS-11096.005.patch, 
> HDFS-11096.006.patch
>
>
> trunk has a minimum software version of 3.0.0-alpha1. This means we can't 
> rolling upgrade between branch-2 and trunk.
> This is a showstopper for large deployments. Unless there are very compelling 
> reasons to break compatibility, let's restore the ability to rolling upgrade 
> to 3.x releases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11096) Support rolling upgrade between 2.x and 3.x

2017-10-11 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-11096:
-
Attachment: HDFS-11096.005.patch

* Restored the old hadoop_actual_ssh. I did try a few cases of multi-token 
commands through workers.sh and they didn't behave the same with my function if 
you didn't quote the command or something. I'm trying to think of a better way 
to differentiate between that function and my own that just takes quoted, 
multi-line strings and therefore shouldn't be escaping the same way. They're 
currently named hadoop_ssh_cmd (old one - expects a single command passed all 
the way from workers.sh) and hadoop_ssh_cmds (new one - expects a quoted 
script passed directly from another script). But I'm not happy with that either.
* Fixed a bunch of issues when running as a non-root user. Some of these were 
in the pull-over-http test, but one of them was actually in start-yarn.sh where 
an arg is missing in a non-root branch of the script.
* Also fixed the multi-cluster config in pull-over-http.sh. The clusters ended 
up using the same YARN cluster ID when talking to the ZooKeeper cluster.
* Added license headers to several of the scripts, as the build started failing 
without them.

I am now again blocked on what may be a bug in YARN or MR. The MapReduce 
classpath seems all messed up in both tests. It wants me to specify 
HADOOP_MAPRED_CLASSPATH in mapred-site.xml (even though the inferred value is 
correct), and even then it can't find log4j.properties or yarn-site.xml.

> Support rolling upgrade between 2.x and 3.x
> ---
>
> Key: HDFS-11096
> URL: https://issues.apache.org/jira/browse/HDFS-11096
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Sean Mackrory
>Priority: Blocker
> Attachments: HDFS-11096.001.patch, HDFS-11096.002.patch, 
> HDFS-11096.003.patch, HDFS-11096.004.patch, HDFS-11096.005.patch
>
>
> trunk has a minimum software version of 3.0.0-alpha1. This means we can't 
> rolling upgrade between branch-2 and trunk.
> This is a showstopper for large deployments. Unless there are very compelling 
> reasons to break compatibility, let's restore the ability to rolling upgrade 
> to 3.x releases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-12595) Isolation of native libraries in JARs

2017-10-05 Thread Sean Mackrory (JIRA)
Sean Mackrory created HDFS-12595:


 Summary: Isolation of native libraries in JARs
 Key: HDFS-12595
 URL: https://issues.apache.org/jira/browse/HDFS-12595
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Sean Mackrory
Assignee: Sean Mackrory


There is a native library embedded in the Netty JAR. Even with shading, this 
can cause conflicts if a user application uses a different version of Netty. 
Hadoop does not use the native implementations, so we could just remove it, or 
we could relocate it more intelligently.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11096) Support rolling upgrade between 2.x and 3.x

2017-10-02 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188872#comment-16188872
 ] 

Sean Mackrory commented on HDFS-11096:
--

Thanks, [~atm]! I have also not forgotten the comment about the SSH function 
likely breaking the existing usage. I'll check into that...

> Support rolling upgrade between 2.x and 3.x
> ---
>
> Key: HDFS-11096
> URL: https://issues.apache.org/jira/browse/HDFS-11096
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Sean Mackrory
>Priority: Blocker
> Attachments: HDFS-11096.001.patch, HDFS-11096.002.patch, 
> HDFS-11096.003.patch, HDFS-11096.004.patch
>
>
> trunk has a minimum software version of 3.0.0-alpha1. This means we can't 
> rolling upgrade between branch-2 and trunk.
> This is a showstopper for large deployments. Unless there are very compelling 
> reasons to break compatibility, let's restore the ability to rolling upgrade 
> to 3.x releases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11096) Support rolling upgrade between 2.x and 3.x

2017-09-29 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-11096:
-
Attachment: HDFS-11096.004.patch

.004 patch:

* Added more helpful logging so set -v and set -x aren't needed, and removed 
those.
* Changed to run as non-root - the Hadoop environment was actually not set up 
because it was root, so now I can actually use create-release. This involved 
quite a big change to the Docker wrapper scripts to move EVERYTHING that 
requires root privileges into separate steps before kicking off the actual 
test run, which no longer uses or needs root at all.
* Locally at least, I have now fixed *all* shellcheck issues.

I actually feel pretty strongly about keeping set -e here. I usually like to 
use "if ...; then" for handling likely errors, but I do see your point about 
not having a way to make responses conditional on the specific error code. But 
in this case every step is required, and there's not a ton we can do to recover 
from a failed command that wouldn't require a human to fix the tests or Hadoop 
itself. The alternative is to have test runs advance far beyond the root cause 
of failure and be harder to troubleshoot, or to wrap a lot of unnecessary 
checks for success around (quite literally) everything. Thoughts?

Currently YARN rolling upgrades are actually failing because the Hadoop 3 
ResourceManager is using URIs like http://ns1:8020/... (the logical namespace 
name used as though it's a hostname). Not sure where that's coming from yet; I 
need to dig into it, but I doubt the fix will invalidate much review of the 
current code.

> Support rolling upgrade between 2.x and 3.x
> ---
>
> Key: HDFS-11096
> URL: https://issues.apache.org/jira/browse/HDFS-11096
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Sean Mackrory
>Priority: Blocker
> Attachments: HDFS-11096.001.patch, HDFS-11096.002.patch, 
> HDFS-11096.003.patch, HDFS-11096.004.patch
>
>
> trunk has a minimum software version of 3.0.0-alpha1. This means we can't 
> do a rolling upgrade between branch-2 and trunk.
> This is a showstopper for large deployments. Unless there are very compelling 
> reasons to break compatibility, let's restore the ability to do rolling 
> upgrades to 3.x releases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-7878) API - expose an unique file identifier

2017-09-26 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181721#comment-16181721
 ] 

Sean Mackrory commented on HDFS-7878:
-

{quote}Neither API has been part of a release, yet. {quote}

But it is included in a branch that's being prepared for a release. Unless we 
can still squeeze this into beta-1 (and we may indeed be a bit late for that - 
[~andrew.wang]?), I would assume this is headed for 3.1 or maybe 3.0 - and 
either way it would need to keep a constructor with the same prototype as what 
currently exists in branch-3.0. Right?

On an unrelated note, I second your thoughts that it should be open(FileHandle) 
and not open(FileStatus). But other than that, I'd +1 the rest of the patch.

> API - expose an unique file identifier
> --
>
> Key: HDFS-7878
> URL: https://issues.apache.org/jira/browse/HDFS-7878
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: BB2015-05-TBR
> Attachments: HDFS-7878.01.patch, HDFS-7878.02.patch, 
> HDFS-7878.03.patch, HDFS-7878.04.patch, HDFS-7878.05.patch, 
> HDFS-7878.06.patch, HDFS-7878.07.patch, HDFS-7878.08.patch, 
> HDFS-7878.09.patch, HDFS-7878.10.patch, HDFS-7878.11.patch, 
> HDFS-7878.12.patch, HDFS-7878.patch
>
>
> See HDFS-487.
> Even though that is resolved as a duplicate, the ID is actually not exposed 
> by the JIRA it supposedly duplicates.
> The INode ID for the file should be easy to expose; alternatively, the ID 
> could be derived from block IDs, to account for appends...
> This is useful e.g. as a per-file cache key, to make sure the cache stays 
> correct when the file is overwritten.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10702) Add a Client API and Proxy Provider to enable stale read from Standby

2017-09-26 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181648#comment-16181648
 ] 

Sean Mackrory commented on HDFS-10702:
--

I was just chatting with [~yzhangal] about this feature and he had a cool 
suggestion: a configuration that eliminates any need for code changes in an 
application to use this. I see 2 options here:
- a configuration that causes the minimum transaction ID to be set to 0 by 
default (i.e. just trust that the standby NNs are already sufficiently up to 
date)
- a configuration that triggers the HDFS client to retrieve the latest 
transaction ID and set it as the minimum, so that results are at least as 
fresh as the client itself (a rough sketch of both follows below).
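
Purely as a sketch of what the "no code changes" route could look like from an 
application's point of view - the property names below are hypothetical 
placeholders, not anything that exists in HDFS today:
{code}
import org.apache.hadoop.conf.Configuration;

public class StaleReadConfigSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Option 1 (hypothetical key): trust the standby outright by defaulting the
    // minimum transaction ID to 0.
    conf.setLong("dfs.client.stale-read.min-txid", 0L);
    // Option 2 (hypothetical key): have the client fetch the latest transaction ID
    // at startup and use it as the minimum, so reads are at least as fresh as the
    // client itself.
    conf.setBoolean("dfs.client.stale-read.use-latest-txid", true);
    System.out.println("min txid = " + conf.get("dfs.client.stale-read.min-txid"));
  }
}
{code}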

> Add a Client API and Proxy Provider to enable stale read from Standby
> -
>
> Key: HDFS-10702
> URL: https://issues.apache.org/jira/browse/HDFS-10702
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jiayi Zhou
>Assignee: Sean Mackrory
>Priority: Minor
> Attachments: HDFS-10702.001.patch, HDFS-10702.002.patch, 
> HDFS-10702.003.patch, HDFS-10702.004.patch, HDFS-10702.005.patch, 
> HDFS-10702.006.patch, HDFS-10702.007.patch, HDFS-10702.008.patch, 
> StaleReadfromStandbyNN.pdf
>
>
> Currently, clients must always talk to the active NameNode when performing 
> any metadata operation, which means the active NameNode could be a bottleneck 
> for scalability. One way to solve this problem is to send read-only 
> operations to the Standby NameNode. The disadvantage is that it might be a 
> stale read. 
> Here, I'm thinking of adding a Client API to enable/disable stale reads from 
> the Standby, which gives the Client the power to set the staleness 
> restriction.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-7878) API - expose an unique file identifier

2017-09-26 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181634#comment-16181634
 ] 

Sean Mackrory edited comment on HDFS-7878 at 9/26/17 9:41 PM:
--

Looks like this patch is violating some of the compatibility guarantees. The 
following is already annotated as Public and Stable in branch-3.0, but is 
removed in this patch:
{code}
public FileStatus(long, boolean, int, long, long, long, FsPermission, String, 
String, Path, Path, boolean, boolean, boolean)
{code}
Can we make sure a function with that prototype is added back? 
LocatedFileStatus is a similar situation, although it's "Evolving":
{code}
public LocatedFileStatus(long, boolean, int, long, long, long, FsPermission, 
String, String, Path, Path, boolean, boolean, boolean, BlockLocation[])
{code}
Could someone also enlighten me as to the purpose of the commented out lines in 
FSProtos.proto? I thought it was odd that we were replacing "alias = 13", but I 
have no idea why that line is there in the first place.


was (Author: mackrorysd):
Looks like this patch is violating some of the compatibility guarantees. The 
following is already annotated as Public and Stable in branch-3.0, but is 
removed in this patch:
{code}
public FileStatus(long, boolean, int, long, long, long, FsPermission, String, 
String, Path, Path, boolean, boolean, boolean)
{code}
Can we make sure a function with that prototype is added back? 
LocatedFileStatus is a similar situation, although it's "Evolving":
{code}
public LocatedFileStatus(long length, boolean isdir, int, long, long, long, 
FsPermission, String, String, Path, Path, boolean, boolean, boolean, 
BlockLocation[] locations)
{code}
Could someone also enlighten me as to the purpose of the commented out lines in 
FSProtos.proto? I thought it was odd that we were replacing "alias = 13", but I 
have no idea why that line is there in the first place.

> API - expose an unique file identifier
> --
>
> Key: HDFS-7878
> URL: https://issues.apache.org/jira/browse/HDFS-7878
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: BB2015-05-TBR
> Attachments: HDFS-7878.01.patch, HDFS-7878.02.patch, 
> HDFS-7878.03.patch, HDFS-7878.04.patch, HDFS-7878.05.patch, 
> HDFS-7878.06.patch, HDFS-7878.07.patch, HDFS-7878.08.patch, 
> HDFS-7878.09.patch, HDFS-7878.10.patch, HDFS-7878.11.patch, 
> HDFS-7878.12.patch, HDFS-7878.patch
>
>
> See HDFS-487.
> Even though that is resolved as a duplicate, the ID is actually not exposed 
> by the JIRA it supposedly duplicates.
> The INode ID for the file should be easy to expose; alternatively, the ID 
> could be derived from block IDs, to account for appends...
> This is useful e.g. as a per-file cache key, to make sure the cache stays 
> correct when the file is overwritten.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-7878) API - expose an unique file identifier

2017-09-26 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181634#comment-16181634
 ] 

Sean Mackrory commented on HDFS-7878:
-

Looks like this patch is violating some of the compatibility guarantees. The 
following is already annotated as Public and Stable in branch-3.0, but is 
removed in this patch:
{code}
public FileStatus(long, boolean, int, long, long, long, FsPermission, String, 
String, Path, Path, boolean, boolean, boolean)
{code}
Can we make sure a function with that prototype is added back? 
LocatedFileStatus is a similar situation, although it's "Evolving":
{code}
public LocatedFileStatus(long length, boolean isdir, int, long, long, long, 
FsPermission, String, String, Path, Path, boolean, boolean, boolean, 
BlockLocation[] locations)
{code}
Could someone also enlighten me as to the purpose of the commented out lines in 
FSProtos.proto? I thought it was odd that we were replacing "alias = 13", but I 
have no idea why that line is there in the first place.
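
For illustration only - this is not the FileStatus source, just a minimal sketch 
of the pattern I'm asking for: keep the old Public/Stable prototype as a 
deprecated overload that delegates to whatever the new primary constructor turns 
out to be (the class, enum, and flag names below are made up for the example):
{code}
import java.util.EnumSet;

class CompatConstructorSketch {
  enum AttrFlags { HAS_ACL, HAS_CRYPT, HAS_EC }

  private final long length;
  private final EnumSet<AttrFlags> attrs;

  // Hypothetical new primary constructor.
  CompatConstructorSketch(long length, EnumSet<AttrFlags> attrs) {
    this.length = length;
    this.attrs = attrs;
  }

  // Old boolean-based prototype kept for compatibility, delegating to the new one.
  @Deprecated
  CompatConstructorSketch(long length, boolean hasAcl, boolean isEncrypted,
      boolean isErasureCoded) {
    this(length, toFlags(hasAcl, isEncrypted, isErasureCoded));
  }

  private static EnumSet<AttrFlags> toFlags(boolean acl, boolean crypt, boolean ec) {
    EnumSet<AttrFlags> flags = EnumSet.noneOf(AttrFlags.class);
    if (acl) { flags.add(AttrFlags.HAS_ACL); }
    if (crypt) { flags.add(AttrFlags.HAS_CRYPT); }
    if (ec) { flags.add(AttrFlags.HAS_EC); }
    return flags;
  }
}
{code}
Whatever shape the new constructor actually takes, the point is that the 
14-argument prototypes quoted above should remain resolvable at compile time for 
existing callers.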

> API - expose an unique file identifier
> --
>
> Key: HDFS-7878
> URL: https://issues.apache.org/jira/browse/HDFS-7878
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: BB2015-05-TBR
> Attachments: HDFS-7878.01.patch, HDFS-7878.02.patch, 
> HDFS-7878.03.patch, HDFS-7878.04.patch, HDFS-7878.05.patch, 
> HDFS-7878.06.patch, HDFS-7878.07.patch, HDFS-7878.08.patch, 
> HDFS-7878.09.patch, HDFS-7878.10.patch, HDFS-7878.11.patch, 
> HDFS-7878.12.patch, HDFS-7878.patch
>
>
> See HDFS-487.
> Even though that is resolved as duplicate, the ID is actually not exposed by 
> the JIRA it supposedly duplicates.
> INode ID for the file should be easy to expose; alternatively ID could be 
> derived from block IDs, to account for appends...
> This is useful e.g. for cache key by file, to make sure cache stays correct 
> when file is overwritten.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-10702) Add a Client API and Proxy Provider to enable stale read from Standby

2017-09-19 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16171883#comment-16171883
 ] 

Sean Mackrory edited comment on HDFS-10702 at 9/19/17 3:38 PM:
---

The assumption of this feature is that an application is responsible for 
knowing when a dataset is stable enough to work on, and that events that happen 
after that transaction ID may affect the accuracy of the results as seen by the 
application. There are obviously cases where it isn't reasonable for an 
application to make an assumption like that, but like I said above, this isn't 
intended for every situation. That said, I'd be all for testing the sequence 
you described to verify exactly how it fails and that it doesn't bring all of 
HDFS down with it - just the client. But if a file is deleted after the 
specified transaction ID and the application tries to access it, returning an 
exception would be the correct behavior, IMO.

I was actually wondering if what you meant was the block locations were out of 
date because the file had been re-replicated in a different configuration due 
to cluster health issues, or decommissioning. Cluster state is distinct from an 
application knowing when it's safe to assume that a dataset is finalized, so 
that complicates the assumption somewhat.

But if it's just a clearly stated assumption that this feature transfers 
responsibility for knowing that a dataset is complete to the client application 
and we test that accessing a deleted file fails in a correct manner, would that 
address your concerns, [~mingma]?


was (Author: mackrorysd):
The assumption of this feature is that an application is responsible for 
knowing when a dataset is stable enough to work on, and that any failures or 
inaccuracies resulting in stuff that happens after the minimum transaction ID 
is assumed by the application. There are obviously cases where that's not 
reasonable, but like I said above, this isn't intended for every situation. 
That said, I'd be all for testing the sequence you described to verify exactly 
how it fails and that it doesn't bring all of HDFS down with it - just the 
client. But if a file is deleted after the specified transaction ID and the 
application tries to access it, returning an exception would be the correct 
behavior, IMO.

I was actually wondering if what you meant was the block locations were out of 
date because the file had been re-replicated in a different configuration due 
to cluster health issues, or decommissioning. Cluster state is distinct from an 
application knowing when it's safe to assume that a dataset is finalized, so 
that complicates the assumption somewhat.

But if it's just a clearly stated assumption that this feature transfers 
responsibility for knowing that a dataset is complete to the client application 
and we test that accessing a deleted file fails in a correct manner, would that 
address your concerns, [~mingma]?

> Add a Client API and Proxy Provider to enable stale read from Standby
> -
>
> Key: HDFS-10702
> URL: https://issues.apache.org/jira/browse/HDFS-10702
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jiayi Zhou
>Assignee: Sean Mackrory
>Priority: Minor
> Attachments: HDFS-10702.001.patch, HDFS-10702.002.patch, 
> HDFS-10702.003.patch, HDFS-10702.004.patch, HDFS-10702.005.patch, 
> HDFS-10702.006.patch, HDFS-10702.007.patch, HDFS-10702.008.patch, 
> StaleReadfromStandbyNN.pdf
>
>
> Currently, clients must always talk to the active NameNode when performing 
> any metadata operation, which means the active NameNode could be a bottleneck 
> for scalability. One way to solve this problem is to send read-only 
> operations to the Standby NameNode. The disadvantage is that it might be a 
> stale read. 
> Here, I'm thinking of adding a Client API to enable/disable stale reads from 
> the Standby, which gives the Client the power to set the staleness 
> restriction.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-10702) Add a Client API and Proxy Provider to enable stale read from Standby

2017-09-19 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16171883#comment-16171883
 ] 

Sean Mackrory edited comment on HDFS-10702 at 9/19/17 3:36 PM:
---

The assumption of this feature is that an application is responsible for 
knowing when a dataset is stable enough to work on, and that any failures or 
inaccuracies resulting in stuff that happens after the minimum transaction ID 
is assumed by the application. There are obviously cases where that's not 
reasonable, but like I said above, this isn't intended for every situation. 
That said, I'd be all for testing the sequence you described to verify exactly 
how it fails and that it doesn't bring all of HDFS down with it - just the 
client. But if a file is deleted after the specified transaction ID and the 
application tries to access it, returning an exception would be the correct 
behavior, IMO.

I was actually wondering if what you meant was the block locations were out of 
date because the file had been re-replicated in a different configuration due 
to cluster health issues, or decommissioning. Cluster state is distinct from an 
application knowing when it's safe to assume that a dataset is finalized, so 
that complicates the assumption somewhat.

But if it's just a clearly stated assumption that this feature transfers 
responsibility for knowing that a dataset is complete to the client application 
and we test that accessing a deleted file fails in a correct manner, would that 
address your concerns, [~mingma]?


was (Author: mackrorysd):
The assumption of this feature is that an application is responsible for 
knowing when a dataset is stable enough to work on, and that any failures or 
inaccuracies resulting in stuff that happens after the minimum transaction ID 
is assumed by the application. That said, I'd be all for testing the scenario 
above to verify exactly how it fails and that it doesn't bring all of HDFS down 
with it - just the client. But if a file is deleted after the specified 
transaction and the application tries to access it, returning an exception 
would be the correct behavior.

I was actually wondering if what you meant was the block locations were out of 
date because the file had been re-replicated in a different configuration due 
to cluster health issues, or decommissioning. Cluster state is distinct from an 
application knowing when it's safe to assume that a dataset is finalized, so 
that complicates the assumption somewhat.

But if it's just a clearly stated assumption that this feature transfers 
responsibility for knowing that a dataset is complete to the client application 
and we test that accessing a deleted file fails in a correct manner, would that 
address your concerns, [~mingma]?

> Add a Client API and Proxy Provider to enable stale read from Standby
> -
>
> Key: HDFS-10702
> URL: https://issues.apache.org/jira/browse/HDFS-10702
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jiayi Zhou
>Assignee: Sean Mackrory
>Priority: Minor
> Attachments: HDFS-10702.001.patch, HDFS-10702.002.patch, 
> HDFS-10702.003.patch, HDFS-10702.004.patch, HDFS-10702.005.patch, 
> HDFS-10702.006.patch, HDFS-10702.007.patch, HDFS-10702.008.patch, 
> StaleReadfromStandbyNN.pdf
>
>
> Currently, clients must always talk to the active NameNode when performing 
> any metadata operation, which means the active NameNode could be a bottleneck 
> for scalability. One way to solve this problem is to send read-only 
> operations to the Standby NameNode. The disadvantage is that it might be a 
> stale read. 
> Here, I'm thinking of adding a Client API to enable/disable stale reads from 
> the Standby, which gives the Client the power to set the staleness 
> restriction.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10702) Add a Client API and Proxy Provider to enable stale read from Standby

2017-09-19 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16171883#comment-16171883
 ] 

Sean Mackrory commented on HDFS-10702:
--

The assumption of this feature is that an application is responsible for 
knowing when a dataset is stable enough to work on, and that any failures or 
inaccuracies resulting in stuff that happens after the minimum transaction ID 
is assumed by the application. That said, I'd be all for testing the scenario 
above to verify exactly how it fails and that it doesn't bring all of HDFS down 
with it - just the client. But if a file is deleted after the specified 
transaction and the application tries to access it, returning an exception 
would be the correct behavior.

I was actually wondering if what you meant was the block locations were out of 
date because the file had been re-replicated in a different configuration due 
to cluster health issues, or decommissioning. Cluster state is distinct from an 
application knowing when it's safe to assume that a dataset is finalized, so 
that complicates the assumption somewhat.

But if it's just a clearly stated assumption that this feature transfers 
responsibility for knowing that a dataset is complete to the client application 
and we test that accessing a deleted file fails in a correct manner, would that 
address your concerns, [~mingma]?

> Add a Client API and Proxy Provider to enable stale read from Standby
> -
>
> Key: HDFS-10702
> URL: https://issues.apache.org/jira/browse/HDFS-10702
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jiayi Zhou
>Assignee: Sean Mackrory
>Priority: Minor
> Attachments: HDFS-10702.001.patch, HDFS-10702.002.patch, 
> HDFS-10702.003.patch, HDFS-10702.004.patch, HDFS-10702.005.patch, 
> HDFS-10702.006.patch, HDFS-10702.007.patch, HDFS-10702.008.patch, 
> StaleReadfromStandbyNN.pdf
>
>
> Currently, clients must always talk to the active NameNode when performing 
> any metadata operation, which means the active NameNode could be a bottleneck 
> for scalability. One way to solve this problem is to send read-only 
> operations to the Standby NameNode. The disadvantage is that it might be a 
> stale read. 
> Here, I'm thinking of adding a Client API to enable/disable stale reads from 
> the Standby, which gives the Client the power to set the staleness 
> restriction.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11096) Support rolling upgrade between 2.x and 3.x

2017-09-15 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-11096:
-
Attachment: HDFS-11096.003.patch

So attaching a 3rd patch. YARN appears to be failing, so I need to debug that, 
but this should include a lot of changes based on the feedback here, and I 
doubt the fix will be significant. Most notably:
* Augmented documentation to clear up any confusion about how to run the tests 
and what their main components are.
* Removed any use of deprecated commands, and switched to 3.1.0-SNAPSHOT
* Removed set -x. [~aw] - how do you feel about set -v? And did you mean set 
+e? I feel like if it's okay for a command to sometimes fail, you can deal with 
that return code explicitly, otherwise I'd like that failure to bubble up. Am I 
missing something?
* Switched to using hadoop-functions and added what I needed there. Not sure I 
understand exactly what hadoop_actual_ssh was supposed to be doing before, but 
it's not used elsewhere and is marked as private, so I hope my change to it is 
okay. I redid the join / split functions to make shellcheck much happier (and 
I'm also much happier with the outcome)

In addition to fixing whatever is going wrong with YARN, I may still:
* Have a couple of shellcheck issues to fix. Like $(dirname ${0}), which seems 
tricky to quote correctly to shellcheck's satisfaction.
* Add parameter checking as suggested by Ray
* Eliminate the need for a git checkout or installing expect with apt-get
* Switch to using create-release - --native wasn't working because the Docker 
image doesn't have a high enough version of cmake


> Support rolling upgrade between 2.x and 3.x
> ---
>
> Key: HDFS-11096
> URL: https://issues.apache.org/jira/browse/HDFS-11096
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Sean Mackrory
>Priority: Blocker
> Attachments: HDFS-11096.001.patch, HDFS-11096.002.patch, 
> HDFS-11096.003.patch
>
>
> trunk has a minimum software version of 3.0.0-alpha1. This means we can't 
> do a rolling upgrade between branch-2 and trunk.
> This is a showstopper for large deployments. Unless there are very compelling 
> reasons to break compatibility, let's restore the ability to do rolling 
> upgrades to 3.x releases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11096) Support rolling upgrade between 2.x and 3.x

2017-09-08 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16158780#comment-16158780
 ] 

Sean Mackrory commented on HDFS-11096:
--

Thanks for the reviews. I was hoping you'd take a look [~aw]! I'll update the 
patch and address these comments soon.

I've also been reviewing more recent JACC reports. There are still a few 
incompatibilities that technically violate the contract I mentioned above, 
like metrics being replaced by metrics2, s3:// disappearing entirely (with 
neither being labelled as deprecated for all of 2.x), some things that should 
not have been used publicly (like LOGs) changing data types, etc. These are 
things that, from a practical standpoint, have been known about by many for a 
long time without concern being raised, and there's significant baggage to 
addressing them. Does anybody think they warrant further action? I'm inclined 
to say no...

> Support rolling upgrade between 2.x and 3.x
> ---
>
> Key: HDFS-11096
> URL: https://issues.apache.org/jira/browse/HDFS-11096
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Sean Mackrory
>Priority: Blocker
> Attachments: HDFS-11096.001.patch, HDFS-11096.002.patch
>
>
> trunk has a minimum software version of 3.0.0-alpha1. This means we can't 
> do a rolling upgrade between branch-2 and trunk.
> This is a showstopper for large deployments. Unless there are very compelling 
> reasons to break compatibility, let's restore the ability to do rolling 
> upgrades to 3.x releases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11096) Support rolling upgrade between 2.x and 3.x

2017-09-07 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16157122#comment-16157122
 ] 

Sean Mackrory commented on HDFS-11096:
--

{quote}
It would be good to document the order of running the scripts (e.g. env.sh, 
call one of the *_cluster_env() functions then build-distributed-hadoop).
Add help documentation or document calling build-distributed-hadoop args.
It sources bash/functions.sh, which isn't visible if you're not running in 
dev-support/compat.
{quote}

So for these points, anything under bash/ is not meant to be run directly - 
it's all called by the main test scripts. So someone running tests should just 
need to run "./docker-rolling-upgrade.sh" from that directory. I've tried to 
have scripts meant to be run directly consistently ensure that their working 
directory is the one they're in, so relative paths are all fine. Adding 
documentation would be good, but it'd really be targeted at people adding new 
tests, not running the existing ones - so just wanted to make sure that was 
clear.

> Support rolling upgrade between 2.x and 3.x
> ---
>
> Key: HDFS-11096
> URL: https://issues.apache.org/jira/browse/HDFS-11096
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Sean Mackrory
>Priority: Blocker
> Attachments: HDFS-11096.001.patch, HDFS-11096.002.patch
>
>
> trunk has a minimum software version of 3.0.0-alpha1. This means we can't 
> do a rolling upgrade between branch-2 and trunk.
> This is a showstopper for large deployments. Unless there are very compelling 
> reasons to break compatibility, let's restore the ability to do rolling 
> upgrades to 3.x releases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11096) Support rolling upgrade between 2.x and 3.x

2017-09-06 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-11096:
-
Attachment: HDFS-11096.002.patch

Attaching a patch with a bunch of improvements:
* Addresses most (but not yet all) of the shellcheck warnings
* Uses www-us.apache.org to download released tarballs for Zookeeper and Hadoop 
instead of more specific ones. Not sure if there's a better URL to use... I'm 
also still using Github to check out trunk source as I had had issues using the 
official repo - I assume that's not entirely Kosher...
* Fixed it so in Docker deployments, tarballs are built on persistent storage 
to speed up future tests on the same host.
* Some refactoring of the pull-over-http-test as a step toward having more 
WebHDFS compatibility tests going both ways. Had some issues connecting to 
WebHDFS using the ns2 logical name, though. But what is currently in the patch 
all works.

> Support rolling upgrade between 2.x and 3.x
> ---
>
> Key: HDFS-11096
> URL: https://issues.apache.org/jira/browse/HDFS-11096
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Sean Mackrory
>Priority: Blocker
> Attachments: HDFS-11096.001.patch, HDFS-11096.002.patch
>
>
> trunk has a minimum software version of 3.0.0-alpha1. This means we can't 
> do a rolling upgrade between branch-2 and trunk.
> This is a showstopper for large deployments. Unless there are very compelling 
> reasons to break compatibility, let's restore the ability to do rolling 
> upgrades to 3.x releases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-11096) Support rolling upgrade between 2.x and 3.x

2017-08-28 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16143965#comment-16143965
 ] 

Sean Mackrory edited comment on HDFS-11096 at 8/28/17 4:40 PM:
---

So Docker support has been added for the rolling-upgrade and pull-over-http 
test. They're using the same Docker image as Yetus builds, etc. And they've 
been really robust lately. I've corrected the copyright headers at the top of 
the files, and I think dev-support/compat is a good place for these tests to 
live - but I'm open to other ideas as well. I've also added to the README - now 
that the scripts spin up the clusters on Docker, it's *really* easy to run 
these.

The Python tests are all still working, but they did not seem to catch the 
previous incompatibility that prevented older clients from writing to newer 
DataNodes. There's also still a few TODOs or thing that don't work and it's not 
clear why. So definitely more work to be done, but there's value in the 
existing CLI compatibility tests.

I'd like to get this put in the codebase and get some Jenkins jobs running on 
it soon.


was (Author: mackrorysd):
So Docker support has been added for the rolling-upgrade and pull-over-http 
test. They're using the same Docker image as Yetus builds, etc. And they've 
been really robust lately. I've corrected the copyright headers at the top of 
the files, and I think dev-support/compat is a good place for these tests to 
live - but I'm open to other ideas as well.

The Python tests are all still working, but they did not seem to catch the 
previous incompatibility that prevented older clients from writing to newer 
DataNodes. There are also still a few TODOs and things that don't work, and it's not 
clear why. So definitely more work to be done, but there's value in the 
existing CLI compatibility tests.

I'd like to get this put in the codebase and get some Jenkins jobs running on 
it soon.

> Support rolling upgrade between 2.x and 3.x
> ---
>
> Key: HDFS-11096
> URL: https://issues.apache.org/jira/browse/HDFS-11096
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Sean Mackrory
>Priority: Blocker
> Attachments: HDFS-11096.001.patch
>
>
> trunk has a minimum software version of 3.0.0-alpha1. This means we can't 
> do a rolling upgrade between branch-2 and trunk.
> This is a showstopper for large deployments. Unless there are very compelling 
> reasons to break compatibility, let's restore the ability to do rolling 
> upgrades to 3.x releases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11096) Support rolling upgrade between 2.x and 3.x

2017-08-28 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-11096:
-
Status: Patch Available  (was: Open)

> Support rolling upgrade between 2.x and 3.x
> ---
>
> Key: HDFS-11096
> URL: https://issues.apache.org/jira/browse/HDFS-11096
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Sean Mackrory
>Priority: Blocker
> Attachments: HDFS-11096.001.patch
>
>
> trunk has a minimum software version of 3.0.0-alpha1. This means we can't 
> do a rolling upgrade between branch-2 and trunk.
> This is a showstopper for large deployments. Unless there are very compelling 
> reasons to break compatibility, let's restore the ability to do rolling 
> upgrades to 3.x releases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11096) Support rolling upgrade between 2.x and 3.x

2017-08-28 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-11096:
-
Attachment: HDFS-11096.001.patch

So Docker support has been added for the rolling-upgrade and pull-over-http 
test. They're using the same Docker image as Yetus builds, etc. And they've 
been really robust lately. I've corrected the copyright headers at the top of 
the files, and I think dev-support/compat is a good place for these tests to 
live - but I'm open to other ideas as well.

The Python tests are all still working, but they did not seem to catch the 
previous incompatibility that prevented older clients from writing to newer 
DataNodes. There are also still a few TODOs and things that don't work, and it's not 
clear why. So definitely more work to be done, but there's value in the 
existing CLI compatibility tests.

I'd like to get this put in the codebase and get some Jenkins jobs running on 
it soon.

> Support rolling upgrade between 2.x and 3.x
> ---
>
> Key: HDFS-11096
> URL: https://issues.apache.org/jira/browse/HDFS-11096
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Sean Mackrory
>Priority: Blocker
> Attachments: HDFS-11096.001.patch
>
>
> trunk has a minimum software version of 3.0.0-alpha1. This means we can't 
> do a rolling upgrade between branch-2 and trunk.
> This is a showstopper for large deployments. Unless there are very compelling 
> reasons to break compatibility, let's restore the ability to do rolling 
> upgrades to 3.x releases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11096) Support rolling upgrade between 2.x and 3.x

2017-08-08 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118936#comment-16118936
 ] 

Sean Mackrory commented on HDFS-11096:
--

Just pushed some more updates. HDFS-12151 is fixed, and I'm once again able to 
do successful rolling upgrades, even with large delays during the upgrade. I 
made a quick attempt at doing a rolling upgrade of YARN, but the commands I 
thought one was supposed to use to stop / start daemons aren't working. The 
script is there, just not currently called from rolling-upgrade.sh. Not sure if 
you want to work off of that for YARN's side of things [~rchiang]?

I'm also working on adding Docker support and hopefully using the same Docker 
images we use for precommit jobs, etc. to make the build environment easier to 
put in place and make this more easily verifiable by others.

> Support rolling upgrade between 2.x and 3.x
> ---
>
> Key: HDFS-11096
> URL: https://issues.apache.org/jira/browse/HDFS-11096
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Sean Mackrory
>Priority: Blocker
>
> trunk has a minimum software version of 3.0.0-alpha1. This means we can't 
> do a rolling upgrade between branch-2 and trunk.
> This is a showstopper for large deployments. Unless there are very compelling 
> reasons to break compatibility, let's restore the ability to do rolling 
> upgrades to 3.x releases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-9806) Allow HDFS block replicas to be provided by an external storage system

2017-08-08 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118654#comment-16118654
 ] 

Sean Mackrory edited comment on HDFS-9806 at 8/8/17 5:14 PM:
-

Could someone more familiar with the design here comment on any anticipated 
impact to NameNode scalability? If this feature is included but a user chooses 
not to use it (i.e. there are no storages of type PROVIDED but the capability 
is there) - is there any impact to memory consumed by the NameNode? I've read 
the design doc and some of the patches - I *think* things are good, but I want 
to be sure from someone more familiar with the implementation who might have 
even tested that already...


was (Author: mackrorysd):
Could someone more familiar with the design here comment on any anticipated 
impact to NameNode scalability? If this feature is included but a user chooses 
not to use it (i.e. there are no storages of type PROVIDED) - is there any 
impact to memory consumed by the NameNode? I've read the design doc and some of 
the patches - I *think* things are good, but I want to be sure from someone 
more familiar with the implementation who might have even tested that already...

> Allow HDFS block replicas to be provided by an external storage system
> --
>
> Key: HDFS-9806
> URL: https://issues.apache.org/jira/browse/HDFS-9806
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Chris Douglas
> Attachments: HDFS-9806-design.001.pdf, HDFS-9806-design.002.pdf
>
>
> In addition to heterogeneous media, many applications work with heterogeneous 
> storage systems. The guarantees and semantics provided by these systems are 
> often similar, but not identical to those of 
> [HDFS|https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/filesystem/index.html].
>  Any client accessing multiple storage systems is responsible for reasoning 
> about each system independently, and must propagate and renew credentials for 
> each store.
> Remote stores could be mounted under HDFS. Block locations could be mapped to 
> immutable file regions, opaque IDs, or other tokens that represent a 
> consistent view of the data. While correctness for arbitrary operations 
> requires careful coordination between stores, in practice we can provide 
> workable semantics with weaker guarantees.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9806) Allow HDFS block replicas to be provided by an external storage system

2017-08-08 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118654#comment-16118654
 ] 

Sean Mackrory commented on HDFS-9806:
-

Could someone more familiar with the design here comment on any anticipated 
impact to NameNode scalability? If this feature is included but a user chooses 
not to use it (i.e. there are no storages of type PROVIDED) - is there any 
impact to memory consumed by the NameNode? I've read the design doc and some of 
the patches - I *think* things are good, but I want to be sure from someone 
more familiar with the implementation who might have even tested that already...

> Allow HDFS block replicas to be provided by an external storage system
> --
>
> Key: HDFS-9806
> URL: https://issues.apache.org/jira/browse/HDFS-9806
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Chris Douglas
> Attachments: HDFS-9806-design.001.pdf, HDFS-9806-design.002.pdf
>
>
> In addition to heterogeneous media, many applications work with heterogeneous 
> storage systems. The guarantees and semantics provided by these systems are 
> often similar, but not identical to those of 
> [HDFS|https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/filesystem/index.html].
>  Any client accessing multiple storage systems is responsible for reasoning 
> about each system independently, and must propagate and renew credentials for 
> each store.
> Remote stores could be mounted under HDFS. Block locations could be mapped to 
> immutable file regions, opaque IDs, or other tokens that represent a 
> consistent view of the data. While correctness for arbitrary operations 
> requires careful coordination between stores, in practice we can provide 
> workable semantics with weaker guarantees.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12151) Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes

2017-08-01 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-12151:
-
   Resolution: Fixed
Fix Version/s: 3.0.0-beta1
   Status: Resolved  (was: Patch Available)

> Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes
> 
>
> Key: HDFS-12151
> URL: https://issues.apache.org/jira/browse/HDFS-12151
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha4
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Fix For: 3.0.0-beta1
>
> Attachments: HDFS-12151.001.patch, HDFS-12151.002.patch, 
> HDFS-12151.003.patch, HDFS-12151.004.patch, HDFS-12151.005.patch, 
> HDFS-12151.006.patch, HDFS-12151.007.patch
>
>
> Trying to write to a Hadoop 3 DataNode with a Hadoop 2 client currently 
> fails. On the client side it looks like this:
> {code}
> 17/07/14 13:31:58 INFO hdfs.DFSClient: Exception in 
> createBlockOutputStream
> java.io.EOFException: Premature EOF: no length prefix available
> at 
> org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2280)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1318)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1237)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449){code}
> But on the DataNode side there's an ArrayIndexOutOfBoundsException because there 
> aren't any targetStorageIds:
> {code}
> java.lang.ArrayIndexOutOfBoundsException: 0
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:815)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:173)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:107)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:290)
> at java.lang.Thread.run(Thread.java:745){code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12151) Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes

2017-08-01 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109715#comment-16109715
 ] 

Sean Mackrory commented on HDFS-12151:
--

Pushed, as [~andrew.wang]'s "pendings" have been resolved. Resolving... Thank 
you for the reviews!

> Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes
> 
>
> Key: HDFS-12151
> URL: https://issues.apache.org/jira/browse/HDFS-12151
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha4
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HDFS-12151.001.patch, HDFS-12151.002.patch, 
> HDFS-12151.003.patch, HDFS-12151.004.patch, HDFS-12151.005.patch, 
> HDFS-12151.006.patch, HDFS-12151.007.patch
>
>
> Trying to write to a Hadoop 3 DataNode with a Hadoop 2 client currently 
> fails. On the client side it looks like this:
> {code}
> 17/07/14 13:31:58 INFO hdfs.DFSClient: Exception in 
> createBlockOutputStream
> java.io.EOFException: Premature EOF: no length prefix available
> at 
> org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2280)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1318)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1237)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449){code}
> But on the DataNode side there's an ArrayIndexOutOfBoundsException because there 
> aren't any targetStorageIds:
> {code}
> java.lang.ArrayIndexOutOfBoundsException: 0
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:815)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:173)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:107)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:290)
> at java.lang.Thread.run(Thread.java:745){code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12151) Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes

2017-07-31 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108308#comment-16108308
 ] 

Sean Mackrory commented on HDFS-12151:
--

Ah sorry about that - I seem to be blind to the yellow checkstyle warnings...

I did confirm that the test failures are flaky. They succeed locally and 
timeouts also occurred in the same class (often the same function) in several 
recent runs of the Pre-Commit jobs. Of course I just closed the editor where I 
noted all the URLs to said jobs, but they're there and they're recent, I 
promise :)

 Thanks for the review! Attaching a final patch with the checkstyle issues 
addressed. Reran the new test and the 2 that failed last time locally, and had 
a clean Yetus run.

> Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes
> 
>
> Key: HDFS-12151
> URL: https://issues.apache.org/jira/browse/HDFS-12151
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha4
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HDFS-12151.001.patch, HDFS-12151.002.patch, 
> HDFS-12151.003.patch, HDFS-12151.004.patch, HDFS-12151.005.patch, 
> HDFS-12151.006.patch, HDFS-12151.007.patch
>
>
> Trying to write to a Hadoop 3 DataNode with a Hadoop 2 client currently 
> fails. On the client side it looks like this:
> {code}
> 17/07/14 13:31:58 INFO hdfs.DFSClient: Exception in 
> createBlockOutputStream
> java.io.EOFException: Premature EOF: no length prefix available
> at 
> org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2280)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1318)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1237)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449){code}
> But on the DataNode side there's an ArrayIndexOutOfBoundsException because there 
> aren't any targetStorageIds:
> {code}
> java.lang.ArrayIndexOutOfBoundsException: 0
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:815)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:173)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:107)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:290)
> at java.lang.Thread.run(Thread.java:745){code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12151) Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes

2017-07-31 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-12151:
-
Attachment: HDFS-12151.007.patch

> Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes
> 
>
> Key: HDFS-12151
> URL: https://issues.apache.org/jira/browse/HDFS-12151
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha4
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HDFS-12151.001.patch, HDFS-12151.002.patch, 
> HDFS-12151.003.patch, HDFS-12151.004.patch, HDFS-12151.005.patch, 
> HDFS-12151.006.patch, HDFS-12151.007.patch
>
>
> Trying to write to a Hadoop 3 DataNode with a Hadoop 2 client currently 
> fails. On the client side it looks like this:
> {code}
> 17/07/14 13:31:58 INFO hdfs.DFSClient: Exception in 
> createBlockOutputStream
> java.io.EOFException: Premature EOF: no length prefix available
> at 
> org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2280)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1318)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1237)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449){code}
> But on the DataNode side there's an ArrayIndexOutOfBoundsException because there 
> aren't any targetStorageIds:
> {code}
> java.lang.ArrayIndexOutOfBoundsException: 0
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:815)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:173)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:107)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:290)
> at java.lang.Thread.run(Thread.java:745){code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12151) Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes

2017-07-28 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-12151:
-
Attachment: HDFS-12151.006.patch

So the problem was that it couldn't connect to a socket on the local port I was 
telling it to use. I had originally had a dummy server as part of the 
NullDataNode class, but I later found it to be unnecessary, assuming that was 
because the test was only using the OutputStream I was passing in. The reason 
that worked is that I *happen* to have something listening on port 12345 
locally. So I've restored the dummy server, I'm now using a different port that 
I don't have anything listening on locally, and the test intelligently switches 
the URLs and the dummy server to a different port if there is ever anything 
already listening on that port.

Attaching for a more serious test run...
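
For what it's worth, one way to avoid depending on any particular port at all 
would be to bind to port 0 and let the OS hand back a free ephemeral port - just 
a sketch of the idea, not what the patch does verbatim:
{code}
import java.io.IOException;
import java.net.ServerSocket;

public class FreePortSketch {
  public static void main(String[] args) throws IOException {
    // Binding to port 0 asks the OS for any free ephemeral port; the dummy server
    // and the URLs the test builds could then both use whatever port came back.
    try (ServerSocket probe = new ServerSocket(0)) {
      int freePort = probe.getLocalPort();
      System.out.println("dummy server could listen on port " + freePort);
    }
  }
}
{code}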

> Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes
> 
>
> Key: HDFS-12151
> URL: https://issues.apache.org/jira/browse/HDFS-12151
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha4
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HDFS-12151.001.patch, HDFS-12151.002.patch, 
> HDFS-12151.003.patch, HDFS-12151.004.patch, HDFS-12151.005.patch, 
> HDFS-12151.006.patch
>
>
> Trying to write to a Hadoop 3 DataNode with a Hadoop 2 client currently 
> fails. On the client side it looks like this:
> {code}
> 17/07/14 13:31:58 INFO hdfs.DFSClient: Exception in 
> createBlockOutputStream
> java.io.EOFException: Premature EOF: no length prefix available
> at 
> org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2280)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1318)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1237)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449){code}
> But on the DataNode side there's an ArrayIndexOutOfBoundsException because there 
> aren't any targetStorageIds:
> {code}
> java.lang.ArrayIndexOutOfBoundsException: 0
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:815)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:173)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:107)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:290)
> at java.lang.Thread.run(Thread.java:745){code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12151) Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes

2017-07-28 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-12151:
-
Attachment: HDFS-12151.005.patch

> Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes
> 
>
> Key: HDFS-12151
> URL: https://issues.apache.org/jira/browse/HDFS-12151
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha4
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HDFS-12151.001.patch, HDFS-12151.002.patch, 
> HDFS-12151.003.patch, HDFS-12151.004.patch, HDFS-12151.005.patch
>
>
> Trying to write to a Hadoop 3 DataNode with a Hadoop 2 client currently 
> fails. On the client side it looks like this:
> {code}
> 17/07/14 13:31:58 INFO hdfs.DFSClient: Exception in 
> createBlockOutputStream
> java.io.EOFException: Premature EOF: no length prefix available
> at 
> org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2280)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1318)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1237)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449){code}
> But on the DataNode side there's an ArrayOutOfBoundsException because there 
> aren't any targetStorageIds:
> {code}
> java.lang.ArrayIndexOutOfBoundsException: 0
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:815)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:173)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:107)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:290)
> at java.lang.Thread.run(Thread.java:745){code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12151) Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes

2017-07-27 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-12151:
-
Attachment: HDFS-12151.004.patch

I apologize for the noise, everyone, but since I can't reproduce the test failure 
locally, the next couple of patches are experimental and at least this one will 
still fail. There's another layer of swallowed exceptions before the test gets as 
far as the write, and I need to see where the exception is actually thrown...

> Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes
> 
>
> Key: HDFS-12151
> URL: https://issues.apache.org/jira/browse/HDFS-12151
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha4
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HDFS-12151.001.patch, HDFS-12151.002.patch, 
> HDFS-12151.003.patch, HDFS-12151.004.patch
>
>
> Trying to write to a Hadoop 3 DataNode with a Hadoop 2 client currently 
> fails. On the client side it looks like this:
> {code}
> 17/07/14 13:31:58 INFO hdfs.DFSClient: Exception in 
> createBlockOutputStream
> java.io.EOFException: Premature EOF: no length prefix available
> at 
> org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2280)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1318)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1237)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449){code}
> But on the DataNode side there's an ArrayOutOfBoundsException because there 
> aren't any targetStorageIds:
> {code}
> java.lang.ArrayIndexOutOfBoundsException: 0
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:815)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:173)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:107)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:290)
> at java.lang.Thread.run(Thread.java:745){code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11096) Support rolling upgrade between 2.x and 3.x

2017-07-27 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16103867#comment-16103867
 ] 

Sean Mackrory commented on HDFS-11096:
--

Update: a recent change broke this. I've posted a patch to fix it that's being 
reviewed and iterated on, and I've updated my rolling upgrade test scripts to 
confirm via the Job History Server that the jobs themselves were FINISHED and 
SUCCESSFUL.

I re-ran the test with an early patch and got a successful rolling upgrade with 
5-10 minute delays between each step, so the entire rolling upgrade of a 9-node 
(6 worker-node) cluster was spread out over 4 hours. I didn't encounter any other 
issues, EXCEPT that in my test workload I had to increase Terasort's output 
replication, because jobs would occasionally fail when they wrote to a node that 
was about to be taken down for upgrade. With that fixed, no other actual 
compatibility issues in Hadoop were found. I'll push the fixes out to GitHub soon...

> Support rolling upgrade between 2.x and 3.x
> ---
>
> Key: HDFS-11096
> URL: https://issues.apache.org/jira/browse/HDFS-11096
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Lei (Eddy) Xu
>Priority: Blocker
>
> trunk has a minimum software version of 3.0.0-alpha1. This means we can't 
> rolling upgrade between branch-2 and trunk.
> This is a showstopper for large deployments. Unless there are very compelling 
> reasons to break compatibility, let's restore the ability to rolling upgrade 
> to 3.x releases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12151) Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes

2017-07-27 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-12151:
-
Attachment: HDFS-12151.003.patch

Attaching a patch with the checkstyle issues fixed, and also logging a stack 
trace for exceptions that happen earlier than expected. I tried running the tests 
in parallel locally and didn't have a problem, but many other tests are failing 
because they think LOG fields are missing (in the code, they're not - working on 
it). I also had a clean Yetus run locally, so I may be missing some config or 
something.

I don't want to handle RuntimeExceptions differently, because it's an NPE that we 
receive in the case of the bug I'm fixing, and it's also an NPE that we receive 
after data has been sent to the server because I haven't mocked everything. So if 
we receive an NPE before data is sent to the server, I'd like to treat it the 
same as any other exception and fail because it happened too early.
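
To illustrate that policy, a minimal sketch (with made-up names throughout; the 
real test mocks DataXceiver far more heavily than this stub) could look like this:

{code}
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Sketch of the policy described above: any exception thrown before bytes reach
// the "server" fails the test; an exception thrown after data was sent is treated
// as the expected early abort of the heavily mocked writeBlock() call.
public class WriteBlockAbortPolicySketch {

  // Stand-in for the mocked writeBlock() invocation; the real test drives
  // DataXceiver with null-ish storage IDs instead of this stub.
  static void invokeWriteBlock(OutputStream toServer) throws IOException {
    toServer.write(new byte[]{1, 2, 3});           // pretend some bytes made it out
    throw new IOException("mock ran out of plumbing");
  }

  public static void main(String[] args) {
    ByteArrayOutputStream sentToServer = new ByteArrayOutputStream();
    try {
      invokeWriteBlock(sentToServer);
    } catch (Exception e) {
      if (sentToServer.size() == 0) {
        // Failed before any data was written: this is the regression being tested for.
        throw new AssertionError("writeBlock failed before sending any data", e);
      }
      // Otherwise the exception is just the mock aborting after the interesting part.
    }
    System.out.println("writeBlock got far enough: " + sentToServer.size() + " bytes sent");
  }
}
{code}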

> Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes
> 
>
> Key: HDFS-12151
> URL: https://issues.apache.org/jira/browse/HDFS-12151
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha4
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HDFS-12151.001.patch, HDFS-12151.002.patch, 
> HDFS-12151.003.patch
>
>
> Trying to write to a Hadoop 3 DataNode with a Hadoop 2 client currently 
> fails. On the client side it looks like this:
> {code}
> 17/07/14 13:31:58 INFO hdfs.DFSClient: Exception in 
> createBlockOutputStream
> java.io.EOFException: Premature EOF: no length prefix available
> at 
> org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2280)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1318)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1237)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449){code}
> But on the DataNode side there's an ArrayOutOfBoundsException because there 
> aren't any targetStorageIds:
> {code}
> java.lang.ArrayIndexOutOfBoundsException: 0
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:815)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:173)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:107)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:290)
> at java.lang.Thread.run(Thread.java:745){code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12151) Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes

2017-07-26 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-12151:
-
Attachment: HDFS-12151.002.patch

Attaching a test that ensures you can pass in null-ish values (at least the 
ones you get when you use a Hadoop 2 client) and get far enough to start writing 
data to the server. I just assume you'll get an exception after that point, 
because there's already a pretty ridiculous level of mocking going on, and it 
would take much more to get a complete writeBlock() run. So we abort as soon as 
there's an exception, and consider the test successful if we got far enough that 
data had actually been sent over the network. The test fails without my most 
recent change and succeeds with it, and the same should be true of HDFS-11956.

Patch 002 does not include any of the other changes being discussed (yet).

> Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes
> 
>
> Key: HDFS-12151
> URL: https://issues.apache.org/jira/browse/HDFS-12151
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha4
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HDFS-12151.001.patch, HDFS-12151.002.patch
>
>
> Trying to write to a Hadoop 3 DataNode with a Hadoop 2 client currently 
> fails. On the client side it looks like this:
> {code}
> 17/07/14 13:31:58 INFO hdfs.DFSClient: Exception in 
> createBlockOutputStream
> java.io.EOFException: Premature EOF: no length prefix available
> at 
> org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2280)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1318)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1237)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449){code}
> But on the DataNode side there's an ArrayOutOfBoundsException because there 
> aren't any targetStorageIds:
> {code}
> java.lang.ArrayIndexOutOfBoundsException: 0
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:815)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:173)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:107)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:290)
> at java.lang.Thread.run(Thread.java:745){code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12151) Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes

2017-07-17 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-12151:
-
Affects Version/s: 3.0.0-alpha4
 Target Version/s: 3.0.0-beta1
   Status: Patch Available  (was: Open)

> Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes
> 
>
> Key: HDFS-12151
> URL: https://issues.apache.org/jira/browse/HDFS-12151
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha4
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HDFS-12151.001.patch
>
>
> Trying to write to a Hadoop 3 DataNode with a Hadoop 2 client currently 
> fails. On the client side it looks like this:
> {code}
> 17/07/14 13:31:58 INFO hdfs.DFSClient: Exception in 
> createBlockOutputStream
> java.io.EOFException: Premature EOF: no length prefix available
> at 
> org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2280)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1318)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1237)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449){code}
> But on the DataNode side there's an ArrayOutOfBoundsException because there 
> aren't any targetStorageIds:
> {code}
> java.lang.ArrayIndexOutOfBoundsException: 0
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:815)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:173)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:107)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:290)
> at java.lang.Thread.run(Thread.java:745){code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12151) Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes

2017-07-17 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-12151:
-
Attachment: HDFS-12151.001.patch

Attaching a trivial patch. Yetus complains that this doesn't include new tests, 
but this is caught by the rolling upgrade test I've been working on under 
HDFS-11096.

> Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes
> 
>
> Key: HDFS-12151
> URL: https://issues.apache.org/jira/browse/HDFS-12151
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rolling upgrades
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HDFS-12151.001.patch
>
>
> Trying to write to a Hadoop 3 DataNode with a Hadoop 2 client currently 
> fails. On the client side it looks like this:
> {code}
> 17/07/14 13:31:58 INFO hdfs.DFSClient: Exception in 
> createBlockOutputStream
> java.io.EOFException: Premature EOF: no length prefix available
> at 
> org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2280)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1318)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1237)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449){code}
> But on the DataNode side there's an ArrayOutOfBoundsException because there 
> aren't any targetStorageIds:
> {code}
> java.lang.ArrayIndexOutOfBoundsException: 0
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:815)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:173)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:107)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:290)
> at java.lang.Thread.run(Thread.java:745){code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12151) Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes

2017-07-17 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090677#comment-16090677
 ] 

Sean Mackrory commented on HDFS-12151:
--

The problem is in lines added by HDFS-9807. When serving clients from before 
that feature, no storage IDs are provided, but we unconditionally index the first 
element of the array. I was able to do a successful rolling upgrade from HDFS 2 -> 
HDFS 3 just by checking the length first and passing in null by default.
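
The shape of the fix is roughly the following (a standalone sketch with 
hypothetical names, not the actual DataXceiver change - see the attached patches 
for that):

{code}
// Sketch of the defensive check described above: older clients send no target
// storage IDs, so only dereference the array when it is non-empty and fall back
// to null otherwise.
public class StorageIdFallbackSketch {
  static String firstStorageIdOrNull(String[] targetStorageIds) {
    return (targetStorageIds != null && targetStorageIds.length > 0)
        ? targetStorageIds[0]
        : null;   // pre-HDFS-9807 clients: behave as if no storage ID was requested
  }

  public static void main(String[] args) {
    System.out.println(firstStorageIdOrNull(new String[0]));           // null, no AIOOBE
    System.out.println(firstStorageIdOrNull(new String[]{"DS-1234"})); // DS-1234
  }
}
{code}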

> Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes
> 
>
> Key: HDFS-12151
> URL: https://issues.apache.org/jira/browse/HDFS-12151
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rolling upgrades
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
>
> Trying to write to a Hadoop 3 DataNode with a Hadoop 2 client currently 
> fails. On the client side it looks like this:
> {code}
> 17/07/14 13:31:58 INFO hdfs.DFSClient: Exception in 
> createBlockOutputStream
> java.io.EOFException: Premature EOF: no length prefix available
> at 
> org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2280)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1318)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1237)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449){code}
> But on the DataNode side there's an ArrayOutOfBoundsException because there 
> aren't any targetStorageIds:
> {code}
> java.lang.ArrayIndexOutOfBoundsException: 0
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:815)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:173)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:107)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:290)
> at java.lang.Thread.run(Thread.java:745){code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12151) Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes

2017-07-17 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-12151:
-
Description: 
Trying to write to a Hadoop 3 DataNode with a Hadoop 2 client currently fails. 
On the client side it looks like this:
{code}
17/07/14 13:31:58 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.io.EOFException: Premature EOF: no length prefix available
at 
org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2280)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1318)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1237)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449){code}

But on the DataNode side there's an ArrayOutOfBoundsException because there 
aren't any targetStorageIds:
{code}
java.lang.ArrayIndexOutOfBoundsException: 0
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:815)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:173)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:107)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:290)
at java.lang.Thread.run(Thread.java:745){code}

  was:
Trying to write to a Hadoop 3 DataNode with a Hadoop 2 client currently fails. 
On the client side it looks like this:
{code}
17/07/14 13:31:58 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.io.EOFException: Premature EOF: no length prefix available
at 
org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2280)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1318)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1237)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449){code}

But on the DataNode side there's an ArrayOutOfBoundsException because there 
aren't any targetStorageTypes:
{code}
java.lang.ArrayIndexOutOfBoundsException: 0
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:815)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:173)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:107)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:290)
at java.lang.Thread.run(Thread.java:745){code}


> Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes
> 
>
> Key: HDFS-12151
> URL: https://issues.apache.org/jira/browse/HDFS-12151
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rolling upgrades
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
>
> Trying to write to a Hadoop 3 DataNode with a Hadoop 2 client currently 
> fails. On the client side it looks like this:
> {code}
> 17/07/14 13:31:58 INFO hdfs.DFSClient: Exception in 
> createBlockOutputStream
> java.io.EOFException: Premature EOF: no length prefix available
> at 
> org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2280)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1318)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1237)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449){code}
> But on the DataNode side there's an ArrayOutOfBoundsException because there 
> aren't any targetStorageIds:
> {code}
> java.lang.ArrayIndexOutOfBoundsException: 0
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:815)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:173)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:107)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:290)
> at java.lang.Thread.run(Thread.java:745){code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11096) Support rolling upgrade between 2.x and 3.x

2017-07-17 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089926#comment-16089926
 ] 

Sean Mackrory commented on HDFS-11096:
--

Filed HDFS-12151 after looking into the logs more deeply than I have in a while 
and seeing that things fail once you start rolling the DataNodes. In a nutshell, 
Hadoop 2 clients can't write to Hadoop 3 DataNodes, so everything falls apart at 
that point. I've been making the mistake of assuming all-processes-exit-0 means 
everything is working, but even if a job fails completely the CLI still returns 0. 
I do believe I've checked for actual success before, so I believe this used to 
work and broke fairly recently, but I'm going to dig deeper.
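
For what it's worth, verifying success programmatically rather than trusting the 
CLI exit code can look roughly like this (a minimal sketch using the MapReduce 
Job API; the job setup is elided and the class name is made up - this is not the 
actual test script):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// Minimal sketch of checking job success programmatically instead of relying on
// the launching CLI's exit code.
public class JobSuccessCheckSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "example-job");
    // ... mapper/reducer/input/output configuration elided for the sketch ...
    boolean completed = job.waitForCompletion(true); // true only if the job succeeded
    if (!completed || !job.isSuccessful()) {
      System.exit(1); // make failure visible to whatever drives the upgrade test
    }
  }
}
{code}

Anything driving the rolling upgrade can then key off this driver's exit code, 
which actually reflects job success rather than merely job submission.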

> Support rolling upgrade between 2.x and 3.x
> ---
>
> Key: HDFS-11096
> URL: https://issues.apache.org/jira/browse/HDFS-11096
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Lei (Eddy) Xu
>Priority: Blocker
>
> trunk has a minimum software version of 3.0.0-alpha1. This means we can't 
> rolling upgrade between branch-2 and trunk.
> This is a showstopper for large deployments. Unless there are very compelling 
> reasons to break compatibility, let's restore the ability to rolling upgrade 
> to 3.x releases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-12151) Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes

2017-07-17 Thread Sean Mackrory (JIRA)
Sean Mackrory created HDFS-12151:


 Summary: Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes
 Key: HDFS-12151
 URL: https://issues.apache.org/jira/browse/HDFS-12151
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Sean Mackrory
Assignee: Sean Mackrory


Trying to write to a Hadoop 3 DataNode with a Hadoop 2 client currently fails. 
On the client side it looks like this:
{code}
17/07/14 13:31:58 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.io.EOFException: Premature EOF: no length prefix available
at 
org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2280)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1318)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1237)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449){code}

But on the DataNode side there's an ArrayOutOfBoundsException because there 
aren't any targetStorageTypes:
{code}
java.lang.ArrayIndexOutOfBoundsException: 0
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:815)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:173)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:107)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:290)
at java.lang.Thread.run(Thread.java:745){code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11096) Support rolling upgrade between 2.x and 3.x

2017-07-10 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081348#comment-16081348
 ] 

Sean Mackrory commented on HDFS-11096:
--

That's a good idea, [~rchiang]. Adding some delays between each step should 
be trivial and would dramatically increase the surface area of the test.

Also, the YARN-3583 issue was fortunately addressed by YARN-6143. I have marked 
the JIRAs as related.

> Support rolling upgrade between 2.x and 3.x
> ---
>
> Key: HDFS-11096
> URL: https://issues.apache.org/jira/browse/HDFS-11096
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Lei (Eddy) Xu
>Priority: Blocker
>
> trunk has a minimum software version of 3.0.0-alpha1. This means we can't 
> rolling upgrade between branch-2 and trunk.
> This is a showstopper for large deployments. Unless there are very compelling 
> reasons to break compatibility, let's restore the ability to rolling upgrade 
> to 3.x releases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11096) Support rolling upgrade between 2.x and 3.x

2017-07-10 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16080798#comment-16080798
 ] 

Sean Mackrory commented on HDFS-11096:
--

For anyone following this thread, I've come back to this and pushed some 
updates to the tests at https://github.com/mackrorysd/hadoop-compatibility:
* Fixed some idempotence / SSH automation problems that could have popped up 
before in the rolling upgrade test, and the test now actually validates the 
sorted data.
* [~eddyxu] wrote a little framework in Python for writing tests against mini 
HDFS clusters of two different versions, plus a test that you can cp between the 
two clusters. I did a bit of refactoring and added tests that check for similar 
output for most of the "hdfs dfs" commands currently in Hadoop 2.

> Support rolling upgrade between 2.x and 3.x
> ---
>
> Key: HDFS-11096
> URL: https://issues.apache.org/jira/browse/HDFS-11096
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Lei (Eddy) Xu
>Priority: Blocker
>
> trunk has a minimum software version of 3.0.0-alpha1. This means we can't 
> rolling upgrade between branch-2 and trunk.
> This is a showstopper for large deployments. Unless there are very compelling 
> reasons to break compatibility, let's restore the ability to rolling upgrade 
> to 3.x releases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10702) Add a Client API and Proxy Provider to enable stale read from Standby

2017-06-21 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-10702:
-
Status: Open  (was: Patch Available)

> Add a Client API and Proxy Provider to enable stale read from Standby
> -
>
> Key: HDFS-10702
> URL: https://issues.apache.org/jira/browse/HDFS-10702
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Jiayi Zhou
>Assignee: Sean Mackrory
>Priority: Minor
> Attachments: HDFS-10702.001.patch, HDFS-10702.002.patch, 
> HDFS-10702.003.patch, HDFS-10702.004.patch, HDFS-10702.005.patch, 
> HDFS-10702.006.patch, HDFS-10702.007.patch, HDFS-10702.008.patch, 
> StaleReadfromStandbyNN.pdf
>
>
> Currently, clients must always talk to the active NameNode when performing 
> any metadata operation, which means active NameNode could be a bottleneck for 
> scalability. One way to solve this problem is to send read-only operations to 
> Standby NameNode. The disadvantage is that it might be a stale read. 
> Here, I'm thinking of adding a Client API to enable/disable stale read from 
> Standby which gives Client the power to set the staleness restriction.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10702) Add a Client API and Proxy Provider to enable stale read from Standby

2017-06-21 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-10702:
-
Attachment: HDFS-10702.008.patch

Rebasing the patch on more recent changes. I had also discovered that one of the 
tests was failing: if you disable stale reads client-side and a failover happens 
before a write operation, you might still hit a standby NameNode and the server 
might still return a response. I did a quick fix by also setting the minimum txId 
to Long.MAX_VALUE. I'm not sure that's how I want to fix it: it means anyone would 
have to reset the txId when re-enabling stale reads (although in practice I doubt 
that's a problem, as long as they know they have to, and I doubt it will be a 
common use case).

I'm going to look at how best to address the potential getBlockLocations issue 
as well.
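
For illustration, the staleness guard described above might look roughly like 
this (all names here are invented for the sketch and are not the patch's API):

{code}
// Rough sketch of the client-side staleness guard described above. Disabling
// stale reads is modeled by raising the required transaction ID to
// Long.MAX_VALUE, which no standby can ever satisfy, forcing reads back to the
// active NameNode.
public class StaleReadGuardSketch {
  private long minimumTxId = 0L;           // lowest txId the client accepts from a standby

  void disableStaleReads() {
    minimumTxId = Long.MAX_VALUE;          // a standby can never be "fresh enough"
  }

  void enableStaleReads(long lastSeenTxId) {
    minimumTxId = lastSeenTxId;            // caller must reset this when re-enabling
  }

  boolean standbyIsFreshEnough(long standbyTxId) {
    return standbyTxId >= minimumTxId;
  }
}
{code}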

> Add a Client API and Proxy Provider to enable stale read from Standby
> -
>
> Key: HDFS-10702
> URL: https://issues.apache.org/jira/browse/HDFS-10702
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Jiayi Zhou
>Assignee: Sean Mackrory
>Priority: Minor
> Attachments: HDFS-10702.001.patch, HDFS-10702.002.patch, 
> HDFS-10702.003.patch, HDFS-10702.004.patch, HDFS-10702.005.patch, 
> HDFS-10702.006.patch, HDFS-10702.007.patch, HDFS-10702.008.patch, 
> StaleReadfromStandbyNN.pdf
>
>
> Currently, clients must always talk to the active NameNode when performing 
> any metadata operation, which means active NameNode could be a bottleneck for 
> scalability. One way to solve this problem is to send read-only operations to 
> Standby NameNode. The disadvantage is that it might be a stale read. 
> Here, I'm thinking of adding a Client API to enable/disable stale read from 
> Standby which gives Client the power to set the staleness restriction.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-10702) Add a Client API and Proxy Provider to enable stale read from Standby

2017-06-20 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory reassigned HDFS-10702:


Assignee: Sean Mackrory  (was: Jiayi Zhou)

> Add a Client API and Proxy Provider to enable stale read from Standby
> -
>
> Key: HDFS-10702
> URL: https://issues.apache.org/jira/browse/HDFS-10702
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Jiayi Zhou
>Assignee: Sean Mackrory
>Priority: Minor
> Attachments: HDFS-10702.001.patch, HDFS-10702.002.patch, 
> HDFS-10702.003.patch, HDFS-10702.004.patch, HDFS-10702.005.patch, 
> HDFS-10702.006.patch, HDFS-10702.007.patch, StaleReadfromStandbyNN.pdf
>
>
> Currently, clients must always talk to the active NameNode when performing 
> any metadata operation, which means active NameNode could be a bottleneck for 
> scalability. One way to solve this problem is to send read-only operations to 
> Standby NameNode. The disadvantage is that it might be a stale read. 
> Here, I'm thinking of adding a Client API to enable/disable stale read from 
> Standby which gives Client the power to set the staleness restriction.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11661) GetContentSummary uses excessive amounts of memory

2017-04-21 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979551#comment-15979551
 ] 

Sean Mackrory commented on HDFS-11661:
--

+1 to the revert - I too would still like to see the original problem fixed, 
but this is worse. Doing it correctly does indeed require global context, so it 
will take some cleverness to avoid using a lot of memory or holding the lock for 
a long time.

[~jojochuang] - to revert cleanly we can revert HDFS-11515 (unless I'm missing 
something and that patch does more than just correct the original changes in 
HDFS-10797) first and then HDFS-10797. As [~xiaochen] is not available right 
now, would you be able to commit the revert when we're satisfied? I'll run 
tests with the reverts committed locally...

> GetContentSummary uses excessive amounts of memory
> --
>
> Key: HDFS-11661
> URL: https://issues.apache.org/jira/browse/HDFS-11661
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0, 3.0.0-alpha2
>Reporter: Nathan Roberts
>Priority: Blocker
> Attachments: Heap growth.png
>
>
> ContentSummaryComputationContext::nodeIncluded() is being used to keep track 
> of all INodes visited during the current content summary calculation. This 
> can be all of the INodes in the filesystem, making for a VERY large hash 
> table. This simply won't work on large filesystems. 
> We noticed this after upgrading a namenode with ~100Million filesystem 
> objects was spending significantly more time in GC. Fortunately this system 
> had some memory breathing room, other clusters we have will not run with this 
> additional demand on memory.
> This was added as part of HDFS-10797 as a way of keeping track of INodes that 
> have already been accounted for - to avoid double counting.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11096) Support rolling upgrade between 2.x and 3.x

2017-01-10 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15816505#comment-15816505
 ] 

Sean Mackrory commented on HDFS-11096:
--

The getHdfsBlockLocations removal is documented in HDFS-8895 - apparently that 
method had already been deprecated. I filed HDFS-11312 and posted a patch for the 
nonDfsUsed discrepancy.


> Support rolling upgrade between 2.x and 3.x
> ---
>
> Key: HDFS-11096
> URL: https://issues.apache.org/jira/browse/HDFS-11096
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Priority: Blocker
>
> trunk has a minimum software version of 3.0.0-alpha1. This means we can't 
> rolling upgrade between branch-2 and trunk.
> This is a showstopper for large deployments. Unless there are very compelling 
> reasons to break compatibility, let's restore the ability to rolling upgrade 
> to 3.x releases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11312) Discrepancy in nonDfsUsed index in protobuf

2017-01-10 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HDFS-11312:
-
Attachment: HDFS-11312.001.patch

> Discrepancy in nonDfsUsed index in protobuf
> ---
>
> Key: HDFS-11312
> URL: https://issues.apache.org/jira/browse/HDFS-11312
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
>Priority: Minor
> Attachments: HDFS-11312.001.patch
>
>
> The patches for HDFS-9038 had a discrepancy between trunk and branch-2.7: in 
> one message type, nonDfsUsed is given 2 different indices. This is a minor 
> wire incompatibility that is easy to fix...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-11312) Discrepancy in nonDfsUsed index in protobuf

2017-01-10 Thread Sean Mackrory (JIRA)
Sean Mackrory created HDFS-11312:


 Summary: Discrepancy in nonDfsUsed index in protobuf
 Key: HDFS-11312
 URL: https://issues.apache.org/jira/browse/HDFS-11312
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Sean Mackrory
Assignee: Sean Mackrory
Priority: Minor


The patches for HDFS-9038 had a discrepancy between trunk and branch-2.7: in 
one message type, nonDfsUsed is given 2 different indices. This is a minor wire 
incompatibility that is easy to fix...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org


