[jira] [Commented] (HDFS-14267) Add test_libhdfs_ops to libhdfs tests, mark libhdfs_read/write.c as examples
[ https://issues.apache.org/jira/browse/HDFS-14267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16769481#comment-16769481 ]

Sean Mackrory commented on HDFS-14267:
--------------------------------------
+1, LGTM. I need to run tests before I commit and there's some issue with my protobuf dependency I need to work through...

> Add test_libhdfs_ops to libhdfs tests, mark libhdfs_read/write.c as examples
> ----------------------------------------------------------------------------
> Key: HDFS-14267
> URL: https://issues.apache.org/jira/browse/HDFS-14267
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: libhdfs, native, test
> Reporter: Sahil Takiar
> Priority: Major
> Attachments: HDFS-14267.001.patch, HDFS-14267.002.patch
>
> {{test_libhdfs_ops.c}} provides test coverage for basic operations against
> libhdfs, but currently has to be run manually (e.g. {{mvn install}} does not
> run these tests). The goal of this patch is to add {{test_libhdfs_ops.c}} to
> the list of tests that are automatically run for libhdfs.
> It looks like {{test_libhdfs_ops.c}} was used in conjunction with
> {{hadoop-hdfs-project/hadoop-hdfs/src/main/native/tests/test-libhdfs.sh}} to
> run some tests against a mini DFS cluster. Now that the
> {{NativeMiniDfsCluster}} exists, it makes more sense to use that rather than
> rely on an external bash script to start a mini DFS cluster.
> The {{libhdfs-tests}} directory (which contains {{test_libhdfs_ops.c}})
> contains two other files: {{test_libhdfs_read.c}} and
> {{test_libhdfs_write.c}}. At some point, these files might have been used in
> conjunction with {{test-libhdfs.sh}} to run some tests manually. However,
> they (1) largely overlap with the test coverage provided by
> {{test_libhdfs_ops.c}} and (2) are not designed to be run as unit tests. Thus
> I suggest we move these two files into a new folder called
> {{libhdfs-examples}} and use them to further document how users of libhdfs
> can use the API.
We can move {{test-libhdfs.sh}} into the examples folder as well, given that the example files probably require the script to actually work.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14101) Random failure of testListCorruptFilesCorruptedBlock
[ https://issues.apache.org/jira/browse/HDFS-14101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Mackrory updated HDFS-14101:
---------------------------------
    Resolution: Fixed
    Fix Version/s: 3.3.0
    Status: Resolved (was: Patch Available)

> Random failure of testListCorruptFilesCorruptedBlock
> ----------------------------------------------------
> Key: HDFS-14101
> URL: https://issues.apache.org/jira/browse/HDFS-14101
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: test
> Affects Versions: 3.2.0, 3.0.3, 2.8.5
> Reporter: Kihwal Lee
> Assignee: Zsolt Venczel
> Priority: Major
> Labels: newbie
> Fix For: 3.3.0
> Attachments: HDFS-14101.01.patch, HDFS-14101.02.patch
>
> We've seen this occasionally.
> {noformat}
> java.lang.IllegalArgumentException: Negative position
>   at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:755)
>   at org.apache.hadoop.hdfs.server.namenode.TestListCorruptFileBlocks.testListCorruptFilesCorruptedBlock(TestListCorruptFileBlocks.java:105)
> {noformat}
> The test has a flaw.
[jira] [Commented] (HDFS-14101) Random failure of testListCorruptFilesCorruptedBlock
[ https://issues.apache.org/jira/browse/HDFS-14101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16714982#comment-16714982 ]

Sean Mackrory commented on HDFS-14101:
--------------------------------------
+1, thanks Zsolt. I think the shared variable is an improvement, but I still think it's worth a comment, so I'm going to add the following at the declaration of corruptionLength if no one objects:
{code}
// Files are corrupted with 2 bytes before the end of the file,
// so that's the minimum length
{code}
Of course, as I understand it, there's still a 1:65,536 chance the corruption isn't detected because the random bytes we overwrite with are identical to those that were there originally, but this is a step in the right direction :)
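As an aside on the 1:65,536 point above: overwriting 2 bytes with random data leaves a 1/2^16 chance of writing back the original bytes, whereas inverting the bytes guarantees a change. The following is only a hedged sketch of that idea (the class and method names are hypothetical and not part of any patch on this issue):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CorruptTail {
    // Hypothetical helper: corrupt the last `len` bytes of a file by
    // inverting them. Unlike a random overwrite (which silently collides
    // with the original bytes with probability 1/2^(8*len)), a bit-flip
    // always produces a different value, so detection is deterministic.
    static void corruptTail(Path file, int len) throws IOException {
        byte[] data = Files.readAllBytes(file);
        if (data.length < len) {
            throw new IllegalArgumentException(
                "file is shorter than the corruption length");
        }
        for (int i = data.length - len; i < data.length; i++) {
            data[i] = (byte) ~data[i];  // bit-flip guarantees a change
        }
        Files.write(file, data);
    }
}
```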
[jira] [Updated] (HDFS-13744) OIV tool should better handle control characters present in file or directory names
[ https://issues.apache.org/jira/browse/HDFS-13744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Mackrory updated HDFS-13744:
---------------------------------
    Resolution: Fixed
    Fix Version/s: 3.2.0
    Status: Resolved (was: Patch Available)

> OIV tool should better handle control characters present in file or directory names
> -----------------------------------------------------------------------------------
> Key: HDFS-13744
> URL: https://issues.apache.org/jira/browse/HDFS-13744
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs, tools
> Affects Versions: 2.6.5, 2.9.1, 2.8.4, 2.7.6, 3.0.3
> Reporter: Zsolt Venczel
> Assignee: Zsolt Venczel
> Priority: Critical
> Fix For: 3.2.0
> Attachments: HDFS-13744.01.patch, HDFS-13744.02.patch, HDFS-13744.03.patch
>
> In certain cases, when control characters or white space are present in file or
> directory names, OIV tool processors can export data in a misleading format.
> In the examples below we have EXAMPLE_NAME as both a file and a directory name,
> where the directory has a line feed character at the end (the actual
> production case has multiple line feeds and multiple spaces).
> * Delimited processor case:
> ** misleading example:
> {code:java}
> /user/data/EXAMPLE_NAME
> ,0,2017-04-24 04:34,1969-12-31 16:00,0,0,0,-1,-1,drwxrwxr-x+,user,group
> /user/data/EXAMPLE_NAME,2016-08-26 03:00,2017-05-16 10:05,134217728,1,520,0,0,-rw-rwxr--+,user,group
> {code}
> ** expected example as suggested by [https://tools.ietf.org/html/rfc4180#section-2]:
> {code:java}
> "/user/data/EXAMPLE_NAME%x0A",0,2017-04-24 04:34,1969-12-31 16:00,0,0,0,-1,-1,drwxrwxr-x+,user,group
> "/user/data/EXAMPLE_NAME",2016-08-26 03:00,2017-05-16 10:05,134217728,1,520,0,0,-rw-rwxr--+,user,group
> {code}
> * XML processor case:
> ** misleading example:
> {code:java}
> 479867791DIRECTORYEXAMPLE_NAME
> 1493033668294user:group:0775
> 113632535FILEEXAMPLE_NAME314722056575041494954320141134217728user:group:0674
> {code}
> ** expected example as specified in [https://www.w3.org/TR/REC-xml/#sec-line-ends]:
> {code:java}
> 479867791DIRECTORYEXAMPLE_NAME#xA1493033668294user:group:0775
> 113632535FILEEXAMPLE_NAME314722056575041494954320141134217728user:group:0674
> {code}
> * JSON: the OIV Web Processor behaves correctly and produces the following:
> {code:java}
> {
>   "FileStatuses": {
>     "FileStatus": [
>       {
>         "fileId": 113632535,
>         "accessTime": 1494954320141,
>         "replication": 3,
>         "owner": "user",
>         "length": 520,
>         "permission": "674",
>         "blockSize": 134217728,
>         "modificationTime": 1472205657504,
>         "type": "FILE",
>         "group": "group",
>         "childrenNum": 0,
>         "pathSuffix": "EXAMPLE_NAME"
>       },
>       {
>         "fileId": 479867791,
>         "accessTime": 0,
>         "replication": 0,
>         "owner": "user",
>         "length": 0,
>         "permission": "775",
>         "blockSize": 0,
>         "modificationTime": 1493033668294,
>         "type": "DIRECTORY",
>         "group": "group",
>         "childrenNum": 0,
>         "pathSuffix": "EXAMPLE_NAME\n"
>       }
>     ]
>   }
> }
> {code}
[jira] [Commented] (HDFS-13744) OIV tool should better handle control characters present in file or directory names
[ https://issues.apache.org/jira/browse/HDFS-13744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16607503#comment-16607503 ]

Sean Mackrory commented on HDFS-13744:
--------------------------------------
I concur. Committed.
[jira] [Commented] (HDFS-13744) OIV tool should better handle control characters present in file or directory names
[ https://issues.apache.org/jira/browse/HDFS-13744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16603615#comment-16603615 ]

Sean Mackrory commented on HDFS-13744:
--------------------------------------
Looks good to me, except CR/LF was being escaped as LF, so I attached .003 with a trivial change that escapes both characters. If it's cool with you, I'll commit that version.
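The delimited-processor escaping discussed in this issue (escaping both CR and LF, using the `%x0A`-style notation from the RFC 4180-inspired example in the description) could be sketched roughly as follows. This is a hedged illustration only; the class and method names are hypothetical, not the actual patch code:

```java
public class PathEscaper {
    // Hypothetical sketch: replace CR and LF in a path with visible
    // escape sequences (mirroring the %x0A notation in the issue's
    // example output) so a delimited listing stays one record per line.
    static String escapeControlChars(String path) {
        StringBuilder sb = new StringBuilder(path.length());
        for (char c : path.toCharArray()) {
            switch (c) {
                case '\r': sb.append("%x0D"); break;  // carriage return
                case '\n': sb.append("%x0A"); break;  // line feed
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }
}
```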
[jira] [Updated] (HDFS-13744) OIV tool should better handle control characters present in file or directory names
[ https://issues.apache.org/jira/browse/HDFS-13744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Mackrory updated HDFS-13744:
---------------------------------
    Attachment: HDFS-13744.03.patch
[jira] [Commented] (HDFS-13744) OIV tool should better handle control characters present in file or directory names
[ https://issues.apache.org/jira/browse/HDFS-13744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588981#comment-16588981 ]

Sean Mackrory commented on HDFS-13744:
--------------------------------------
We should probably also handle StringUtils.CR. Hadoop is sometimes used from Windows clients too.

I'm a little bit torn about not escaping the XML. If someone is embedding control characters in filenames, even if that is technically allowed and there are standards specifying how that is to be encoded / decoded, I think it's likely to cause problems, and I would want those characters to show up obviously in a report. I suspect there's a good chance that those characters are the reason someone is trying to inspect the image in the first place :) But I also don't want to cause practical problems in XML parsers. I can see an argument either way - like I said, I'm a bit torn and want to think about it...
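For the XML side of the debate above, the [https://www.w3.org/TR/REC-xml/#sec-line-ends] approach referenced in the description amounts to emitting numeric character references such as `&#xA;` for LF and `&#xD;` for CR, alongside the usual markup escapes. A hedged sketch (hypothetical names, not the patch itself):

```java
public class XmlNameEscaper {
    // Hypothetical sketch: escape XML-significant characters and
    // represent LF/CR as numeric character references so a name ending
    // in a line feed survives an XML round-trip instead of being
    // normalized away. (XML 1.0 forbids most other control characters
    // even as references, so only CR/LF are handled here.)
    static String escapeXmlText(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (char c : s.toCharArray()) {
            switch (c) {
                case '&':  sb.append("&amp;"); break;
                case '<':  sb.append("&lt;");  break;
                case '>':  sb.append("&gt;");  break;
                case '\n': sb.append("&#xA;"); break;  // line feed
                case '\r': sb.append("&#xD;"); break;  // carriage return
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }
}
```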
[jira] [Commented] (HDFS-13486) Backport HDFS-11817 (A faulty node can cause a lease leak and NPE on accessing data) to branch-2.7
[ https://issues.apache.org/jira/browse/HDFS-13486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16551376#comment-16551376 ]

Sean Mackrory commented on HDFS-13486:
--------------------------------------
This can cause a similar failure to HDFS-7524 if you backport this without backporting HDFS-12299.

> Backport HDFS-11817 (A faulty node can cause a lease leak and NPE on accessing data) to branch-2.7
> --------------------------------------------------------------------------------------------------
> Key: HDFS-13486
> URL: https://issues.apache.org/jira/browse/HDFS-13486
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Wei-Chiu Chuang
> Assignee: Wei-Chiu Chuang
> Priority: Major
> Fix For: 2.7.7
> Attachments: HDFS-11817.branch-2.7.001.patch, HDFS-11817.branch-2.7.002.patch
>
> HDFS-11817 is a good fix to have in branch-2.7.
> I'm taking a stab at it now.
[jira] [Updated] (HDFS-13582) Improve backward compatibility for HDFS-13176 (WebHdfs file path gets truncated when having semicolon (;) inside)
[ https://issues.apache.org/jira/browse/HDFS-13582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Mackrory updated HDFS-13582:
---------------------------------
    Resolution: Fixed
    Status: Resolved (was: Patch Available)

Committed your branch-2 patch. Thanks, [~zvenczel]!

> Improve backward compatibility for HDFS-13176 (WebHdfs file path gets truncated when having semicolon (;) inside)
> ------------------------------------------------------------------------------------------------------------------
> Key: HDFS-13582
> URL: https://issues.apache.org/jira/browse/HDFS-13582
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 3.0.0
> Reporter: Zsolt Venczel
> Assignee: Zsolt Venczel
> Priority: Major
> Fix For: 3.2.0
> Attachments: HDFS-13582-branch-2.01.patch, HDFS-13582.01.patch, HDFS-13582.02.patch
>
> Encode special characters only if necessary in order to improve backward
> compatibility in the following scenario:
> new (having HDFS-13176) WebHdfs client -> old (not having HDFS-13176) WebHdfs server
[jira] [Commented] (HDFS-13582) Improve backward compatibility for HDFS-13176 (WebHdfs file path gets truncated when having semicolon (;) inside)
[ https://issues.apache.org/jira/browse/HDFS-13582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497543#comment-16497543 ]

Sean Mackrory commented on HDFS-13582:
--------------------------------------
Integration failures appear to be protoc version related again. I've pushed this fix to branch-3.1, since the original fix that introduced the incompatibility was also there. We also put that original commit in branch-2. [~zvenczel] - do you wanna post a version of this patch for branch-2 as well?
[jira] [Commented] (HDFS-13176) WebHdfs file path gets truncated when having semicolon (;) inside
[ https://issues.apache.org/jira/browse/HDFS-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497542#comment-16497542 ]

Sean Mackrory commented on HDFS-13176:
--------------------------------------
Ok - I've gone ahead and pushed the followup fix to branch-3.1.

> WebHdfs file path gets truncated when having semicolon (;) inside
> -----------------------------------------------------------------
> Key: HDFS-13176
> URL: https://issues.apache.org/jira/browse/HDFS-13176
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: webhdfs
> Affects Versions: 3.0.0
> Reporter: Zsolt Venczel
> Assignee: Zsolt Venczel
> Priority: Major
> Fix For: 2.10.0, 3.2.0
> Attachments: HDFS-13176-branch-2.01.patch, HDFS-13176-branch-2.03.patch,
> HDFS-13176-branch-2.03.patch, HDFS-13176-branch-2.03.patch,
> HDFS-13176-branch-2.03.patch, HDFS-13176-branch-2.03.patch,
> HDFS-13176-branch-2.04.patch, HDFS-13176-branch-2_yetus.log,
> HDFS-13176.01.patch, HDFS-13176.02.patch,
> TestWebHdfsUrl.testWebHdfsSpecialCharacterFile.patch
>
> Find attached a patch having a test case that tries to reproduce the problem.
[jira] [Commented] (HDFS-13176) WebHdfs file path gets truncated when having semicolon (;) inside
[ https://issues.apache.org/jira/browse/HDFS-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496640#comment-16496640 ]

Sean Mackrory commented on HDFS-13176:
--------------------------------------
Note that the above failure is actually in reference to HDFS-13582 and not HDFS-13176 (this one). The failure appears to be because our protoc versions are messed up again.

[~wangda] - I owe you an apology. I looked in the git log to confirm where else I needed to put the HDFS-13582 patch that addresses this incompatibility. It appears that I committed this to branch-3.1 after code freeze (and I don't know why I did - but it definitely looks like I was the one that did it) before the 3.1.0 release. Would you prefer me to commit the HDFS-13582 fix to branch-3.1 for 3.1.0, or simply revert this HDFS-13176 patch that should never have been included in the first place? Let me know and I'll gladly take care of it. Again - my apologies!
[jira] [Commented] (HDFS-13121) NPE when request file descriptors when SC read
[ https://issues.apache.org/jira/browse/HDFS-13121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16490768#comment-16490768 ]

Sean Mackrory commented on HDFS-13121:
--------------------------------------
I had an offline conversation with [~zvenczel] about the lack of tests - testing this requires a pretty unreasonable level of refactoring and / or introducing new dependencies to do the mocking. One piece of feedback, though, is that I'd like to see a more helpful error message. The stack trace could show them that one of those fields was null, so if the text we pass to the exception could include "This is often because Hadoop has exceeded the allowed number of open file descriptors" or something like that that hints at the likely root cause and possible solution, that would be good.

> NPE when request file descriptors when SC read
> ----------------------------------------------
> Key: HDFS-13121
> URL: https://issues.apache.org/jira/browse/HDFS-13121
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs-client
> Affects Versions: 3.0.0
> Reporter: Gang Xie
> Assignee: Zsolt Venczel
> Priority: Minor
> Attachments: HDFS-13121.01.patch
>
> Recently, we hit an issue where the DFSClient throws an NPE. The case is that
> the app process exceeds the limit on open files. In that case, libhadoop never
> throws an exception but returns null for the requested fds. But
> requestFileDescriptors uses the returned fds directly without any check, and
> then NPEs.
>
> We need to add a null-pointer sanity check here:
>
> {code:java}
> private ShortCircuitReplicaInfo requestFileDescriptors(DomainPeer peer,
>     Slot slot) throws IOException {
>   ShortCircuitCache cache = clientContext.getShortCircuitCache();
>   final DataOutputStream out =
>       new DataOutputStream(new BufferedOutputStream(peer.getOutputStream()));
>   SlotId slotId = slot == null ? null : slot.getSlotId();
>   new Sender(out).requestShortCircuitFds(block, token, slotId, 1,
>       failureInjector.getSupportsReceiptVerification());
>   DataInputStream in = new DataInputStream(peer.getInputStream());
>   BlockOpResponseProto resp = BlockOpResponseProto.parseFrom(
>       PBHelperClient.vintPrefixed(in));
>   DomainSocket sock = peer.getDomainSocket();
>   failureInjector.injectRequestFileDescriptorsFailure();
>   switch (resp.getStatus()) {
>   case SUCCESS:
>     byte buf[] = new byte[1];
>     FileInputStream[] fis = new FileInputStream[2];
>     sock.recvFileInputStreams(fis, buf, 0, buf.length);  // fis may hold nulls
>     ShortCircuitReplica replica = null;
>     try {
>       ExtendedBlockId key =
>           new ExtendedBlockId(block.getBlockId(), block.getBlockPoolId());
>       if (buf[0] == USE_RECEIPT_VERIFICATION.getNumber()) {
>         LOG.trace("Sending receipt verification byte for slot {}", slot);
>         sock.getOutputStream().write(0);
>       }
>       replica = new ShortCircuitReplica(key, fis[0], fis[1], cache,  // NPE here
>           Time.monotonicNow(), slot);
> {code}
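The null check plus friendlier error message requested in the comment above could look roughly like this. It is a hedged sketch only; the class and method names are hypothetical, and the message text simply echoes the wording suggested in the comment:

```java
import java.io.FileInputStream;

public class FdSanityCheck {
    // Hypothetical sketch: when the native layer runs out of file
    // descriptors it may hand back null streams instead of throwing,
    // so fail fast with a message that hints at the likely root cause
    // rather than letting a bare NPE surface later.
    static void validateFds(FileInputStream[] fis) {
        for (FileInputStream stream : fis) {
            if (stream == null) {
                throw new IllegalStateException(
                    "Null file descriptor returned from native code. This is "
                    + "often because Hadoop has exceeded the allowed number "
                    + "of open file descriptors.");
            }
        }
    }
}
```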
[jira] [Commented] (HDFS-13582) Improve backward compatibility for HDFS-13176 (WebHdfs file path gets truncated when having semicolon (;) inside)
[ https://issues.apache.org/jira/browse/HDFS-13582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16490717#comment-16490717 ] Sean Mackrory commented on HDFS-13582: -- Thanks for fixing that checkstyle issue. I think this is the right approach for what is kind of a messy situation: essentially we only modify paths that would have failed before, and use the previous behavior for all other paths, so any incompatibility between versions is only modifying how an existing problem manifests. +1 - will commit shortly. > Improve backward compatibility for HDFS-13176 (WebHdfs file path gets > truncated when having semicolon (;) inside) > - > > Key: HDFS-13582 > URL: https://issues.apache.org/jira/browse/HDFS-13582 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Zsolt Venczel >Assignee: Zsolt Venczel >Priority: Major > Fix For: 3.2.0 > > Attachments: HDFS-13582.01.patch, HDFS-13582.02.patch > > > Encode special character only if necessary in order to improve backward > compatibility in the following scenario: > new (having HDFS-13176) WebHdfs client - > old (not having HDFS-13176) > WebHdfs server -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
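The backward-compatibility approach described above (only modify paths that would have failed before, keep the previous behavior for everything else) can be sketched in plain Java. The method name and the restriction to ';' are illustrative assumptions, not the actual HDFS-13582 code:

```java
public class SpecialCharEncoder {
    // Sketch of "encode special characters only if necessary": paths without
    // the problematic character are passed through byte-for-byte, so an old
    // (pre-HDFS-13176) WebHdfs server sees exactly what it always saw; only
    // paths that would previously have been truncated are percent-encoded.
    static String encodePathComponent(String component) {
        if (component.indexOf(';') < 0) {
            return component;                 // previous behavior, unchanged
        }
        return component.replace(";", "%3B"); // only previously-broken paths change
    }

    public static void main(String[] args) {
        System.out.println(encodePathComponent("/tmp/plain.txt"));
        System.out.println(encodePathComponent("/tmp/a;b.txt"));
    }
}
```

This is why the incompatibility window is so small: any difference between client versions is confined to paths that already failed.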
[jira] [Updated] (HDFS-13176) WebHdfs file path gets truncated when having semicolon (;) inside
[ https://issues.apache.org/jira/browse/HDFS-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HDFS-13176: - Resolution: Fixed Fix Version/s: 3.2.0 2.10.0 Status: Resolved (was: Patch Available) > WebHdfs file path gets truncated when having semicolon (;) inside > - > > Key: HDFS-13176 > URL: https://issues.apache.org/jira/browse/HDFS-13176 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 3.0.0 >Reporter: Zsolt Venczel >Assignee: Zsolt Venczel >Priority: Major > Fix For: 2.10.0, 3.2.0 > > Attachments: HDFS-13176-branch-2.01.patch, > HDFS-13176-branch-2.03.patch, HDFS-13176-branch-2.03.patch, > HDFS-13176-branch-2.03.patch, HDFS-13176-branch-2.03.patch, > HDFS-13176-branch-2.03.patch, HDFS-13176-branch-2.04.patch, > HDFS-13176-branch-2_yetus.log, HDFS-13176.01.patch, HDFS-13176.02.patch, > TestWebHdfsUrl.testWebHdfsSpecialCharacterFile.patch > > > Find attached a patch having a test case that tries to reproduce the problem. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13176) WebHdfs file path gets truncated when having semicolon (;) inside
[ https://issues.apache.org/jira/browse/HDFS-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428344#comment-16428344 ] Sean Mackrory commented on HDFS-13176: -- +1 and committed to branch-2. Also ran Yetus, etc. myself. > WebHdfs file path gets truncated when having semicolon (;) inside > - > > Key: HDFS-13176 > URL: https://issues.apache.org/jira/browse/HDFS-13176 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 3.0.0 >Reporter: Zsolt Venczel >Assignee: Zsolt Venczel >Priority: Major > Attachments: HDFS-13176-branch-2.01.patch, > HDFS-13176-branch-2.03.patch, HDFS-13176-branch-2.03.patch, > HDFS-13176-branch-2.03.patch, HDFS-13176-branch-2.03.patch, > HDFS-13176-branch-2.03.patch, HDFS-13176-branch-2.04.patch, > HDFS-13176-branch-2_yetus.log, HDFS-13176.01.patch, HDFS-13176.02.patch, > TestWebHdfsUrl.testWebHdfsSpecialCharacterFile.patch > > > Find attached a patch having a test case that tries to reproduce the problem. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13176) WebHdfs file path gets truncated when having semicolon (;) inside
[ https://issues.apache.org/jira/browse/HDFS-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390173#comment-16390173 ] Sean Mackrory commented on HDFS-13176: -- I've pushed this to trunk. This would be good in branch-2 as well. It has a fairly trivial conflict though; if you want to post a patch against branch-2 I can commit it - I'm about to run out for an errand. > WebHdfs file path gets truncated when having semicolon (;) inside > - > > Key: HDFS-13176 > URL: https://issues.apache.org/jira/browse/HDFS-13176 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 3.0.0 >Reporter: Zsolt Venczel >Assignee: Zsolt Venczel >Priority: Major > Attachments: HDFS-13176.01.patch, HDFS-13176.02.patch, > TestWebHdfsUrl.testWebHdfsSpecialCharacterFile.patch > > > Find attached a patch having a test case that tries to reproduce the problem. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13176) WebHdfs file path gets truncated when having semicolon (;) inside
[ https://issues.apache.org/jira/browse/HDFS-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16388560#comment-16388560 ] Sean Mackrory commented on HDFS-13176: -- In light of other ascii / unicode characters being legal, I added everything I could from a standard keyboard to the test, still passes. Sound good to you, [~zvenczel]? > WebHdfs file path gets truncated when having semicolon (;) inside > - > > Key: HDFS-13176 > URL: https://issues.apache.org/jira/browse/HDFS-13176 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 3.0.0 >Reporter: Zsolt Venczel >Assignee: Zsolt Venczel >Priority: Major > Attachments: HDFS-13176.01.patch, HDFS-13176.02.patch, > TestWebHdfsUrl.testWebHdfsSpecialCharacterFile.patch > > > Find attached a patch having a test case that tries to reproduce the problem. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
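A test filename covering "everything from a standard keyboard" minus the two documented exclusions could be built roughly like this; a hedged sketch, since the actual string used in the patch may differ:

```java
public class SpecialCharName {
    // Hypothetical sketch: every printable, non-space ASCII character except
    // '/' (the path separator) and ':' (documented as illegal in Hadoop
    // path names), concatenated into a single test filename.
    static String specialName() {
        StringBuilder sb = new StringBuilder();
        for (char c = '!'; c <= '~'; c++) {
            if (c != '/' && c != ':') {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(specialName());
    }
}
```

Generating the name programmatically like this makes it harder to accidentally drop a character when the legal set is revisited later.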
[jira] [Updated] (HDFS-13176) WebHdfs file path gets truncated when having semicolon (;) inside
[ https://issues.apache.org/jira/browse/HDFS-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HDFS-13176: - Attachment: HDFS-13176.02.patch > WebHdfs file path gets truncated when having semicolon (;) inside > - > > Key: HDFS-13176 > URL: https://issues.apache.org/jira/browse/HDFS-13176 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 3.0.0 >Reporter: Zsolt Venczel >Assignee: Zsolt Venczel >Priority: Major > Attachments: HDFS-13176.01.patch, HDFS-13176.02.patch, > TestWebHdfsUrl.testWebHdfsSpecialCharacterFile.patch > > > Find attached a patch having a test case that tries to reproduce the problem. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-13176) WebHdfs file path gets truncated when having semicolon (;) inside
[ https://issues.apache.org/jira/browse/HDFS-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16388339#comment-16388339 ] Sean Mackrory edited comment on HDFS-13176 at 3/6/18 9:13 PM: -- Thanks to [~anu] for digging up the appropriate documentation of legal paths - I had missed it looking at HDFS-specific stuff, but the documentation is Common-wide, which makes sense. The gist is that ':' is indeed illegal, but all of the other characters you're testing and the others I mentioned (that I didn't want to support) are supposed to work. +1 to your fix - I'll commit it but will give another day in case the folks you tagged want to chime in. We may as well add other characters that are supposed to work to verify they do and help keep it that way, too. edit: for context, this is what Anu linked to on the mailing list thread: http://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-common/filesystem/introduction.html#Path_Names 
> WebHdfs file path gets truncated when having semicolon (;) inside > - > > Key: HDFS-13176 > URL: https://issues.apache.org/jira/browse/HDFS-13176 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 3.0.0 >Reporter: Zsolt Venczel >Assignee: Zsolt Venczel >Priority: Major > Attachments: HDFS-13176.01.patch, > TestWebHdfsUrl.testWebHdfsSpecialCharacterFile.patch > > > Find attached a patch having a test case that tries to reproduce the problem. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13176) WebHdfs file path gets truncated when having semicolon (;) inside
[ https://issues.apache.org/jira/browse/HDFS-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16388339#comment-16388339 ] Sean Mackrory commented on HDFS-13176: -- Thanks to [~anu] for digging up the appropriate documentation of legal paths - I had missed it looking at HDFS-specific stuff, but the documentation is Common-wide, which makes sense. The gist is that ':' is indeed illegal, but all of the other characters you're testing and the others I mentioned (that I didn't want to support) are supposed to work. +1 to your fix - I'll commit it but will give another day in case the folks you tagged want to chime in. We may as well add other characters that are supposed to work to verify they do and help keep it that way, too. > WebHdfs file path gets truncated when having semicolon (;) inside > - > > Key: HDFS-13176 > URL: https://issues.apache.org/jira/browse/HDFS-13176 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 3.0.0 >Reporter: Zsolt Venczel >Assignee: Zsolt Venczel >Priority: Major > Attachments: HDFS-13176.01.patch, > TestWebHdfsUrl.testWebHdfsSpecialCharacterFile.patch > > > Find attached a patch having a test case that tries to reproduce the problem. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-13176) WebHdfs file path gets truncated when having semicolon (;) inside
[ https://issues.apache.org/jira/browse/HDFS-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386917#comment-16386917 ] Sean Mackrory edited comment on HDFS-13176 at 3/5/18 11:15 PM: --- At first glance, I love that this is addressing URL-specific weirdness in WebHDFS specifically and isn't modifying Path for the whole world. I'll do a deeper code review and make sure I agree with everything, but I strongly suspect this is indeed the right way to solve this. One of the outcomes of this Jira I'd love to see is that we finally clarify which paths are legal. ?, %, and \ make me nervous for the potential to be misinterpreted by related tools or future classes because of their role in URLs and Windows paths (which is exactly why : wouldn't work). There are a few other characters that I would personally avoid because of their role in scripts (I would have similar concerns about ~, *, `, @, !, $), but I'd feel more comfortable agreeing to support the following subset of what you're currently testing: \"()[]_-=&+;{}#. Anyone else have thoughts along these lines? > WebHdfs file path gets truncated when having semicolon (;) inside > - > > Key: HDFS-13176 > URL: https://issues.apache.org/jira/browse/HDFS-13176 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 3.0.0 >Reporter: Zsolt Venczel >Assignee: Zsolt Venczel >Priority: Major > Attachments: HDFS-13176.01.patch, > TestWebHdfsUrl.testWebHdfsSpecialCharacterFile.patch > > > Find attached a patch having a test case that tries to reproduce the problem. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13176) WebHdfs file path gets truncated when having semicolon (;) inside
[ https://issues.apache.org/jira/browse/HDFS-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386917#comment-16386917 ] Sean Mackrory commented on HDFS-13176: -- At first glance, I love that this is addressing URL-specific weirdness in WebHDFS specifically and isn't modifying Path for the whole world. I'll do a deeper code review and make sure I agree with everything, but I strongly suspect this is indeed the right way to solve this. One of the outcomes of this Jira I'd love to see is that we finally clarify which paths are legal. ?, %, and \ make me nervous for the potential to be misinterpreted by related tools or future classes because of their role in URLs and Windows paths (which is exactly why : wouldn't work). There are a few other characters that I would personally avoid because of their role in scripts (I would have similar concerns about ~, *, `, @, !, $), but I'd feel more comfortable agreeing to support the following subset of what you're currently testing: \"()[]_-=&+;{}#. Anyone else have thoughts along these lines? > WebHdfs file path gets truncated when having semicolon (;) inside > - > > Key: HDFS-13176 > URL: https://issues.apache.org/jira/browse/HDFS-13176 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 3.0.0 >Reporter: Zsolt Venczel >Assignee: Zsolt Venczel >Priority: Major > Attachments: HDFS-13176.01.patch, > TestWebHdfsUrl.testWebHdfsSpecialCharacterFile.patch > > > Find attached a patch having a test case that tries to reproduce the problem. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13176) WebHdfs file path gets truncated when having semicolon (;) inside
[ https://issues.apache.org/jira/browse/HDFS-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371738#comment-16371738 ] Sean Mackrory commented on HDFS-13176: -- If we can fix this without breaking anything else, great, because we know there are people doing this. Personally, I would avoid using characters like this at all costs because of how many parsing problems they might cause in scripts, encodings, whatever. But the fact is I don't think we've ever been clear about what makes a legal path. We often compare it to POSIX, which I think allows anything printable except that / has a special meaning, but it's also needing to work in browsers as a URL, which may make for some interesting rules. I've had a look through a few sources and I don't see any definition of what should work - and we should probably come up with a set definition of what paths are legal and supported. > WebHdfs file path gets truncated when having semicolon (;) inside > - > > Key: HDFS-13176 > URL: https://issues.apache.org/jira/browse/HDFS-13176 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 3.0.0 >Reporter: Zsolt Venczel >Assignee: Zsolt Venczel >Priority: Major > Attachments: TestWebHdfsUrl.testWebHdfsSpecialCharacterFile.patch > > > Find attached a patch having a test case that tries to reproduce the problem. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13135) Lease not deleted when deleting INodeReference
[ https://issues.apache.org/jira/browse/HDFS-13135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361465#comment-16361465 ] Sean Mackrory commented on HDFS-13135: -- So this test more or less reproduces what I was seeing. I'm still trying to get more info about the workload that did this, because it seems insane, but the same RPC client ID was appending to a file and then deleting it, and upon starting back up we got a NullPointerException because there was a lease for an inode that didn't exist anymore. I'm uncertain whether the fix is correct here: a lot of the code this calls is completely new to me, so it's entirely possible there are side effects I haven't considered (like whether this causes data that's still needed for the s0 snapshot, and so shouldn't be cleaned up, to get deleted). > Lease not deleted when deleting INodeReference > -- > > Key: HDFS-13135 > URL: https://issues.apache.org/jira/browse/HDFS-13135 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Sean Mackrory >Assignee: Sean Mackrory >Priority: Major > Attachments: HDFS-13135.001.patch > > > In troubleshooting an occurrence of HDFS-13115, it seemed that there was > another underlying root cause that should also be addressed. There was an > INodeReference that was deleted and the lease on it was not subsequently > deleted because it was never added to the reclaim context. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13135) Lease not deleted when deleting INodeReference
[ https://issues.apache.org/jira/browse/HDFS-13135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HDFS-13135: - Attachment: HDFS-13135.001.patch > Lease not deleted when deleting INodeReference > -- > > Key: HDFS-13135 > URL: https://issues.apache.org/jira/browse/HDFS-13135 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Sean Mackrory >Assignee: Sean Mackrory >Priority: Major > Attachments: HDFS-13135.001.patch > > > In troubleshooting an occurrence of HDFS-13115, it seemed that there was > another underlying root cause that should also be addressed. There was an > INodeReference that was deleted and the lease on it was not subsequently > deleted because it was never added to the reclaim context. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-13135) Lease not deleted when deleting INodeReference
Sean Mackrory created HDFS-13135: Summary: Lease not deleted when deleting INodeReference Key: HDFS-13135 URL: https://issues.apache.org/jira/browse/HDFS-13135 Project: Hadoop HDFS Issue Type: Bug Reporter: Sean Mackrory Assignee: Sean Mackrory In troubleshooting an occurrence of HDFS-13115, it seemed that there was another underlying root cause that should also be addressed. There was an INodeReference that was deleted and the lease on it was not subsequently deleted because it was never added to the reclaim context. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13106) Need to exercise all HDFS APIs for EC
[ https://issues.apache.org/jira/browse/HDFS-13106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HDFS-13106: - Resolution: Fixed Status: Resolved (was: Patch Available) > Need to exercise all HDFS APIs for EC > - > > Key: HDFS-13106 > URL: https://issues.apache.org/jira/browse/HDFS-13106 > Project: Hadoop HDFS > Issue Type: Test > Components: hdfs >Affects Versions: 3.0.0 >Reporter: Haibo Yan >Assignee: Haibo Yan >Priority: Major > Attachments: HDFS-13106.001.patch, HDFS-13106.002.patch, > HDFS-13106.003.patch > > > Exercise FileSystem API to make sure all APIs works as expected under Erasure > Coding feature enabled -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13106) Need to exercise all HDFS APIs for EC
[ https://issues.apache.org/jira/browse/HDFS-13106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16354767#comment-16354767 ] Sean Mackrory commented on HDFS-13106: -- Committed to trunk. > Need to exercise all HDFS APIs for EC > - > > Key: HDFS-13106 > URL: https://issues.apache.org/jira/browse/HDFS-13106 > Project: Hadoop HDFS > Issue Type: Test > Components: hdfs >Affects Versions: 3.0.0 >Reporter: Haibo Yan >Assignee: Haibo Yan >Priority: Major > Attachments: HDFS-13106.001.patch, HDFS-13106.002.patch, > HDFS-13106.003.patch > > > Exercise FileSystem API to make sure all APIs works as expected under Erasure > Coding feature enabled -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13106) Need to exercise all HDFS APIs for EC
[ https://issues.apache.org/jira/browse/HDFS-13106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16354728#comment-16354728 ] Sean Mackrory commented on HDFS-13106: -- +1 from me too. Unit test failures are unrelated, new tests work for me locally too. Can commit shortly. > Need to exercise all HDFS APIs for EC > - > > Key: HDFS-13106 > URL: https://issues.apache.org/jira/browse/HDFS-13106 > Project: Hadoop HDFS > Issue Type: Test > Components: hdfs >Affects Versions: 3.0.0 >Reporter: Haibo Yan >Assignee: Haibo Yan >Priority: Major > Attachments: HDFS-13106.001.patch, HDFS-13106.002.patch, > HDFS-13106.003.patch > > > Exercise FileSystem API to make sure all APIs works as expected under Erasure > Coding feature enabled -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13106) Need to exercise all HDFS APIs for EC
[ https://issues.apache.org/jira/browse/HDFS-13106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16353187#comment-16353187 ] Sean Mackrory commented on HDFS-13106: -- The HDFS test failures are unrelated: you only add tests, and those tests aren't the ones failing. We should address the checkstyle issues, though. > Need to exercise all HDFS APIs for EC > - > > Key: HDFS-13106 > URL: https://issues.apache.org/jira/browse/HDFS-13106 > Project: Hadoop HDFS > Issue Type: Test > Components: hdfs >Affects Versions: 3.0.0 >Reporter: Haibo Yan >Assignee: Haibo Yan >Priority: Major > Attachments: HDFS-13106.001.patch > > > Exercise FileSystem API to make sure all APIs works as expected under Erasure > Coding feature enabled -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13106) Need to exercise all HDFS APIs for EC
[ https://issues.apache.org/jira/browse/HDFS-13106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16353174#comment-16353174 ] Sean Mackrory commented on HDFS-13106: -- Looks good to me. We need a clean Yetus run, though - I suspect it needs your patch to be named HDFS-13106*.001*.patch. I'm also not much of an EC expert, so I wouldn't notice if there were any special edge cases that deserved special attention here, but I think this is safe enough to commit once we have a clean Yetus run. > Need to exercise all HDFS APIs for EC > - > > Key: HDFS-13106 > URL: https://issues.apache.org/jira/browse/HDFS-13106 > Project: Hadoop HDFS > Issue Type: Test > Components: hdfs >Affects Versions: 3.0.0 >Reporter: Haibo Yan >Assignee: Haibo Yan >Priority: Major > Attachments: HDFS-13106.001.patch > > > Exercise FileSystem API to make sure all APIs works as expected under Erasure > Coding feature enabled -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12913) TestDNFencingWithReplication.testFencingStress fix mini cluster not yet active issue
[ https://issues.apache.org/jira/browse/HDFS-12913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HDFS-12913: - Resolution: Fixed Fix Version/s: 3.0.1 3.1.0 Status: Resolved (was: Patch Available) > TestDNFencingWithReplication.testFencingStress fix mini cluster not yet > active issue > > > Key: HDFS-12913 > URL: https://issues.apache.org/jira/browse/HDFS-12913 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Zsolt Venczel >Assignee: Zsolt Venczel > Labels: flaky-test > Fix For: 3.1.0, 3.0.1 > > Attachments: HDFS-12913.01.patch, HDFS-12913.02.patch > > > Once in every 5000 test run the following issue happens: > {code} > 2017-12-11 10:33:09 [INFO] > 2017-12-11 10:33:09 [INFO] > --- > 2017-12-11 10:33:09 [INFO] T E S T S > 2017-12-11 10:33:09 [INFO] > --- > 2017-12-11 10:33:09 [INFO] Running > org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication > 2017-12-11 10:37:32 [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, > Time elapsed: 262.641 s <<< FAILURE! - in > org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication > 2017-12-11 10:37:32 [ERROR] > testFencingStress(org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication) > Time elapsed: 262.477 s <<< ERROR! 
> 2017-12-11 10:37:32 java.lang.RuntimeException: Deferred > 2017-12-11 10:37:32 at > org.apache.hadoop.test.MultithreadedTestUtil$TestContext.checkException(MultithreadedTestUtil.java:130) > 2017-12-11 10:37:32 at > org.apache.hadoop.test.MultithreadedTestUtil$TestContext.stop(MultithreadedTestUtil.java:166) > 2017-12-11 10:37:32 at > org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication.testFencingStress(TestDNFencingWithReplication.java:137) > 2017-12-11 10:37:32 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > 2017-12-11 10:37:32 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > 2017-12-11 10:37:32 at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 2017-12-11 10:37:32 at java.lang.reflect.Method.invoke(Method.java:498) > 2017-12-11 10:37:32 at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > 2017-12-11 10:37:32 at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > 2017-12-11 10:37:32 at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > 2017-12-11 10:37:32 at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > 2017-12-11 10:37:32 at > org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) > 2017-12-11 10:37:32 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) > 2017-12-11 10:37:32 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) > 2017-12-11 10:37:32 at > org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > 2017-12-11 10:37:32 at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) > 2017-12-11 10:37:32 at > org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) > 2017-12-11 10:37:32 at > org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) > 2017-12-11 10:37:32 at > 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) > 2017-12-11 10:37:32 at > org.junit.runners.ParentRunner.run(ParentRunner.java:309) > 2017-12-11 10:37:32 at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:369) > 2017-12-11 10:37:32 at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:275) > 2017-12-11 10:37:32 at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:239) > 2017-12-11 10:37:32 at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:160) > 2017-12-11 10:37:32 at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:373) > 2017-12-11 10:37:32 at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:334) > 2017-12-11 10:37:32 at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:119) > 2017-12-11 10:37:32 at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:407) > 2017-12-11 10:37:32 Caused by: java.lang.RuntimeException: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): > Operation category READ is not supported
[jira] [Commented] (HDFS-12913) TestDNFencingWithReplication.testFencingStress fix mini cluster not yet active issue
[ https://issues.apache.org/jira/browse/HDFS-12913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16311468#comment-16311468 ] Sean Mackrory commented on HDFS-12913: -- +1 - will merge today. > TestDNFencingWithReplication.testFencingStress fix mini cluster not yet > active issue > > > Key: HDFS-12913 > URL: https://issues.apache.org/jira/browse/HDFS-12913 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Zsolt Venczel >Assignee: Zsolt Venczel > Labels: flaky-test > Attachments: HDFS-12913.01.patch, HDFS-12913.02.patch > > > Once in every 5000 test run the following issue happens: > {code} > 2017-12-11 10:33:09 [INFO] > 2017-12-11 10:33:09 [INFO] > --- > 2017-12-11 10:33:09 [INFO] T E S T S > 2017-12-11 10:33:09 [INFO] > --- > 2017-12-11 10:33:09 [INFO] Running > org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication > 2017-12-11 10:37:32 [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, > Time elapsed: 262.641 s <<< FAILURE! - in > org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication > 2017-12-11 10:37:32 [ERROR] > testFencingStress(org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication) > Time elapsed: 262.477 s <<< ERROR! 
> 2017-12-11 10:37:32 java.lang.RuntimeException: Deferred > 2017-12-11 10:37:32 at > org.apache.hadoop.test.MultithreadedTestUtil$TestContext.checkException(MultithreadedTestUtil.java:130) > 2017-12-11 10:37:32 at > org.apache.hadoop.test.MultithreadedTestUtil$TestContext.stop(MultithreadedTestUtil.java:166) > 2017-12-11 10:37:32 at > org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication.testFencingStress(TestDNFencingWithReplication.java:137) > 2017-12-11 10:37:32 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > 2017-12-11 10:37:32 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > 2017-12-11 10:37:32 at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 2017-12-11 10:37:32 at java.lang.reflect.Method.invoke(Method.java:498) > 2017-12-11 10:37:32 at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > 2017-12-11 10:37:32 at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > 2017-12-11 10:37:32 at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > 2017-12-11 10:37:32 at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > 2017-12-11 10:37:32 at > org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) > 2017-12-11 10:37:32 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) > 2017-12-11 10:37:32 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) > 2017-12-11 10:37:32 at > org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > 2017-12-11 10:37:32 at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) > 2017-12-11 10:37:32 at > org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) > 2017-12-11 10:37:32 at > org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) > 2017-12-11 10:37:32 at > 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) > 2017-12-11 10:37:32 at > org.junit.runners.ParentRunner.run(ParentRunner.java:309) > 2017-12-11 10:37:32 at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:369) > 2017-12-11 10:37:32 at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:275) > 2017-12-11 10:37:32 at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:239) > 2017-12-11 10:37:32 at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:160) > 2017-12-11 10:37:32 at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:373) > 2017-12-11 10:37:32 at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:334) > 2017-12-11 10:37:32 at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:119) > 2017-12-11 10:37:32 at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:407) > 2017-12-11 10:37:32 Caused by: java.lang.RuntimeException: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): > Operation category READ is not supported in state standby. Visit > https://s.apache.org/sbnn-error > 2017-12-11 10:37:32 at >
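The "Deferred" error above unwraps to a StandbyException ("Operation category READ is not supported in state standby"): the stress test raced ahead of the mini cluster's HA transition. The patch itself is not shown here, but the usual shape of this kind of fix is to poll for the active state with a timeout instead of assuming the transition already finished. A minimal, Hadoop-free sketch (class and method names are hypothetical, not the actual HDFS-12913 change):

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper (not the actual HDFS-12913 patch): poll a readiness
// condition until it holds or a timeout expires, instead of assuming the
// NameNode already finished its transition to active.
public class WaitForActive {
    public static boolean waitFor(BooleanSupplier condition, long intervalMs, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!condition.getAsBoolean()) {
            if (System.currentTimeMillis() >= deadline) {
                return false;  // let the caller fail the test with a clear message
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Simulate a service that becomes "active" a little after startup.
        long activeAt = System.currentTimeMillis() + 50;
        boolean ready = waitFor(() -> System.currentTimeMillis() >= activeAt, 5, 2000);
        System.out.println(ready ? "active" : "timed out");
    }
}
```

In a real test the condition would wrap something like a check of the NameNode's HA state after `cluster.transitionToActive(...)`, rather than a clock comparison.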
[jira] [Commented] (HDFS-12891) Do not invalidate blocks if toInvalidate is empty
[ https://issues.apache.org/jira/browse/HDFS-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16287773#comment-16287773 ] Sean Mackrory commented on HDFS-12891: -- I see I'm late to the party, but I started testing this the other day and can confirm I was seeing approximately 1 in 30 runs fail and that this fixes it. +1 to the patch as committed. > Do not invalidate blocks if toInvalidate is empty > - > > Key: HDFS-12891 > URL: https://issues.apache.org/jira/browse/HDFS-12891 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Zsolt Venczel >Assignee: Zsolt Venczel > Labels: flaky-test > Fix For: 3.1.0, 3.0.1 > > Attachments: HDFS-12891.01.patch, HDFS-12891.02.patch > > > {code:java} > java.lang.AssertionError: Test resulted in an unexpected exit > at > org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication.testFencingStress(TestDNFencingWithReplication.java:147) > : > : > 2017-10-19 21:39:40,068 [main] INFO hdfs.MiniDFSCluster > (MiniDFSCluster.java:shutdown(1965)) - Shutting down the Mini HDFS Cluster > 2017-10-19 21:39:40,068 [main] FATAL hdfs.MiniDFSCluster > (MiniDFSCluster.java:shutdown(1968)) - Test resulted in an unexpected exit > 1: java.lang.AssertionError > at org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:265) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$RedundancyMonitor.run(BlockManager.java:4437) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.AssertionError > at > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor.addBlocksToBeInvalidated(DatanodeDescriptor.java:641) > at > org.apache.hadoop.hdfs.server.blockmanagement.InvalidateBlocks.invalidateWork(InvalidateBlocks.java:299) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.invalidateWorkForOneNode(BlockManager.java:4246) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeInvalidateWork(BlockManager.java:1736) > at > 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:4561) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$RedundancyMonitor.run(BlockManager.java:4418) > ... 1 more > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
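The AssertionError in the trace above fires inside DatanodeDescriptor.addBlocksToBeInvalidated(), which asserts its block list is non-empty, and the failed assertion takes down the RedundancyMonitor thread. The fix the issue title describes is simply to skip invalidation when toInvalidate is empty. A simplified, self-contained sketch of that guard (the class below is a toy stand-in, not the real BlockManager code):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy stand-in for the code paths in the stack trace above -- NOT the real
// BlockManager/DatanodeDescriptor code. It only illustrates the guard the
// issue title describes: don't call addBlocksToBeInvalidated() with an
// empty list, because that method asserts non-emptiness.
public class InvalidateWorkSketch {
    static int scheduled = 0;  // stand-in for the per-datanode invalidation queue

    static void addBlocksToBeInvalidated(List<Long> blocks) {
        // Simplified form of the assertion that fired in the reported failure.
        assert blocks != null && !blocks.isEmpty() : "empty block list";
        scheduled += blocks.size();
    }

    /** Returns how many blocks were queued for invalidation on one node. */
    public static int invalidateWork(List<Long> toInvalidate) {
        if (toInvalidate == null || toInvalidate.isEmpty()) {
            return 0;  // the fix: nothing to do, so don't trip the assertion
        }
        addBlocksToBeInvalidated(toInvalidate);
        return toInvalidate.size();
    }

    public static void main(String[] args) {
        System.out.println(invalidateWork(Collections.emptyList()));
        System.out.println(invalidateWork(Arrays.asList(101L, 102L)));
    }
}
```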
[jira] [Assigned] (HDFS-12892) TestClusterTopology#testChooseRandom fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-12892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory reassigned HDFS-12892: Assignee: Zsolt Venczel > TestClusterTopology#testChooseRandom fails intermittently > - > > Key: HDFS-12892 > URL: https://issues.apache.org/jira/browse/HDFS-12892 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 3.0.0 >Reporter: Zsolt Venczel >Assignee: Zsolt Venczel > Labels: flaky-test > > Flaky test failure: > {code:java} > java.lang.AssertionError > Error > Not choosing nodes randomly > Stack Trace > java.lang.AssertionError: Not choosing nodes randomly > at > org.apache.hadoop.net.TestClusterTopology.testChooseRandom(TestClusterTopology.java:170) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
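testChooseRandom is flaky for a structural reason: any finite sample of random choices can occasionally look non-random, so an assertion like "Not choosing nodes randomly" will fail once in a while no matter how correct the code is. A toy illustration of that style of check (not the Hadoop test itself; all names below are made up), where seeding the RNG is one way to make it deterministic:

```java
import java.util.Random;

// Toy illustration of a randomness assertion of the kind that goes flaky.
// This is NOT TestClusterTopology; class and method names are invented.
public class ChooseRandomSketch {
    /** Counts how often each of {@code nodes} candidates is picked over {@code trials} draws. */
    public static int[] sample(Random rng, int nodes, int trials) {
        int[] counts = new int[nodes];
        for (int i = 0; i < trials; i++) {
            counts[rng.nextInt(nodes)]++;
        }
        return counts;
    }

    /** The kind of check such a test makes: every node was chosen at least once. */
    public static boolean looksRandom(int[] counts) {
        for (int c : counts) {
            if (c == 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // A fixed seed makes the check deterministic, which is one way to de-flake it.
        int[] counts = sample(new Random(42), 4, 1000);
        System.out.println(looksRandom(counts));
    }
}
```

With an unseeded Random and a tight enough bound, the same check passes almost always and fails intermittently, which is exactly the reported symptom.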
[jira] [Commented] (HDFS-10702) Add a Client API and Proxy Provider to enable stale read from Standby
[ https://issues.apache.org/jira/browse/HDFS-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16278900#comment-16278900 ] Sean Mackrory commented on HDFS-10702: -- Any other comments on the consistency issue, [~mingma] / [~zhz]? I'd like to finish this up soon if there aren't any other concerns - the configuration idea in my last comment can be addressed later without causing any problem - I'd rather just keep this patch small until we can come to consensus on what's already in it. > Add a Client API and Proxy Provider to enable stale read from Standby > - > > Key: HDFS-10702 > URL: https://issues.apache.org/jira/browse/HDFS-10702 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Jiayi Zhou >Assignee: Sean Mackrory >Priority: Minor > Attachments: HDFS-10702.001.patch, HDFS-10702.002.patch, > HDFS-10702.003.patch, HDFS-10702.004.patch, HDFS-10702.005.patch, > HDFS-10702.006.patch, HDFS-10702.007.patch, HDFS-10702.008.patch, > StaleReadfromStandbyNN.pdf > > > Currently, clients must always talk to the active NameNode when performing > any metadata operation, which means active NameNode could be a bottleneck for > scalability. One way to solve this problem is to send read-only operations to > Standby NameNode. The disadvantage is that it might be a stale read. > Here, I'm thinking of adding a Client API to enable/disable stale read from > Standby which gives Client the power to set the staleness restriction. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11096) Support rolling upgrade between 2.x and 3.x
[ https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246667#comment-16246667 ] Sean Mackrory commented on HDFS-11096: -- Note that the fix to the sbin/start-yarn.sh shell script has now been committed separately as YARN-7465. I'm working with [~rchiang] to figure out why YARN is failing after the rolling upgrade - it works in both Hadoop 2 and Hadoop 3 in the clusters used for the distcp test - it's only after the upgrade... > Support rolling upgrade between 2.x and 3.x > --- > > Key: HDFS-11096 > URL: https://issues.apache.org/jira/browse/HDFS-11096 > Project: Hadoop HDFS > Issue Type: Improvement > Components: rolling upgrades >Affects Versions: 3.0.0-alpha1 >Reporter: Andrew Wang >Assignee: Sean Mackrory >Priority: Blocker > Attachments: HDFS-11096.001.patch, HDFS-11096.002.patch, > HDFS-11096.003.patch, HDFS-11096.004.patch, HDFS-11096.005.patch, > HDFS-11096.006.patch, HDFS-11096.007.patch > > > trunk has a minimum software version of 3.0.0-alpha1. This means we can't > rolling upgrade between branch-2 and trunk. > This is a showstopper for large deployments. Unless there are very compelling > reasons to break compatibility, let's restore the ability to rolling upgrade > to 3.x releases. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11096) Support rolling upgrade between 2.x and 3.x
[ https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234197#comment-16234197 ] Sean Mackrory commented on HDFS-11096: -- From an HDFS standpoint, definitely - I've run many successful rolling upgrade and distcp-over-webhdfs tests this week and updated the patch. The only thing remaining is to get automation itself in place after this is committed. I looked into the YARN issues. I'm still seeing very similar symptoms to the YARN-6457 issue mentioned above in both branch-3.0 and trunk. In trunk I'm also seeing this: {quote} 17/10/31 23:05:49 INFO security.AMRMTokenSecretManager: Creating password for appattempt_1509490231144_0628_02 17/10/31 23:05:49 INFO amlauncher.AMLauncher: Error launching appattempt_1509490231144_0628_02. Got exception: org.apache.hadoop.security.token.SecretManager$InvalidToken: Invalid container token used for starting container on : container-5.docker:35151 at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.verifyAndGetContainerTokenIdentifier(ContainerManagerImpl.java:974) at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.startContainers(ContainerManagerImpl.java:789) at org.apache.hadoop.yarn.api.impl.pb.service.ContainerManagementProtocolPBServiceImpl.startContainers(ContainerManagementProtocolPBServiceImpl.java:70) at org.apache.hadoop.yarn.proto.ContainerManagementProtocol$ContainerManagementProtocolService$2.callBlockingMethod(ContainerManagementProtocol.java:127) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:447) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:845) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:788) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2455) at sun.reflect.GeneratedConstructorAccessor70.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53) at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateIOException(RPCUtil.java:80) at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:119) at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:131) at sun.reflect.GeneratedMethodAccessor85.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422) at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165) at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157) at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359) at com.sun.proxy.$Proxy89.startContainers(Unknown Source) at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:123) at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:304) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): Invalid container token used for starting container on : container-5.docker:35151 at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.verifyAndGetContainerTokenIdentifier(ContainerManagerImpl.java:974) at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.startContainers(ContainerManagerImpl.java:789) at org.apache.hadoop.yarn.api.impl.pb.service.ContainerManagementProtocolPBServiceImpl.startContainers(ContainerManagementProtocolPBServiceImpl.java:70) at
[jira] [Comment Edited] (HDFS-11096) Support rolling upgrade between 2.x and 3.x
[ https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234197#comment-16234197 ] Sean Mackrory edited comment on HDFS-11096 at 11/1/17 3:16 PM: --- From an HDFS standpoint, definitely - I've run many successful rolling upgrade and distcp-over-webhdfs tests this week and updated the patch. The only thing remaining is to get automation itself in place after this is committed. I looked into the YARN issues. I'm still seeing very similar symptoms to the YARN-6457 issue mentioned above in both branch-3.0 and trunk. In trunk I'm also seeing this: {code} 17/10/31 23:05:49 INFO security.AMRMTokenSecretManager: Creating password for appattempt_1509490231144_0628_02 17/10/31 23:05:49 INFO amlauncher.AMLauncher: Error launching appattempt_1509490231144_0628_02. Got exception: org.apache.hadoop.security.token.SecretManager$InvalidToken: Invalid container token used for starting container on : container-5.docker:35151 at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.verifyAndGetContainerTokenIdentifier(ContainerManagerImpl.java:974) at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.startContainers(ContainerManagerImpl.java:789) at org.apache.hadoop.yarn.api.impl.pb.service.ContainerManagementProtocolPBServiceImpl.startContainers(ContainerManagementProtocolPBServiceImpl.java:70) at org.apache.hadoop.yarn.proto.ContainerManagementProtocol$ContainerManagementProtocolService$2.callBlockingMethod(ContainerManagementProtocol.java:127) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:447) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:845) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:788) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2455) at sun.reflect.GeneratedConstructorAccessor70.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53) at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateIOException(RPCUtil.java:80) at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:119) at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:131) at sun.reflect.GeneratedMethodAccessor85.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422) at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165) at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157) at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359) at com.sun.proxy.$Proxy89.startContainers(Unknown Source) at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:123) at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:304) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): Invalid container token used for starting container on : container-5.docker:35151 at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.verifyAndGetContainerTokenIdentifier(ContainerManagerImpl.java:974) at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.startContainers(ContainerManagerImpl.java:789) at org.apache.hadoop.yarn.api.impl.pb.service.ContainerManagementProtocolPBServiceImpl.startContainers(ContainerManagementProtocolPBServiceImpl.java:70) at
[jira] [Commented] (HDFS-206) Support for head in FSShell
[ https://issues.apache.org/jira/browse/HDFS-206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226903#comment-16226903 ] Sean Mackrory commented on HDFS-206: Committed to trunk. Did you want to get this into any other versions too? > Support for head in FSShell > --- > > Key: HDFS-206 > URL: https://issues.apache.org/jira/browse/HDFS-206 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Olga Natkovich >Assignee: Gabor Bota >Priority: Minor > Labels: newbie > Fix For: 3.1.0 > > Attachments: HDFS-206.001.patch, HDFS-206.002.patch, > HDFS-206.003.patch > > > For Pig project, we would like to integrate head and tail commands into our > shell (Grunt). I could find tail but not head command -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-206) Support for head in FSShell
[ https://issues.apache.org/jira/browse/HDFS-206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HDFS-206: --- Resolution: Fixed Status: Resolved (was: Patch Available) > Support for head in FSShell > --- > > Key: HDFS-206 > URL: https://issues.apache.org/jira/browse/HDFS-206 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Olga Natkovich >Assignee: Gabor Bota >Priority: Minor > Labels: newbie > Fix For: 3.1.0 > > Attachments: HDFS-206.001.patch, HDFS-206.002.patch, > HDFS-206.003.patch > > > For Pig project, we would like to integrate head and tail commands into our > shell (Grunt). I could find tail but not head command
[jira] [Updated] (HDFS-206) Support for head in FSShell
[ https://issues.apache.org/jira/browse/HDFS-206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HDFS-206: --- Fix Version/s: 3.1.0 > Support for head in FSShell > --- > > Key: HDFS-206 > URL: https://issues.apache.org/jira/browse/HDFS-206 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Olga Natkovich >Assignee: Gabor Bota >Priority: Minor > Labels: newbie > Fix For: 3.1.0 > > Attachments: HDFS-206.001.patch, HDFS-206.002.patch, > HDFS-206.003.patch > > > For Pig project, we would like to integrate head and tail commands into our > shell (Grunt). I could find tail but not head command
[jira] [Commented] (HDFS-206) Support for head in FSShell
[ https://issues.apache.org/jira/browse/HDFS-206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226885#comment-16226885 ] Sean Mackrory commented on HDFS-206: Thanks for resubmitting - I tried manually running the Jenkins job last night and it failed because some processes were killed... The test failures do look very unrelated - +1, committing... > Support for head in FSShell > --- > > Key: HDFS-206 > URL: https://issues.apache.org/jira/browse/HDFS-206 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Olga Natkovich >Assignee: Gabor Bota >Priority: Minor > Labels: newbie > Attachments: HDFS-206.001.patch, HDFS-206.002.patch, > HDFS-206.003.patch > > > For Pig project, we would like to integrate head and tail commands into our > shell (Grunt). I could find tail but not head command -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-7878) API - expose an unique file identifier
[ https://issues.apache.org/jira/browse/HDFS-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225910#comment-16225910 ] Sean Mackrory commented on HDFS-7878: - No other feedback from me - +1 on the bits I understand, like Steve :) > API - expose an unique file identifier > -- > > Key: HDFS-7878 > URL: https://issues.apache.org/jira/browse/HDFS-7878 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Labels: BB2015-05-TBR > Attachments: HDFS-7878.01.patch, HDFS-7878.02.patch, > HDFS-7878.03.patch, HDFS-7878.04.patch, HDFS-7878.05.patch, > HDFS-7878.06.patch, HDFS-7878.07.patch, HDFS-7878.08.patch, > HDFS-7878.09.patch, HDFS-7878.10.patch, HDFS-7878.11.patch, > HDFS-7878.12.patch, HDFS-7878.13.patch, HDFS-7878.14.patch, > HDFS-7878.15.patch, HDFS-7878.16.patch, HDFS-7878.17.patch, > HDFS-7878.18.patch, HDFS-7878.19.patch, HDFS-7878.20.patch, > HDFS-7878.21.patch, HDFS-7878.patch > > > See HDFS-487. > Even though that is resolved as duplicate, the ID is actually not exposed by > the JIRA it supposedly duplicates. > INode ID for the file should be easy to expose; alternatively ID could be > derived from block IDs, to account for appends... > This is useful e.g. for cache key by file, to make sure cache stays correct > when file is overwritten. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-206) Support for head in FSShell
[ https://issues.apache.org/jira/browse/HDFS-206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225886#comment-16225886 ] Sean Mackrory commented on HDFS-206: Yetus never reviewed your 3rd patch, and Jenkins now appears to be getting restarted. I'll check back tomorrow and follow-up if Yetus doesn't review it overnight. > Support for head in FSShell > --- > > Key: HDFS-206 > URL: https://issues.apache.org/jira/browse/HDFS-206 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Olga Natkovich >Assignee: Gabor Bota >Priority: Minor > Labels: newbie > Attachments: HDFS-206.001.patch, HDFS-206.002.patch, > HDFS-206.003.patch > > > For Pig project, we would like to integrate head and tail commands into our > shell (Grunt). I could find tail but not head command -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-206) Support for head in FSShell
[ https://issues.apache.org/jira/browse/HDFS-206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225278#comment-16225278 ] Sean Mackrory commented on HDFS-206: Ah - you know what? I didn't see that the last argument passed to copyBytes was to close the file after the copy. Still, I like the idea of having a resource closed at the end of the same function it was created in so it's easy to see its lifecycle, so I like the .003 patch. +1 - I'll commit your .003 patch later today unless anyone else weighs in. > Support for head in FSShell > --- > > Key: HDFS-206 > URL: https://issues.apache.org/jira/browse/HDFS-206 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Olga Natkovich >Assignee: Gabor Bota >Priority: Minor > Labels: newbie > Attachments: HDFS-206.001.patch, HDFS-206.002.patch, > HDFS-206.003.patch > > > For Pig project, we would like to integrate head and tail commands into our > shell (Grunt). I could find tail but not head command -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
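For reference, the two stream-lifecycle styles being weighed in that comment can be sketched without Hadoop: a copy helper whose last argument closes the streams (roughly the shape of Hadoop's IOUtils.copyBytes), versus the .003-patch style where the method that opened the stream also closes it. Class and method names below are illustrative, not the actual FsShell code:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hadoop-free sketch of both styles; CopyStyles and head are made-up names.
public class CopyStyles {
    // Style 1: a copy helper whose final flag closes both streams, roughly
    // mirroring the shape of Hadoop's IOUtils.copyBytes(in, out, bufSize, close).
    public static void copyBytes(InputStream in, OutputStream out, int bufSize, boolean close)
            throws IOException {
        try {
            byte[] buf = new byte[bufSize];
            int n;
            while ((n = in.read(buf)) > 0) {
                out.write(buf, 0, n);
            }
        } finally {
            if (close) {
                in.close();
                out.close();
            }
        }
    }

    // Style 2 (the .003-patch preference): the method that opened the stream
    // also closes it, so the resource's whole lifecycle is visible in one place.
    public static byte[] head(byte[] data, int bufSize) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = new ByteArrayInputStream(data)) {  // closed here, explicitly
            copyBytes(in, out, bufSize, false);
        } catch (IOException e) {
            throw new UncheckedIOException(e);  // cannot happen with in-memory streams
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(new String(head("hello world".getBytes(), 4)));
    }
}
```

Both styles copy the same bytes; the difference is purely who owns the close, which is the readability point the review settled on.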
[jira] [Commented] (HDFS-11096) Support rolling upgrade between 2.x and 3.x
[ https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225110#comment-16225110 ] Sean Mackrory commented on HDFS-11096: -- Re: test4tests, the whole patch is tests. Should figure out how to tell Yetus that these are a kind of tests at some point. Re: asflicense, those files are not involved with my patch. > Support rolling upgrade between 2.x and 3.x > --- > > Key: HDFS-11096 > URL: https://issues.apache.org/jira/browse/HDFS-11096 > Project: Hadoop HDFS > Issue Type: Improvement > Components: rolling upgrades >Affects Versions: 3.0.0-alpha1 >Reporter: Andrew Wang >Assignee: Sean Mackrory >Priority: Blocker > Attachments: HDFS-11096.001.patch, HDFS-11096.002.patch, > HDFS-11096.003.patch, HDFS-11096.004.patch, HDFS-11096.005.patch, > HDFS-11096.006.patch, HDFS-11096.007.patch > > > trunk has a minimum software version of 3.0.0-alpha1. This means we can't > rolling upgrade between branch-2 and trunk. > This is a showstopper for large deployments. Unless there are very compelling > reasons to break compatibility, let's restore the ability to rolling upgrade > to 3.x releases. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-206) Support for head in FSShell
[ https://issues.apache.org/jira/browse/HDFS-206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225096#comment-16225096 ] Sean Mackrory edited comment on HDFS-206 at 10/30/17 3:11 PM: -- This looks pretty good to me. A couple of nitpicks: * You're still documenting the -f option for head in USAGE, although none exists. * We should have a finally \{\} to close the file. Unlikely to ever cause a problem in practice here, but good practice and easy enough to fix right now. It's been too long for me to see the last test results, but we'll check again when you upload the next patch. Unless it was TestDFSShell failing I think it's extremely unlikely to be your patch that broke something. was (Author: mackrorysd): This looks pretty good to me. A couple of nitpicks: * You're still documenting the -f option for head in USAGE, although none exists. * We should have a finally \{\} to close the file. Unlikely to ever cause a problem in practice here, but good practice and easy enough to fix right now. It's been too long for me to see the last test results, but we'll check again when you upload the next patch. Unless it was TestDFSShell failing I think it's extremely unlikely to be your patch that broke something. > Support for head in FSShell > --- > > Key: HDFS-206 > URL: https://issues.apache.org/jira/browse/HDFS-206 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Olga Natkovich >Assignee: Gabor Bota >Priority: Minor > Labels: newbie > Attachments: HDFS-206.001.patch, HDFS-206.002.patch > > > For Pig project, we would like to integrate head and tail commands into our > shell (Grunt). I could find tail but not head command -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-206) Support for head in FSShell
[ https://issues.apache.org/jira/browse/HDFS-206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225096#comment-16225096 ] Sean Mackrory commented on HDFS-206: This looks pretty good to me. A couple of nitpicks: * You're still documenting the -f option for head in USAGE, although none exists. * We should have a finally \{\} to close the file. Unlikely to ever cause a problem in practice here, but good practice and easy enough to fix right now. It's been too long for me to see the last test results, but we'll check again when you upload the next patch. Unless it was TestDFSShell failing I think it's extremely unlikely to be your patch that broke something. > Support for head in FSShell > --- > > Key: HDFS-206 > URL: https://issues.apache.org/jira/browse/HDFS-206 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Olga Natkovich >Assignee: Gabor Bota >Priority: Minor > Labels: newbie > Attachments: HDFS-206.001.patch, HDFS-206.002.patch > > > For Pig project, we would like to integrate head and tail commands into our > shell (Grunt). I could find tail but not head command -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11096) Support rolling upgrade between 2.x and 3.x
[ https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HDFS-11096: - Attachment: HDFS-11096.007.patch Okay - I'm now pretty happy with how this is working. I saw the last shellcheck problems locally, and have fixed those, too. I've had several successful test runs of both of the Docker tests in the last few days, and this is looking pretty reliable to me: * Versions to test are now specified via CLI args to the Docker scripts. That way this only has to change in code when there's a bug to fix or other improvement to make: Jenkins jobs can be updated for various version combinations independently. * Fixing more ZK timeouts, this time in YARN. I've disabled the YARN rolling upgrade as that appears to be troublesome again. But the HDFS upgrade is working and YARN / MR is working well during and after that upgrade. I'll keep troubleshooting the YARN side, but that can be a separate JIRA. * Logs are now saved to ./logs/ back on the host to facilitate more debugging after the Docker images have been destroyed in the event of a failure. Although I've made a number of fixes as documented in the comments, not much has changed that would invalidate the value of previous code reviews, IMO. [~aw] - have I addressed the issues you pointed out to your satisfaction? > Support rolling upgrade between 2.x and 3.x > --- > > Key: HDFS-11096 > URL: https://issues.apache.org/jira/browse/HDFS-11096 > Project: Hadoop HDFS > Issue Type: Improvement > Components: rolling upgrades >Affects Versions: 3.0.0-alpha1 >Reporter: Andrew Wang >Assignee: Sean Mackrory >Priority: Blocker > Attachments: HDFS-11096.001.patch, HDFS-11096.002.patch, > HDFS-11096.003.patch, HDFS-11096.004.patch, HDFS-11096.005.patch, > HDFS-11096.006.patch, HDFS-11096.007.patch > > > trunk has a minimum software version of 3.0.0-alpha1. This means we can't > rolling upgrade between branch-2 and trunk. 
> This is a showstopper for large deployments. Unless there are very compelling > reasons to break compatibility, let's restore the ability to rolling upgrade > to 3.x releases. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
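The "versions to test are now specified via CLI args to the Docker scripts" approach described in the comment above could look roughly like the following sketch. The function name, argument order, and version numbers are illustrative assumptions, not the contents of the actual patch:

```shell
# Hypothetical sketch of taking the "from" and "to" versions as CLI
# arguments, so Jenkins jobs can vary the combination without code changes.
parse_versions() {
  if [ "$#" -ne 2 ]; then
    echo "Usage: test_rolling_upgrade.sh <from-version> <to-version>" >&2
    return 1
  fi
  FROM_VERSION="${1}"
  TO_VERSION="${2}"
  echo "Rolling upgrade test: ${FROM_VERSION} -> ${TO_VERSION}"
}

# prints: Rolling upgrade test: 2.8.2 -> 3.0.0
parse_versions "2.8.2" "3.0.0"
```

With this shape, a Jenkins job for a different version pair only changes its invocation, not the script.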
[jira] [Commented] (HDFS-7878) API - expose an unique file identifier
[ https://issues.apache.org/jira/browse/HDFS-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16220898#comment-16220898 ] Sean Mackrory commented on HDFS-7878: - {quote}Wire compatibility with HDFS{quote} How do commented-out lines ensure wire compatibility? It would make sense if these were obsolete fields and we didn't want to reuse an obsolete number in case older messages get misinterpreted, but then we shouldn't be reusing them. Nevertheless, it appears we're not doing that in the latest patch anymore. I do think this resolves my previous concerns with the patch. In testCrossSerializationProto and testJavaSerialization we're removing assertions that the PathHandle for what should be the same file is identical. Isn't that still true, and shouldn't it be? > API - expose an unique file identifier > -- > > Key: HDFS-7878 > URL: https://issues.apache.org/jira/browse/HDFS-7878 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Labels: BB2015-05-TBR > Attachments: HDFS-7878.01.patch, HDFS-7878.02.patch, > HDFS-7878.03.patch, HDFS-7878.04.patch, HDFS-7878.05.patch, > HDFS-7878.06.patch, HDFS-7878.07.patch, HDFS-7878.08.patch, > HDFS-7878.09.patch, HDFS-7878.10.patch, HDFS-7878.11.patch, > HDFS-7878.12.patch, HDFS-7878.13.patch, HDFS-7878.14.patch, > HDFS-7878.15.patch, HDFS-7878.16.patch, HDFS-7878.17.patch, > HDFS-7878.18.patch, HDFS-7878.patch > > > See HDFS-487. > Even though that is resolved as duplicate, the ID is actually not exposed by > the JIRA it supposedly duplicates. > INode ID for the file should be easy to expose; alternatively ID could be > derived from block IDs, to account for appends... > This is useful e.g. for cache key by file, to make sure cache stays correct > when file is overwritten. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11096) Support rolling upgrade between 2.x and 3.x
[ https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HDFS-11096: - Attachment: HDFS-11096.006.patch It's possible, but will be tough. I worked with [~rchiang] to get past the YARN issues I was having. By specifying both the hostname (required by shell scripts) and the address (hostname + ports) for all of the YARN ports, I was able to get it to work. I feel this is possibly an incompatible change in YARN, given that YARN works fine by just specifying the hostname (as long as everything's going to use the default ports) in Hadoop 2.x, but I'll leave that to [~rchiang]'s judgement if there's a good enough reason and we can put some documentation in place. Specifying the ports in a Hadoop 2.x cluster prior to upgrade wouldn't be too bad. I then repeatedly encountered a lot of failures due to timeouts with both ZooKeeper and JournalNodes. I increased a couple of timeouts and was able to get it working reliably again. Other changes in the revision I'm posting (.006) right now: * Where it applies to both YARN and HDFS, I've stopped using NAMENODES and DATANODES in favor of MASTERS and WORKERS * I fixed the sole shellcheck issue above. It was not raised locally, so my version must be out of sync. Can't confirm that I've eliminated the others until Yetus runs * I've added more distcp-over-webhdfs tests: to, from, and on both old and new clusters. They're all working perfectly. Currently the only issue I see is that the ResourceManager port 8032 stops listening towards the end of the rolling upgrade test. The ResourceManager does not log any problems, and I don't see any other issues. But after we stop all the loops of MapReduce jobs that were running during the rolling upgrade, we can't query the job history to confirm they were all successful, because it can't connect to :8032 on either node. Other ResourceManager services are still listening. This happens even if I comment out the YARN rolling upgrade step. 
I may need to get some more help from [~rchiang] debugging that again. I'm also going to try running this against branch-3.0 instead of trunk, to eliminate some instability I may be seeing. > Support rolling upgrade between 2.x and 3.x > --- > > Key: HDFS-11096 > URL: https://issues.apache.org/jira/browse/HDFS-11096 > Project: Hadoop HDFS > Issue Type: Improvement > Components: rolling upgrades >Affects Versions: 3.0.0-alpha1 >Reporter: Andrew Wang >Assignee: Sean Mackrory >Priority: Blocker > Attachments: HDFS-11096.001.patch, HDFS-11096.002.patch, > HDFS-11096.003.patch, HDFS-11096.004.patch, HDFS-11096.005.patch, > HDFS-11096.006.patch > > > trunk has a minimum software version of 3.0.0-alpha1. This means we can't > rolling upgrade between branch-2 and trunk. > This is a showstopper for large deployments. Unless there are very compelling > reasons to break compatibility, let's restore the ability to rolling upgrade > to 3.x releases. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
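The "to, from, and on both old and new clusters" distcp-over-webhdfs matrix mentioned above amounts to four source/destination combinations. A hedged sketch, with placeholder host names and paths, and the commands only echoed rather than run:

```shell
# Enumerate the distcp test matrix: old->old, old->new, new->old, new->new.
# Host names, ports, and paths are placeholders, not the actual test config.
OLD="webhdfs://old-nn:50070"
NEW="webhdfs://new-nn:50070"

count=0
for src in "${OLD}" "${NEW}"; do
  for dst in "${OLD}" "${NEW}"; do
    echo "hadoop distcp ${src}/test-data ${dst}/test-data-copy"
    count=$((count + 1))
  done
done
```

The nested loop yields all four combinations, which matches "to, from, and on both old and new clusters".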
[jira] [Updated] (HDFS-11096) Support rolling upgrade between 2.x and 3.x
[ https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HDFS-11096: - Attachment: HDFS-11096.005.patch * Restored the old hadoop_actual_ssh. I did try a few cases of multi-token commands through workers.sh and they didn't behave the same with my function unless you quoted the command. I'm trying to think of a better way to differentiate between that function and my own, which just takes quoted, multi-line strings and therefore shouldn't be escaping the same way. They're currently named hadoop_ssh_cmd (old one - expects a single command passed all the way from workers.sh) and hadoop_ssh_cmds (new one - expects a quoted script passed directly from another script). But I'm not happy with that either. * Fixed a bunch of issues when running as a non-root user. Some of these were in the pull-over-http test, but one of them was actually in start-yarn.sh, where an arg is missing in a non-root branch of the script. * Also fixed the multi-cluster config in pull-over-http.sh. The two clusters ended up using the same YARN cluster ID when talking to the ZooKeeper cluster. * Added license headers to several of the scripts, as the build started failing without them. I am now again blocked on what may be a bug in YARN or MR. The MapReduce classpath seems all messed up in both tests. It wants me to specify HADOOP_MAPRED_CLASSPATH in mapred-site.xml (even though the inferred value is correct), and even then it can't find log4j.properties or yarn-site.xml. 
> Support rolling upgrade between 2.x and 3.x > --- > > Key: HDFS-11096 > URL: https://issues.apache.org/jira/browse/HDFS-11096 > Project: Hadoop HDFS > Issue Type: Improvement > Components: rolling upgrades >Affects Versions: 3.0.0-alpha1 >Reporter: Andrew Wang >Assignee: Sean Mackrory >Priority: Blocker > Attachments: HDFS-11096.001.patch, HDFS-11096.002.patch, > HDFS-11096.003.patch, HDFS-11096.004.patch, HDFS-11096.005.patch > > > trunk has a minimum software version of 3.0.0-alpha1. This means we can't > rolling upgrade between branch-2 and trunk. > This is a showstopper for large deployments. Unless there are very compelling > reasons to break compatibility, let's restore the ability to rolling upgrade > to 3.x releases. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
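The quoting difference behind the hadoop_ssh_cmd / hadoop_ssh_cmds split described above can be illustrated in miniature. These helper names and bodies are simplified stand-ins, not the actual Hadoop functions:

```shell
# A command forwarded word-by-word (as workers.sh passes it along) vs. a
# whole quoted script passed as one string. "$@" preserves each word;
# the script variant hands a single quoted (possibly multi-line) string
# to a new shell.
run_cmd() {
  "$@"          # forward every argument as a separate word
}

run_script() {
  sh -c "$1"    # expects one argument: the whole quoted script
}

run_cmd echo hello world              # words passed individually
run_script 'echo hello; echo world'   # one quoted script string
```

Unquoted multi-token input behaves differently between the two, which is consistent with the observation that commands "didn't behave the same" unless quoted.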
[jira] [Created] (HDFS-12595) Isolation of native libraries in JARs
Sean Mackrory created HDFS-12595: Summary: Isolation of native libraries in JARs Key: HDFS-12595 URL: https://issues.apache.org/jira/browse/HDFS-12595 Project: Hadoop HDFS Issue Type: Bug Reporter: Sean Mackrory Assignee: Sean Mackrory There is a native library embedded in the Netty JAR. Even with shading, this can cause conflicts if a user application is using a different version of Netty. Hadoop does not use the native implementations, so we could just remove it, or we could relocate it more intelligently. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11096) Support rolling upgrade between 2.x and 3.x
[ https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188872#comment-16188872 ] Sean Mackrory commented on HDFS-11096: -- Thanks, [~atm]! I have also not forgotten the comment about the SSH function likely breaking the existing usage. I'll check into that... > Support rolling upgrade between 2.x and 3.x > --- > > Key: HDFS-11096 > URL: https://issues.apache.org/jira/browse/HDFS-11096 > Project: Hadoop HDFS > Issue Type: Improvement > Components: rolling upgrades >Affects Versions: 3.0.0-alpha1 >Reporter: Andrew Wang >Assignee: Sean Mackrory >Priority: Blocker > Attachments: HDFS-11096.001.patch, HDFS-11096.002.patch, > HDFS-11096.003.patch, HDFS-11096.004.patch > > > trunk has a minimum software version of 3.0.0-alpha1. This means we can't > rolling upgrade between branch-2 and trunk. > This is a showstopper for large deployments. Unless there are very compelling > reasons to break compatibility, let's restore the ability to rolling upgrade > to 3.x releases. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11096) Support rolling upgrade between 2.x and 3.x
[ https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HDFS-11096: - Attachment: HDFS-11096.004.patch .004 patch: * Added more helpful logging so set -v and set -x aren't needed, and removed those. * Changed to run as non-root - Hadoop environment was actually not set up because it was root, so now I can actually use create-release. This involved quite a big change to the Docker wrapper scripts to move EVERYTHING that requires root privileges into separate steps before kicking off the actual test run, which no longer uses or needs root at all. * Locally at least, I have now fixed *all* shellcheck issues. I actually feel pretty strongly about keeping set -e here. I usually like to use "if ...; then" for handling likely errors, but I do see your point about not having a way to have responses conditional on the specific error code. But in this case every step is required and there's not a ton we can do to recover from a failed command that wouldn't require a human to fix the tests or Hadoop itself. The alternative is to have test runs advance far beyond the root cause of failure and be harder to troubleshoot, or to wrap a lot of unnecessary checks for success around (quite literally) everything. Thoughts? Currently YARN rolling upgrades are actually failing because the Hadoop 3 ResourceManager is using URIs like http://ns1:8020/... (logical namespace name used as though it's a hostname). Not sure where that's coming from yet, I need to dig there, but I doubt the fix there will invalidate much review of the current code. 
> Support rolling upgrade between 2.x and 3.x > --- > > Key: HDFS-11096 > URL: https://issues.apache.org/jira/browse/HDFS-11096 > Project: Hadoop HDFS > Issue Type: Improvement > Components: rolling upgrades >Affects Versions: 3.0.0-alpha1 >Reporter: Andrew Wang >Assignee: Sean Mackrory >Priority: Blocker > Attachments: HDFS-11096.001.patch, HDFS-11096.002.patch, > HDFS-11096.003.patch, HDFS-11096.004.patch > > > trunk has a minimum software version of 3.0.0-alpha1. This means we can't > rolling upgrade between branch-2 and trunk. > This is a showstopper for large deployments. Unless there are very compelling > reasons to break compatibility, let's restore the ability to rolling upgrade > to 3.x releases. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
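The set -e behavior argued for above can be demonstrated in a few lines: the first failing command aborts the run, so a failure surfaces at its root cause instead of letting later steps execute against a broken state. A separate bash process is used here purely so the aborting behavior is easy to observe:

```shell
# Under set -e, "false" aborts the run: "after" is never printed and the
# process exits non-zero, stopping the test at the root cause.
output="$(bash -c 'set -e; echo before; false; echo after' 2>/dev/null || true)"

status=0
bash -c 'set -e; echo before; false; echo after' >/dev/null 2>&1 || status=$?

echo "output: ${output}"   # only "before": execution stopped at "false"
echo "status: ${status}"   # non-zero exit from the aborted run
```

This is the tradeoff in the comment: without set -e, the run would print "after" and continue far past the failure.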
[jira] [Commented] (HDFS-7878) API - expose an unique file identifier
[ https://issues.apache.org/jira/browse/HDFS-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181721#comment-16181721 ] Sean Mackrory commented on HDFS-7878: - {quote}Neither API has been part of a release, yet. {quote} But it is included in a branch that's being prepared for a release. Unless we can still squeeze this in to beta-1 (and we may indeed be a bit late for that - [~andrew.wang]?), I would assume this is headed for 3.1 or maybe 3.0 - and either way it would need to keep a constructor with the same prototype as what currently exists in branch-3.0. Right? On an unrelated note, I second your thoughts that it should be open(FileHandle) and not open(FileStatus). But other than that, I'd +1 the rest of the patch. > API - expose an unique file identifier > -- > > Key: HDFS-7878 > URL: https://issues.apache.org/jira/browse/HDFS-7878 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Labels: BB2015-05-TBR > Attachments: HDFS-7878.01.patch, HDFS-7878.02.patch, > HDFS-7878.03.patch, HDFS-7878.04.patch, HDFS-7878.05.patch, > HDFS-7878.06.patch, HDFS-7878.07.patch, HDFS-7878.08.patch, > HDFS-7878.09.patch, HDFS-7878.10.patch, HDFS-7878.11.patch, > HDFS-7878.12.patch, HDFS-7878.patch > > > See HDFS-487. > Even though that is resolved as duplicate, the ID is actually not exposed by > the JIRA it supposedly duplicates. > INode ID for the file should be easy to expose; alternatively ID could be > derived from block IDs, to account for appends... > This is useful e.g. for cache key by file, to make sure cache stays correct > when file is overwritten. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10702) Add a Client API and Proxy Provider to enable stale read from Standby
[ https://issues.apache.org/jira/browse/HDFS-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181648#comment-16181648 ] Sean Mackrory commented on HDFS-10702: -- I was just chatting with [~yzhangal] about this feature and he had a cool suggestion: a configuration that eliminates any need for code changes in an application to use this. I see 2 options here: - a configuration that causes the minimum transaction ID to be set to 0 by default (i.e. just trust that the standby NNs are already sufficiently up to date) - a configuration that triggers the HDFS client to retrieve the latest transaction ID and set it as the minimum, so that results are at least as fresh as the client itself. > Add a Client API and Proxy Provider to enable stale read from Standby > - > > Key: HDFS-10702 > URL: https://issues.apache.org/jira/browse/HDFS-10702 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Jiayi Zhou >Assignee: Sean Mackrory >Priority: Minor > Attachments: HDFS-10702.001.patch, HDFS-10702.002.patch, > HDFS-10702.003.patch, HDFS-10702.004.patch, HDFS-10702.005.patch, > HDFS-10702.006.patch, HDFS-10702.007.patch, HDFS-10702.008.patch, > StaleReadfromStandbyNN.pdf > > > Currently, clients must always talk to the active NameNode when performing > any metadata operation, which means active NameNode could be a bottleneck for > scalability. One way to solve this problem is to send read-only operations to > Standby NameNode. The disadvantage is that it might be a stale read. > Here, I'm thinking of adding a Client API to enable/disable stale read from > Standby which gives Client the power to set the staleness restriction. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-7878) API - expose an unique file identifier
[ https://issues.apache.org/jira/browse/HDFS-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181634#comment-16181634 ] Sean Mackrory edited comment on HDFS-7878 at 9/26/17 9:41 PM: -- Looks like this patch is violating some of the compatibility guarantees. The following is already annotated as Public and Stable in branch-3.0, but is removed in this patch: {code} public FileStatus(long, boolean, int, long, long, long, FsPermission, String, String, Path, Path, boolean, boolean, boolean) {code} Can we make sure a function with that prototype is added back? LocatedFileStatus is a similar situation, although it's "Evolving": {code} public LocatedFileStatus(long, boolean, int, long, long, long, FsPermission, String, String, Path, Path, boolean, boolean, boolean, BlockLocation[]) {code} Could someone also enlighten me as to the purpose of the commented-out lines in FSProtos.proto? I thought it was odd that we were replacing "alias = 13", but I have no idea why that line is there in the first place. was (Author: mackrorysd): Looks like this patch is violating some of compatibility guarantees. The following is already annotated as Public and Stable in branch-3.0, but is removed in this patch: {code} public FileStatus(long, boolean, int, long, long, long, FsPermission, String, String, Path, Path, boolean, boolean, boolean) {code} Can we make sure a function with that prototype is added back? LocatedFileStatus is a similar situation, although it's "Evolving": {code} public LocatedFileStatus(long length, boolean isdir, int, long, long, long, FsPermission, String, String, Path, Path, boolean, boolean, boolean, BlockLocation[] locations) {code} Could someone also enlighten me as to the purpose of the commented out lines in FSProtos.proto? I thought it was odd that we were replacing "alias = 13", but I have no idea why that line is there in the first place. 
> API - expose an unique file identifier > -- > > Key: HDFS-7878 > URL: https://issues.apache.org/jira/browse/HDFS-7878 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Labels: BB2015-05-TBR > Attachments: HDFS-7878.01.patch, HDFS-7878.02.patch, > HDFS-7878.03.patch, HDFS-7878.04.patch, HDFS-7878.05.patch, > HDFS-7878.06.patch, HDFS-7878.07.patch, HDFS-7878.08.patch, > HDFS-7878.09.patch, HDFS-7878.10.patch, HDFS-7878.11.patch, > HDFS-7878.12.patch, HDFS-7878.patch > > > See HDFS-487. > Even though that is resolved as duplicate, the ID is actually not exposed by > the JIRA it supposedly duplicates. > INode ID for the file should be easy to expose; alternatively ID could be > derived from block IDs, to account for appends... > This is useful e.g. for cache key by file, to make sure cache stays correct > when file is overwritten. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-7878) API - expose an unique file identifier
[ https://issues.apache.org/jira/browse/HDFS-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181634#comment-16181634 ] Sean Mackrory commented on HDFS-7878: - Looks like this patch is violating some of the compatibility guarantees. The following is already annotated as Public and Stable in branch-3.0, but is removed in this patch: {code} public FileStatus(long, boolean, int, long, long, long, FsPermission, String, String, Path, Path, boolean, boolean, boolean) {code} Can we make sure a function with that prototype is added back? LocatedFileStatus is a similar situation, although it's "Evolving": {code} public LocatedFileStatus(long length, boolean isdir, int, long, long, long, FsPermission, String, String, Path, Path, boolean, boolean, boolean, BlockLocation[] locations) {code} Could someone also enlighten me as to the purpose of the commented-out lines in FSProtos.proto? I thought it was odd that we were replacing "alias = 13", but I have no idea why that line is there in the first place. > API - expose an unique file identifier > -- > > Key: HDFS-7878 > URL: https://issues.apache.org/jira/browse/HDFS-7878 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Labels: BB2015-05-TBR > Attachments: HDFS-7878.01.patch, HDFS-7878.02.patch, > HDFS-7878.03.patch, HDFS-7878.04.patch, HDFS-7878.05.patch, > HDFS-7878.06.patch, HDFS-7878.07.patch, HDFS-7878.08.patch, > HDFS-7878.09.patch, HDFS-7878.10.patch, HDFS-7878.11.patch, > HDFS-7878.12.patch, HDFS-7878.patch > > > See HDFS-487. > Even though that is resolved as duplicate, the ID is actually not exposed by > the JIRA it supposedly duplicates. > INode ID for the file should be easy to expose; alternatively ID could be > derived from block IDs, to account for appends... > This is useful e.g. for cache key by file, to make sure cache stays correct > when file is overwritten. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-10702) Add a Client API and Proxy Provider to enable stale read from Standby
[ https://issues.apache.org/jira/browse/HDFS-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16171883#comment-16171883 ] Sean Mackrory edited comment on HDFS-10702 at 9/19/17 3:38 PM: --- The assumption of this feature is that an application is responsible for knowing when a dataset is stable enough to work on, and that events that happen after that transaction ID may affect the accuracy of the results as seen by the application. There are obviously cases where it isn't reasonable for an application to make an assumption like that, but like I said above, this isn't intended for every situation. That said, I'd be all for testing the sequence you described to verify exactly how it fails and that it doesn't bring all of HDFS down with it - just the client. But if a file is deleted after the specified transaction ID and the application tries to access it, returning an exception would be the correct behavior, IMO. I was actually wondering if what you meant was the block locations were out of date because the file had been re-replicated in a different configuration due to cluster health issues, or decommissioning. Cluster state is distinct from an application knowing when it's safe to assume that a dataset is finalized, so that complicates the assumption somewhat. But if it's just a clearly stated assumption that this feature transfers responsibility for knowing that a dataset is complete to the client application and we test that accessing a deleted file fails in a correct manner, would that address your concerns, [~mingma]? was (Author: mackrorysd): The assumption of this feature is that an application is responsible for knowing when a dataset is stable enough to work on, and that any failures or inaccuracies resulting in stuff that happens after the minimum transaction ID is assumed by the application. There are obviously case where that's not reasonable, but like I said above, this isn't intended for every situation. 
That said, I'd be all for testing the sequence you described to verify exactly how it fails and that it doesn't bring all of HDFS down with it - just the client. But if a file is deleted after the specified transaction ID and the application tries to access it, returning an exception would be the correct behavior, IMO. I was actually wondering if what you meant was the block locations were out of date because the file had been re-replicated in a different configuration due to cluster health issues, or decommissioning. Cluster state is distinct from an application knowing when it's safe to assume that a dataset is finalized, so that complicates the assumption somewhat. But if it's just a clearly stated assumption that this feature transfers responsibility for knowing that a dataset is complete to the client application and we test the accessing a deleted file fails in a correct manner, would that address your concerns, [~mingma]? > Add a Client API and Proxy Provider to enable stale read from Standby > - > > Key: HDFS-10702 > URL: https://issues.apache.org/jira/browse/HDFS-10702 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Jiayi Zhou >Assignee: Sean Mackrory >Priority: Minor > Attachments: HDFS-10702.001.patch, HDFS-10702.002.patch, > HDFS-10702.003.patch, HDFS-10702.004.patch, HDFS-10702.005.patch, > HDFS-10702.006.patch, HDFS-10702.007.patch, HDFS-10702.008.patch, > StaleReadfromStandbyNN.pdf > > > Currently, clients must always talk to the active NameNode when performing > any metadata operation, which means active NameNode could be a bottleneck for > scalability. One way to solve this problem is to send read-only operations to > Standby NameNode. The disadvantage is that it might be a stale read. > Here, I'm thinking of adding a Client API to enable/disable stale read from > Standby which gives Client the power to set the staleness restriction. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-10702) Add a Client API and Proxy Provider to enable stale read from Standby
[ https://issues.apache.org/jira/browse/HDFS-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16171883#comment-16171883 ] Sean Mackrory edited comment on HDFS-10702 at 9/19/17 3:36 PM: --- The assumption of this feature is that an application is responsible for knowing when a dataset is stable enough to work on, and that any failures or inaccuracies resulting in stuff that happens after the minimum transaction ID is assumed by the application. There are obviously case where that's not reasonable, but like I said above, this isn't intended for every situation. That said, I'd be all for testing the sequence you described to verify exactly how it fails and that it doesn't bring all of HDFS down with it - just the client. But if a file is deleted after the specified transaction ID and the application tries to access it, returning an exception would be the correct behavior, IMO. I was actually wondering if what you meant was the block locations were out of date because the file had been re-replicated in a different configuration due to cluster health issues, or decommissioning. Cluster state is distinct from an application knowing when it's safe to assume that a dataset is finalized, so that complicates the assumption somewhat. But if it's just a clearly stated assumption that this feature transfers responsibility for knowing that a dataset is complete to the client application and we test the accessing a deleted file fails in a correct manner, would that address your concerns, [~mingma]? was (Author: mackrorysd): The assumption of this feature is that an application is responsible for knowing when a dataset is stable enough to work on, and that any failures or inaccuracies resulting in stuff that happens after the minimum transaction ID is assumed by the application. That said, I'd be all for testing the scenario above to verify exactly how it fails and that it doesn't bring all of HDFS down with it - just the client. 
But if file is deleted after the specified transaction and the application tries to access it, returning an exception would be the correct behavior. I was actually wondering if what you meant was the block locations were out of date because the file had been re-replicated in a different configuration due to cluster health issues, or decommissioning. Cluster state is distinct from an application knowing when it's safe to assume that a dataset is finalized, so that complicates the assumption somewhat. But if it's just a clearly stated assumption that this feature transfers reponsibility for knowing that a dataset is complete to the client application and we test the accessing a deleted file fails in a correct manner, would that address your concerns, [~mingma]? > Add a Client API and Proxy Provider to enable stale read from Standby > - > > Key: HDFS-10702 > URL: https://issues.apache.org/jira/browse/HDFS-10702 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Jiayi Zhou >Assignee: Sean Mackrory >Priority: Minor > Attachments: HDFS-10702.001.patch, HDFS-10702.002.patch, > HDFS-10702.003.patch, HDFS-10702.004.patch, HDFS-10702.005.patch, > HDFS-10702.006.patch, HDFS-10702.007.patch, HDFS-10702.008.patch, > StaleReadfromStandbyNN.pdf > > > Currently, clients must always talk to the active NameNode when performing > any metadata operation, which means active NameNode could be a bottleneck for > scalability. One way to solve this problem is to send read-only operations to > Standby NameNode. The disadvantage is that it might be a stale read. > Here, I'm thinking of adding a Client API to enable/disable stale read from > Standby which gives Client the power to set the staleness restriction. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10702) Add a Client API and Proxy Provider to enable stale read from Standby
[ https://issues.apache.org/jira/browse/HDFS-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16171883#comment-16171883 ] Sean Mackrory commented on HDFS-10702: -- The assumption of this feature is that an application is responsible for knowing when a dataset is stable enough to work on, and that responsibility for any failures or inaccuracies resulting from events after the minimum transaction ID is assumed by the application. That said, I'd be all for testing the scenario above to verify exactly how it fails and that it doesn't bring all of HDFS down with it - just the client. But if a file is deleted after the specified transaction and the application tries to access it, returning an exception would be the correct behavior. I was actually wondering if what you meant was the block locations were out of date because the file had been re-replicated in a different configuration due to cluster health issues, or decommissioning. Cluster state is distinct from an application knowing when it's safe to assume that a dataset is finalized, so that complicates the assumption somewhat. But if it's just a clearly stated assumption that this feature transfers responsibility for knowing that a dataset is complete to the client application and we test that accessing a deleted file fails in a correct manner, would that address your concerns, [~mingma]? 
> Add a Client API and Proxy Provider to enable stale read from Standby > - > > Key: HDFS-10702 > URL: https://issues.apache.org/jira/browse/HDFS-10702 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Jiayi Zhou >Assignee: Sean Mackrory >Priority: Minor > Attachments: HDFS-10702.001.patch, HDFS-10702.002.patch, > HDFS-10702.003.patch, HDFS-10702.004.patch, HDFS-10702.005.patch, > HDFS-10702.006.patch, HDFS-10702.007.patch, HDFS-10702.008.patch, > StaleReadfromStandbyNN.pdf > > > Currently, clients must always talk to the active NameNode when performing > any metadata operation, which means active NameNode could be a bottleneck for > scalability. One way to solve this problem is to send read-only operations to > Standby NameNode. The disadvantage is that it might be a stale read. > Here, I'm thinking of adding a Client API to enable/disable stale read from > Standby which gives Client the power to set the staleness restriction. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11096) Support rolling upgrade between 2.x and 3.x
[ https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HDFS-11096: - Attachment: HDFS-11096.003.patch So attaching a 3rd patch. YARN appears to be failing, so I need to debug this, but this should include a lot of changes based on the feedback here, and I doubt the fix will be significant. Most notably: * Augmented documentation to clear up any confusion about how to run the tests and what their main components are. * Removed any use of deprecated commands, and switched to 3.1.0-SNAPSHOT * Removed set -x. [~aw] - how do you feel about set -v? And did you mean set +e? I feel like if it's okay for a command to sometimes fail, you can deal with that return code explicitly; otherwise I'd like that failure to bubble up. Am I missing something? * Switched to using hadoop-functions and added what I needed there. Not sure I understand exactly what hadoop_actual_ssh was supposed to be doing before, but it's not used elsewhere and is marked as private, so I hope my change to it is okay. I redid the join / split functions to make shellcheck much happier (and I'm also much happier with the outcome) In addition to fixing whatever is going wrong with YARN, I may still: * Have a couple of shellcheck issues to fix. Like $(dirname ${0}) seems tricky to quote correctly to shellcheck's satisfaction. 
* Add parameter checking as suggested by Ray * Eliminate the need for a git checkout or installing expect with apt-get * Switch to using create-release - --native wasn't working because the Docker image doesn't have a high enough version of cmake > Support rolling upgrade between 2.x and 3.x > --- > > Key: HDFS-11096 > URL: https://issues.apache.org/jira/browse/HDFS-11096 > Project: Hadoop HDFS > Issue Type: Improvement > Components: rolling upgrades >Affects Versions: 3.0.0-alpha1 >Reporter: Andrew Wang >Assignee: Sean Mackrory >Priority: Blocker > Attachments: HDFS-11096.001.patch, HDFS-11096.002.patch, > HDFS-11096.003.patch > > > trunk has a minimum software version of 3.0.0-alpha1. This means we can't > rolling upgrade between branch-2 and trunk. > This is a showstopper for large deployments. Unless there are very compelling > reasons to break compatibility, let's restore the ability to rolling upgrade > to 3.x releases.
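For reference, one quoting pattern that typically satisfies shellcheck for the $(dirname ${0}) case is to quote the inner parameter expansion and the outer command substitution separately. This is a general-practice sketch, not code taken from the attached patch:

```shell
#!/usr/bin/env bash
# Quoting both layers avoids shellcheck's warnings about unquoted
# command substitutions and expansions (SC2046 / SC2086), and keeps
# paths with spaces intact.
script_dir="$(dirname "${0}")"
cd "${script_dir}" || exit 1
echo "working directory: $(pwd)"
```

A script that starts this way can then use relative paths safely, since its working directory is always its own location.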
[jira] [Commented] (HDFS-11096) Support rolling upgrade between 2.x and 3.x
[ https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16158780#comment-16158780 ] Sean Mackrory commented on HDFS-11096: -- Thanks for the reviews. I was hoping you'd take a look [~aw]! I'll update the patch and address these comments soon. I've also been reviewing more recent JACC reports. There are still a few incompatibilities that technically violate the contract I mentioned above, like metrics being replaced by metrics2, s3:// disappearing entirely (with neither being labelled as deprecated for all of 2.x), some things that should not have been used publicly (like LOGs) changing data types, etc. These are things that, from a practical standpoint, have been known about by many for a long time without concern being raised, and there's significant baggage to addressing them. Does anybody think they warrant further action? I'm inclined to say no...
[jira] [Commented] (HDFS-11096) Support rolling upgrade between 2.x and 3.x
[ https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16157122#comment-16157122 ] Sean Mackrory commented on HDFS-11096: -- {quote} It would be good to document the order of running the scripts (e.g. env.sh, call one of the *_cluster_env() functions then build-distributed-hadoop). Add help documentation or document calling build-distributed-hadoop args. It sources bash/functions.sh, which isn't visible if you're not running in dev-support/compat. {quote} So for these points, anything under bash/ is not meant to be run directly - it's all called by the main test scripts. So someone running tests should just need to run "./docker-rolling-upgrade.sh" from that directory. I've tried to have scripts meant to be run directly consistently ensure that their working directory is the one they're in, so relative paths are all fine. Adding documentation would be good, but it'd really be targeted at people adding new tests, not running the existing ones - so I just wanted to make sure that was clear.
[jira] [Updated] (HDFS-11096) Support rolling upgrade between 2.x and 3.x
[ https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HDFS-11096: - Attachment: HDFS-11096.002.patch Attaching a patch with a bunch of improvements: * Addresses most (but not yet all) of the shellcheck warnings * Uses www-us.apache.org to download released tarballs for Zookeeper and Hadoop instead of more specific mirrors. Not sure if there's a better URL to use... I'm also still using GitHub to check out trunk source, as I had had issues using the official repo - I assume that's not entirely kosher... * Fixed it so in Docker deployments, tarballs are built on persistent storage to speed up future tests on the same host. * Some refactoring of the pull-over-http test as a step toward having more WebHDFS compatibility tests going both ways. Had some issues connecting to WebHDFS using the ns2 logical name, though. But what is currently in the patch all works.
[jira] [Comment Edited] (HDFS-11096) Support rolling upgrade between 2.x and 3.x
[ https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16143965#comment-16143965 ] Sean Mackrory edited comment on HDFS-11096 at 8/28/17 4:40 PM: --- So Docker support has been added for the rolling-upgrade and pull-over-http test. They're using the same Docker image as Yetus builds, etc. And they've been really robust lately. I've corrected the copyright headers at the top of the files, and I think dev-support/compat is a good place for these tests to live - but I'm open to other ideas as well. I've also added to the README - now that the scripts spin up the clusters on Docker, it's *really* easy to run these. The Python tests are all still working, but they did not seem to catch the previous incompatibility that prevented older clients from writing to newer DataNodes. There are also still a few TODOs and things that don't work, and it's not clear why. So there's definitely more work to be done, but there's value in the existing CLI compatibility tests. I'd like to get this put in the codebase and get some Jenkins jobs running on it soon. 
[jira] [Updated] (HDFS-11096) Support rolling upgrade between 2.x and 3.x
[ https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HDFS-11096: - Status: Patch Available (was: Open)
[jira] [Updated] (HDFS-11096) Support rolling upgrade between 2.x and 3.x
[ https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HDFS-11096: - Attachment: HDFS-11096.001.patch
[jira] [Commented] (HDFS-11096) Support rolling upgrade between 2.x and 3.x
[ https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118936#comment-16118936 ] Sean Mackrory commented on HDFS-11096: -- Just pushed some more updates. HDFS-12151 is fixed, and I'm once again able to do successful rolling upgrades, even with large delays during the upgrade. I made a quick attempt at doing a rolling upgrade of YARN, but the commands I thought one was supposed to use to stop / start daemons aren't working. The script is there, just not currently called from rolling-upgrade.sh. Not sure if you want to work off of that for YARN's side of things [~rchiang]? I'm also working on adding Docker support, hopefully using the same Docker images we use for precommit jobs, etc., to make the build environment easier to put in place and make this more easily verifiable by others.
[jira] [Comment Edited] (HDFS-9806) Allow HDFS block replicas to be provided by an external storage system
[ https://issues.apache.org/jira/browse/HDFS-9806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118654#comment-16118654 ] Sean Mackrory edited comment on HDFS-9806 at 8/8/17 5:14 PM: - Could someone more familiar with the design here comment on any anticipated impact to NameNode scalability? If this feature is included but a user chooses not to use it (i.e. there are no storages of type PROVIDED but the capability is there) - is there any impact to memory consumed by the NameNode? I've read the design doc and some of the patches - I *think* things are good, but I'd like confirmation from someone more familiar with the implementation who might have even tested that already... > Allow HDFS block replicas to be provided by an external storage system > -- > > Key: HDFS-9806 > URL: https://issues.apache.org/jira/browse/HDFS-9806 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Chris Douglas > Attachments: HDFS-9806-design.001.pdf, HDFS-9806-design.002.pdf > > > In addition to heterogeneous media, many applications work with heterogeneous > storage systems. The guarantees and semantics provided by these systems are > often similar, but not identical to those of > [HDFS|https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/filesystem/index.html]. > Any client accessing multiple storage systems is responsible for reasoning > about each system independently, and must propagate and renew credentials for > each store. 
> Remote stores could be mounted under HDFS. Block locations could be mapped to > immutable file regions, opaque IDs, or other tokens that represent a > consistent view of the data. While correctness for arbitrary operations > requires careful coordination between stores, in practice we can provide > workable semantics with weaker guarantees.
[jira] [Commented] (HDFS-9806) Allow HDFS block replicas to be provided by an external storage system
[ https://issues.apache.org/jira/browse/HDFS-9806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118654#comment-16118654 ] Sean Mackrory commented on HDFS-9806: - Could someone more familiar with the design here comment on any anticipated impact to NameNode scalability? If this feature is included but a user chooses not to use it (i.e. there are no storages of type PROVIDED) - is there any impact to memory consumed by the NameNode? I've read the design doc and some of the patches - I *think* things are good, but I'd like confirmation from someone more familiar with the implementation who might have even tested that already...
[jira] [Updated] (HDFS-12151) Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes
[ https://issues.apache.org/jira/browse/HDFS-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HDFS-12151: - Resolution: Fixed Fix Version/s: 3.0.0-beta1 Status: Resolved (was: Patch Available) > Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes > > > Key: HDFS-12151 > URL: https://issues.apache.org/jira/browse/HDFS-12151 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rolling upgrades >Affects Versions: 3.0.0-alpha4 >Reporter: Sean Mackrory >Assignee: Sean Mackrory > Fix For: 3.0.0-beta1 > > Attachments: HDFS-12151.001.patch, HDFS-12151.002.patch, > HDFS-12151.003.patch, HDFS-12151.004.patch, HDFS-12151.005.patch, > HDFS-12151.006.patch, HDFS-12151.007.patch > > > Trying to write to a Hadoop 3 DataNode with a Hadoop 2 client currently > fails. On the client side it looks like this: > {code} > 17/07/14 13:31:58 INFO hdfs.DFSClient: Exception in > createBlockOutputStream > java.io.EOFException: Premature EOF: no length prefix available > at > org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2280) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1318) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1237) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449){code} > But on the DataNode side there's an ArrayIndexOutOfBoundsException because there > aren't any targetStorageIds: > {code} > java.lang.ArrayIndexOutOfBoundsException: 0 > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:815) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:173) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:107) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:290) > at java.lang.Thread.run(Thread.java:745){code}
[jira] [Commented] (HDFS-12151) Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes
[ https://issues.apache.org/jira/browse/HDFS-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109715#comment-16109715 ] Sean Mackrory commented on HDFS-12151: -- Pushed, as [~andrew.wang]'s "pendings" have been resolved. Resolving... Thank you for the reviews!
[jira] [Commented] (HDFS-12151) Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes
[ https://issues.apache.org/jira/browse/HDFS-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108308#comment-16108308 ] Sean Mackrory commented on HDFS-12151: -- Ah, sorry about that - I seem to be blind to the yellow checkstyle warnings... I did confirm that the test failures are flaky. They succeed locally, and timeouts also occurred in the same class (often the same function) in several recent runs of the Pre-Commit jobs. Of course I just closed the editor where I noted all the URLs to said jobs, but they're there and they're recent, I promise :) Thanks for the review! Attaching a final patch with the checkstyle issues addressed. Reran the new test and the 2 that failed last time locally, and had a clean Yetus run.
[jira] [Updated] (HDFS-12151) Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes
[ https://issues.apache.org/jira/browse/HDFS-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HDFS-12151: - Attachment: HDFS-12151.007.patch
[jira] [Updated] (HDFS-12151) Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes
[ https://issues.apache.org/jira/browse/HDFS-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HDFS-12151: - Attachment: HDFS-12151.006.patch So the problem was that the test couldn't connect to a socket on the local port I was telling it to use. I had originally had a dummy server as part of the NullDataNode class, but I later found it to be unnecessary, assuming that was because the test was only using the OutputStream I was passing in. The real reason it worked is that I *happen* to have something listening on port 12345 locally. So I've restored the dummy server, and I'm now using a different port that I don't have anything listening on locally, with the URLs and the dummy server falling back to another port if anything is ever already listening on it. Attaching for a more serious test run...
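The port-collision check described in the comment above can be sketched in shell. This is a hypothetical illustration of the technique (probe upward until a port has no listener), not the actual code from the patch; it assumes bash, since /dev/tcp is a bash-only pseudo-device.

```shell
#!/usr/bin/env bash
# Probe upward from a starting port until we find one with no listener.
# A successful connect via /dev/tcp means something is listening there,
# so keep searching; the subshell keeps a failed exec from killing us.
port=12345
while (exec 3<>"/dev/tcp/127.0.0.1/${port}") 2>/dev/null; do
  port=$((port + 1))
done
echo "no listener on port ${port}"
```

The same idea in a Java test would typically just bind the dummy server to an ephemeral port and read back the port actually assigned, which avoids the race between probing and binding.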
[jira] [Updated] (HDFS-12151) Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes
[ https://issues.apache.org/jira/browse/HDFS-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HDFS-12151: - Attachment: HDFS-12151.005.patch -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12151) Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes
[ https://issues.apache.org/jira/browse/HDFS-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HDFS-12151: - Attachment: HDFS-12151.004.patch I apologize for the noise, everyone, but since I can't reproduce the test failure locally, the next couple of patches are experimental, and at least this one will still fail. There's another layer of swallowed exceptions before it gets as far as the write, and I need to see where it's being thrown... -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11096) Support rolling upgrade between 2.x and 3.x
[ https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16103867#comment-16103867 ] Sean Mackrory commented on HDFS-11096: -- Updating: it was a recent change that broke this, I've posted a patch to fix it that's being reviewed / iterated on, and I've updated my rolling upgrade test scripts to actually confirm via the Job History Server that the jobs themselves were FINISHED and SUCCESSFUL. I re-ran the test with an early patch and was able to get a successful rolling upgrade with 5-10 minute delays between each step, so the entire rolling upgrade of a 9-node (6 worker-node) cluster was spread out over 4 hours. I didn't encounter any other issues, EXCEPT: in my test workload, I had to increase Terasort's output replication, because some job failures were occasionally happening when a job wrote to a node that was about to be taken down for upgrades. After fixing that, no other actual compatibility issues in Hadoop were found. I'll push the fixes out to Github soon... > Support rolling upgrade between 2.x and 3.x > --- > > Key: HDFS-11096 > URL: https://issues.apache.org/jira/browse/HDFS-11096 > Project: Hadoop HDFS > Issue Type: Improvement > Components: rolling upgrades >Affects Versions: 3.0.0-alpha1 >Reporter: Andrew Wang >Assignee: Lei (Eddy) Xu >Priority: Blocker > > trunk has a minimum software version of 3.0.0-alpha1. This means we can't > rolling upgrade between branch-2 and trunk. > This is a showstopper for large deployments. Unless there are very compelling > reasons to break compatibility, let's restore the ability to rolling upgrade > to 3.x releases. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12151) Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes
[ https://issues.apache.org/jira/browse/HDFS-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HDFS-12151: - Attachment: HDFS-12151.003.patch Attaching a patch with the checkstyle issues fixed, and also logging a stack trace for exceptions that happen earlier than required. I tried running parallel tests locally and didn't have a problem, but many other tests are failing because they think LOG fields are missing (in the code, they're not - working on it). I also had a clean Yetus run locally, so I may be missing some config or something. I don't want to handle RuntimeExceptions differently, because it's an NPE that we receive in the case of the bug I'm fixing, and it's also an NPE that we receive after data has been sent to the server (because I haven't mocked everything). So if we receive an NPE before data is sent to the server, I'd like to treat it the same as any other exception and fail if it's too early. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12151) Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes
[ https://issues.apache.org/jira/browse/HDFS-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HDFS-12151: - Attachment: HDFS-12151.002.patch Attaching a test that ensures you can pass in null-ish values (at least the ones you get when you use a Hadoop 2 client) and get at least far enough to start writing things to the server. I just assume that after that point you'll get an exception, because there's already a pretty ridiculous level of mocking going on, and it would take much more to get through a complete writeBlock() run. So we abort as soon as there's an exception, and consider the test successful if we got far enough that data had actually been sent over the network. The test fails without my most recent change and succeeds with it, and that should be true of HDFS-11956 too. Patch 002 does not include any of the other changes being discussed (yet). -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
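The pass/fail heuristic from the comment above - abort on the first exception, and count the run as a success only if bytes had already reached the (mocked) server - can be sketched like this. Everything here is hypothetical scaffolding: `WriteAttempt` stands in for the heavily mocked writeBlock() call, and the real test inspects mocked DataNode internals rather than a plain ByteArrayOutputStream.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch of the test heuristic described above, not the
// patch's actual test code.
public class WriteBlockProbe {
    /** Stand-in for the heavily mocked writeBlock() call under test. */
    public interface WriteAttempt {
        void run(ByteArrayOutputStream sentToServer) throws Exception;
    }

    /**
     * Success = either no exception at all, or the exception happened only
     * AFTER data had already been written to the "server" stream. A failure
     * before any bytes were sent means the bug under test is still present.
     */
    public static boolean gotFarEnough(WriteAttempt attempt) {
        ByteArrayOutputStream sentToServer = new ByteArrayOutputStream();
        try {
            attempt.run(sentToServer);
            return true;
        } catch (Exception expectedEventually) {
            return sentToServer.size() > 0;
        }
    }
}
```

The design choice mirrors the comment: mocking the entire writeBlock() path would be disproportionate effort, so "data crossed the network boundary" is taken as sufficient evidence that the early-failure bug is gone.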
[jira] [Updated] (HDFS-12151) Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes
[ https://issues.apache.org/jira/browse/HDFS-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HDFS-12151: - Affects Version/s: 3.0.0-alpha4 Target Version/s: 3.0.0-beta1 Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12151) Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes
[ https://issues.apache.org/jira/browse/HDFS-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HDFS-12151: - Attachment: HDFS-12151.001.patch Attaching trivial patch. Yetus complains that this doesn't include new tests, but this is caught by the rolling upgrade test I've been working on under HDFS-11096. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12151) Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes
[ https://issues.apache.org/jira/browse/HDFS-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090677#comment-16090677 ] Sean Mackrory commented on HDFS-12151: -- The problem is in lines added by HDFS-9807. When serving clients from before that feature, no storage IDs will be provided, but we unconditionally address the first element in the array. I was able to do a successful rolling upgrade from HDFS 2 -> HDFS 3 by just checking the length first and passing in null by default. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
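The defensive check described in the comment above can be illustrated with a small sketch. The helper class and method names here are hypothetical, not the actual DataXceiver code: the idea is simply to index into the array only when the client actually sent storage IDs, and fall back to null otherwise.

```java
// Hypothetical sketch of the fix described above: a Hadoop 2 client sends no
// target storage IDs, so indexing targetStorageIds[0] unconditionally throws
// ArrayIndexOutOfBoundsException. Checking the length first and defaulting
// to null keeps the old clients working.
public class StorageIdCompat {
    public static String firstStorageIdOrNull(String[] targetStorageIds) {
        return (targetStorageIds != null && targetStorageIds.length > 0)
            ? targetStorageIds[0]
            : null;
    }
}
```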
[jira] [Updated] (HDFS-12151) Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes
[ https://issues.apache.org/jira/browse/HDFS-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HDFS-12151: - Description: Trying to write to a Hadoop 3 DataNode with a Hadoop 2 client currently fails. On the client side it looks like this: {code} 17/07/14 13:31:58 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException: Premature EOF: no length prefix available at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2280) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1318) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1237) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449){code} But on the DataNode side there's an ArrayOutOfBoundsException because there aren't any targetStorageIds: {code} java.lang.ArrayIndexOutOfBoundsException: 0 at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:815) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:173) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:107) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:290) at java.lang.Thread.run(Thread.java:745){code} was: Trying to write to a Hadoop 3 DataNode with a Hadoop 2 client currently fails. 
On the client side it looks like this: {code} 17/07/14 13:31:58 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException: Premature EOF: no length prefix available at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2280) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1318) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1237) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449){code} But on the DataNode side there's an ArrayOutOfBoundsException because there aren't any targetStorageTypes: {code} java.lang.ArrayIndexOutOfBoundsException: 0 at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:815) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:173) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:107) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:290) at java.lang.Thread.run(Thread.java:745){code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11096) Support rolling upgrade between 2.x and 3.x
[ https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089926#comment-16089926 ] Sean Mackrory commented on HDFS-11096: -- Filed HDFS-12151 after looking into the logs deeper than I have in a while and seeing that stuff is failing once you start rolling the DataNodes. In a nutshell, Hadoop 2 clients can't write to Hadoop 3 DataNodes, so everything is falling apart at that point. I've been making the mistake of assuming all-processes-exit-0 => everything-is-working-great, but if jobs fail completely the CLI still returns 0. I do believe I've checked for actual success before, so I believe this used to work and broke fairly recently, but I'm going to dig deeper. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-12151) Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes
Sean Mackrory created HDFS-12151: Summary: Hadoop 2 clients cannot writeBlock to Hadoop 3 DataNodes Key: HDFS-12151 URL: https://issues.apache.org/jira/browse/HDFS-12151 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Sean Mackrory Assignee: Sean Mackrory -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11096) Support rolling upgrade between 2.x and 3.x
[ https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081348#comment-16081348 ] Sean Mackrory commented on HDFS-11096: -- That's a good idea, [~rchiang]. Adding in some delays between each step should be trivial and would dramatically increase the surface area of the test. Also, regarding the YARN-3583 issue, it was fortunately addressed by YARN-6143. I have marked the JIRAs as related. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11096) Support rolling upgrade between 2.x and 3.x
[ https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16080798#comment-16080798 ] Sean Mackrory commented on HDFS-11096: -- For anyone following this thread, I've come back to this and pushed some updates to the tests at https://github.com/mackrorysd/hadoop-compatibility. * fixes for some idempotence / SSH automation problems that could've popped up before in the rolling upgrade test, and actually validating the sorted data. * [~eddyxu] wrote a little framework for writing tests against mini HDFS clusters of 2 different versions in Python, and a test that you can cp between 2 clusters. I did a bit of refactoring and added tests that check for similar output for most of the "hdfs dfs" commands currently in Hadoop 2. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10702) Add a Client API and Proxy Provider to enable stale read from Standby
[ https://issues.apache.org/jira/browse/HDFS-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HDFS-10702: - Status: Open (was: Patch Available) > Add a Client API and Proxy Provider to enable stale read from Standby > - > > Key: HDFS-10702 > URL: https://issues.apache.org/jira/browse/HDFS-10702 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Jiayi Zhou >Assignee: Sean Mackrory >Priority: Minor > Attachments: HDFS-10702.001.patch, HDFS-10702.002.patch, > HDFS-10702.003.patch, HDFS-10702.004.patch, HDFS-10702.005.patch, > HDFS-10702.006.patch, HDFS-10702.007.patch, HDFS-10702.008.patch, > StaleReadfromStandbyNN.pdf > > > Currently, clients must always talk to the active NameNode when performing > any metadata operation, which means active NameNode could be a bottleneck for > scalability. One way to solve this problem is to send read-only operations to > Standby NameNode. The disadvantage is that it might be a stale read. > Here, I'm thinking of adding a Client API to enable/disable stale read from > Standby which gives Client the power to set the staleness restriction. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10702) Add a Client API and Proxy Provider to enable stale read from Standby
[ https://issues.apache.org/jira/browse/HDFS-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HDFS-10702: - Attachment: HDFS-10702.008.patch Rebasing the patch on more recent changes. I had also discovered that one of the tests was failing: if you disable stale reads client-side and a fail-over happens before a write operation, you might still hit a standby namenode, and the server might still return a response. I did a quick fix by also setting the minimum txId to Long.MAX_VALUE. I'm not sure that's how I want to fix it: it means anyone would have to reset the txId when enabling (although in practice I doubt that's a problem, as long as they know they have to, and I doubt that will be a common use case). I'm going to look at how best to address the potential getBlockLocations issue as well. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
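The quick fix described in the comment above - rejecting standby responses by raising the client's minimum transaction ID to Long.MAX_VALUE - can be sketched as a simple client-side gate. The class and method names are hypothetical, not the patch's actual API; the sketch only illustrates the staleness check.

```java
// Hypothetical sketch of the client-side staleness gate described above.
// A namenode response is accepted only if its transaction ID has caught up
// to the client's required minimum. Setting the minimum to Long.MAX_VALUE
// (the quick fix) rejects every standby response once stale reads are
// disabled, which is why the txId must be reset when re-enabling them.
public class StaleReadGate {
    private long minRequiredTxId = 0;

    public void disableStaleReads() {
        minRequiredTxId = Long.MAX_VALUE; // reject all standby responses
    }

    public void enableStaleReads(long lastSeenTxId) {
        minRequiredTxId = lastSeenTxId;   // caller must reset the txId here
    }

    public boolean acceptResponse(long serverTxId) {
        return serverTxId >= minRequiredTxId;
    }
}
```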
[jira] [Assigned] (HDFS-10702) Add a Client API and Proxy Provider to enable stale read from Standby
[ https://issues.apache.org/jira/browse/HDFS-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Mackrory reassigned HDFS-10702:
------------------------------------
    Assignee: Sean Mackrory  (was: Jiayi Zhou)

> Add a Client API and Proxy Provider to enable stale read from Standby
> ---------------------------------------------------------------------
>
>                 Key: HDFS-10702
>                 URL: https://issues.apache.org/jira/browse/HDFS-10702
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Jiayi Zhou
>            Assignee: Sean Mackrory
>            Priority: Minor
>         Attachments: HDFS-10702.001.patch, HDFS-10702.002.patch, HDFS-10702.003.patch, HDFS-10702.004.patch, HDFS-10702.005.patch, HDFS-10702.006.patch, HDFS-10702.007.patch, StaleReadfromStandbyNN.pdf
>
> Currently, clients must always talk to the active NameNode when performing any metadata operation, which means active NameNode could be a bottleneck for scalability. One way to solve this problem is to send read-only operations to Standby NameNode. The disadvantage is that it might be a stale read.
>
> Here, I'm thinking of adding a Client API to enable/disable stale read from Standby which gives Client the power to set the staleness restriction.
[jira] [Commented] (HDFS-11661) GetContentSummary uses excessive amounts of memory
[ https://issues.apache.org/jira/browse/HDFS-11661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979551#comment-15979551 ]

Sean Mackrory commented on HDFS-11661:
--------------------------------------
+1 to the revert - I too would still like to see the original problem fixed, but this is worse. It does indeed require global context to do correctly, so it'll require some cleverness to make sure we do that without using tons of space or locking for a long time.

[~jojochuang] - to revert cleanly we can revert HDFS-11515 first (unless I'm missing something and that patch does more than just correct the original changes in HDFS-10797) and then HDFS-10797. As [~xiaochen] is not available right now, would you be able to commit the revert when we're satisfied? I'll run tests with the reverts committed locally...

> GetContentSummary uses excessive amounts of memory
> --------------------------------------------------
>
>                 Key: HDFS-11661
>                 URL: https://issues.apache.org/jira/browse/HDFS-11661
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.8.0, 3.0.0-alpha2
>            Reporter: Nathan Roberts
>            Priority: Blocker
>         Attachments: Heap growth.png
>
> ContentSummaryComputationContext::nodeIncluded() is being used to keep track of all INodes visited during the current content summary calculation. This can be all of the INodes in the filesystem, making for a VERY large hash table. This simply won't work on large filesystems.
>
> We noticed this after upgrading: a namenode with ~100 million filesystem objects was spending significantly more time in GC. Fortunately this system had some memory breathing room; other clusters we have will not run with this additional demand on memory.
>
> This was added as part of HDFS-10797 as a way of keeping track of INodes that have already been accounted for, to avoid double counting.
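A minimal sketch of the problematic pattern described above, with invented names (the real code lives in ContentSummaryComputationContext): a set that records every visited inode id for the lifetime of one getContentSummary call prevents double counting, but it never evicts, so on a filesystem with ~100 million objects the set itself holds on the order of 100 million entries for the duration of the call.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of the memory pattern behind HDFS-11661,
// not the actual NameNode code.
class VisitedTracker {
    // Grows monotonically: one entry per distinct inode visited,
    // retained until the whole traversal finishes.
    private final Set<Long> visited = new HashSet<>();

    // Returns true the first time an inode id is seen, so callers can
    // skip inodes that were already accounted for (no double counting).
    boolean markVisited(long inodeId) {
        return visited.add(inodeId);
    }

    // On a large filesystem this approaches the total inode count.
    int trackedCount() {
        return visited.size();
    }

    public static void main(String[] args) {
        VisitedTracker tracker = new VisitedTracker();
        System.out.println(tracker.markVisited(1L)); // true: first visit counts
        System.out.println(tracker.markVisited(1L)); // false: duplicate skipped
        System.out.println(tracker.trackedCount());  // 1
    }
}
```

Doing this correctly without the big set is what the comment means by needing "global context": some other way to detect re-entry (e.g. via snapshot/rename structure) that doesn't grow with the number of inodes traversed.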
[jira] [Commented] (HDFS-11096) Support rolling upgrade between 2.x and 3.x
[ https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15816505#comment-15816505 ]

Sean Mackrory commented on HDFS-11096:
--------------------------------------
The getHdfsBlockLocations removal is documented in HDFS-8895 - apparently that had been deprecated. I filed HDFS-11312 and posted a patch for the nonDfsUsed discrepancy.

> Support rolling upgrade between 2.x and 3.x
> -------------------------------------------
>
>                 Key: HDFS-11096
>                 URL: https://issues.apache.org/jira/browse/HDFS-11096
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: rolling upgrades
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Andrew Wang
>            Priority: Blocker
>
> trunk has a minimum software version of 3.0.0-alpha1. This means we can't rolling upgrade between branch-2 and trunk.
>
> This is a showstopper for large deployments. Unless there are very compelling reasons to break compatibility, let's restore the ability to rolling upgrade to 3.x releases.
[jira] [Updated] (HDFS-11312) Discrepancy in nonDfsUsed index in protobuf
[ https://issues.apache.org/jira/browse/HDFS-11312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Mackrory updated HDFS-11312:
---------------------------------
    Attachment: HDFS-11312.001.patch

> Discrepancy in nonDfsUsed index in protobuf
> -------------------------------------------
>
>                 Key: HDFS-11312
>                 URL: https://issues.apache.org/jira/browse/HDFS-11312
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Sean Mackrory
>            Assignee: Sean Mackrory
>            Priority: Minor
>         Attachments: HDFS-11312.001.patch
>
> The patches for HDFS-9038 had a discrepancy between trunk and branch-2.7: in one message type, nonDfsUsed is given 2 different indices. This is a minor wire incompatibility that is easy to fix...
[jira] [Created] (HDFS-11312) Discrepancy in nonDfsUsed index in protobuf
Sean Mackrory created HDFS-11312:
------------------------------------

             Summary: Discrepancy in nonDfsUsed index in protobuf
                 Key: HDFS-11312
                 URL: https://issues.apache.org/jira/browse/HDFS-11312
             Project: Hadoop HDFS
          Issue Type: Bug
            Reporter: Sean Mackrory
            Assignee: Sean Mackrory
            Priority: Minor

The patches for HDFS-9038 had a discrepancy between trunk and branch-2.7: in one message type, nonDfsUsed is given 2 different indices. This is a minor wire incompatibility that is easy to fix...
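To see why two different field indices for the same protobuf field break wire compatibility: protobuf prefixes every serialized field with a varint tag computed as (fieldNumber << 3) | wireType. If trunk and branch-2.7 assign nonDfsUsed different field numbers, each side emits a tag the other does not recognize as that field. A small sketch of the tag arithmetic (the field numbers 9 and 10 here are made up for illustration, not the actual values from HDFS-9038):

```java
// Hypothetical illustration of protobuf tag encoding; field numbers are
// invented, not the real nonDfsUsed indices.
class ProtoTag {
    // Wire type 0 = varint, which is what a uint64 counter like
    // nonDfsUsed would use.
    static final int WIRETYPE_VARINT = 0;

    // Standard protobuf tag arithmetic: tag = (fieldNumber << 3) | wireType.
    static int makeTag(int fieldNumber, int wireType) {
        return (fieldNumber << 3) | wireType;
    }

    public static void main(String[] args) {
        int trunkTag  = makeTag(9,  WIRETYPE_VARINT); // 72 (0x48)
        int branchTag = makeTag(10, WIRETYPE_VARINT); // 80 (0x50)
        // Different tags on the wire: a receiver built against the other
        // index treats the incoming field as an unknown field.
        System.out.println(trunkTag != branchTag);    // true
    }
}
```

Unknown fields are skipped rather than rejected, which is why a mismatch like this shows up as a silently wrong or missing value instead of a parse error, and why it is a minor, easy-to-fix incompatibility rather than a crash.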