[jira] [Commented] (HDFS-8234) DistributedFileSystem and Globber should apply PathFilter early
[ https://issues.apache.org/jira/browse/HDFS-8234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511306#comment-14511306 ]

Rohini Palaniswamy commented on HDFS-8234:
------------------------------------------

I am not working on it. Please go ahead.

DistributedFileSystem and Globber should apply PathFilter early
---------------------------------------------------------------

Key: HDFS-8234
URL: https://issues.apache.org/jira/browse/HDFS-8234
Project: Hadoop HDFS
Issue Type: Improvement
Reporter: Rohini Palaniswamy
Assignee: J.Andreina
Labels: newbie

HDFS-985 added partial listing in listStatus to avoid listing all entries of a large directory in one go. When listStatus(Path p, PathFilter f) is called, the filter is applied only after all the entries have been fetched, so a big list is constructed on the client side. It would be more efficient if DistributedFileSystem.listStatusInternal() applied the PathFilter. So DistributedFileSystem should override listStatus(Path f, PathFilter filter) and apply the PathFilter early.

Globber.java also applies the filter after calling listStatus. It should call listStatus with the PathFilter.
{code}
FileStatus[] children = listStatus(candidate.getPath());
...
for (FileStatus child : children) {
  // Set the child path based on the parent path.
  child.setPath(new Path(candidate.getPath(), child.getPath().getName()));
  if (globFilter.accept(child.getPath())) {
    newCandidates.add(child);
  }
}
{code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
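To make the request concrete, here is a minimal, self-contained sketch of the difference between filtering after the whole listing is built and filtering each partial batch as it arrives. It does not use the Hadoop API: EarlyFilterSketch, listBatch, the batch size and the file names are all hypothetical stand-ins for the partial listing that HDFS-985 introduced, and a java.util.function.Predicate plays the role of PathFilter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical stand-in for a paged directory listing; this is not the Hadoop API. */
final class EarlyFilterSketch {

    /** Returns at most batchSize names that sort after startAfter (null means "from the start"). */
    static List<String> listBatch(List<String> sortedDir, String startAfter, int batchSize) {
        List<String> batch = new ArrayList<>();
        for (String name : sortedDir) {
            if (startAfter != null && name.compareTo(startAfter) <= 0) {
                continue; // already returned in an earlier batch
            }
            batch.add(name);
            if (batch.size() == batchSize) {
                break;
            }
        }
        return batch;
    }

    /** Late filtering: the complete listing is materialized before the filter runs. */
    static List<String> listThenFilter(List<String> dir, int batchSize, Predicate<String> filter) {
        List<String> all = new ArrayList<>();
        String last = null;
        List<String> batch;
        while (!(batch = listBatch(dir, last, batchSize)).isEmpty()) {
            all.addAll(batch);
            last = batch.get(batch.size() - 1);
        }
        List<String> accepted = new ArrayList<>();
        for (String name : all) {
            if (filter.test(name)) {
                accepted.add(name);
            }
        }
        return accepted;
    }

    /** Early filtering: each batch is filtered on arrival, so only accepted names are retained. */
    static List<String> filterWhileListing(List<String> dir, int batchSize, Predicate<String> filter) {
        List<String> accepted = new ArrayList<>();
        String last = null;
        List<String> batch;
        while (!(batch = listBatch(dir, last, batchSize)).isEmpty()) {
            for (String name : batch) {
                if (filter.test(name)) {
                    accepted.add(name);
                }
            }
            last = batch.get(batch.size() - 1);
        }
        return accepted;
    }

    public static void main(String[] args) {
        // The directory contents must be sorted for listBatch to page correctly.
        List<String> dir = Arrays.asList("_SUCCESS", "part-00000", "part-00001", "part-00002", "tmp-1");
        Predicate<String> partsOnly = name -> name.startsWith("part-");
        System.out.println(listThenFilter(dir, 2, partsOnly));
        System.out.println(filterWhileListing(dir, 2, partsOnly));
    }
}
```

Both methods return the same entries; the early variant simply never holds the unfiltered listing in memory, which is the saving the description asks for.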
[jira] [Updated] (HDFS-8234) DistributedFileSystem and Globber should apply PathFilter early
[ https://issues.apache.org/jira/browse/HDFS-8234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated HDFS-8234:
-------------------------------------

Description:
HDFS-985 added partial listing in listStatus to avoid listing all entries of a large directory in one go. When listStatus(Path p, PathFilter f) is called, the filter is applied only after all the entries have been fetched, so a big list is constructed on the client side. It would be more efficient if DistributedFileSystem.listStatusInternal() applied the PathFilter. So DistributedFileSystem should override listStatus(Path f, PathFilter filter) and apply the PathFilter early.

Globber.java also applies the filter after calling listStatus. It should call listStatus with the PathFilter.
{code}
FileStatus[] children = listStatus(candidate.getPath());
...
for (FileStatus child : children) {
  // Set the child path based on the parent path.
  child.setPath(new Path(candidate.getPath(), child.getPath().getName()));
  if (globFilter.accept(child.getPath())) {
    newCandidates.add(child);
  }
}
{code}

was:
HDFS-985 added partial listing in listStatus to avoid listing all entries of a large directory in one go. When listStatus(Path p, PathFilter f) is called, the filter is applied only after all the entries have been fetched, so a big list is constructed on the client side. It would be more efficient if DistributedFileSystem.listStatusInternal() applied the PathFilter.

Summary: DistributedFileSystem and Globber should apply PathFilter early (was: DistributedFileSystem should override listStatus(Path f, PathFilter filter))

DistributedFileSystem and Globber should apply PathFilter early
---------------------------------------------------------------

Key: HDFS-8234
URL: https://issues.apache.org/jira/browse/HDFS-8234
Project: Hadoop HDFS
Issue Type: Improvement
Reporter: Rohini Palaniswamy
Labels: newbie
[jira] [Created] (HDFS-8234) DistributedFileSystem should override listStatus(Path f, PathFilter filter)
Rohini Palaniswamy created HDFS-8234:
-------------------------------------

Summary: DistributedFileSystem should override listStatus(Path f, PathFilter filter)
Key: HDFS-8234
URL: https://issues.apache.org/jira/browse/HDFS-8234
Project: Hadoop HDFS
Issue Type: Improvement
Reporter: Rohini Palaniswamy

HDFS-985 added partial listing in listStatus to avoid listing all entries of a large directory in one go. When listStatus(Path p, PathFilter f) is called, the filter is applied only after all the entries have been fetched, so a big list is constructed on the client side. It would be more efficient if DistributedFileSystem.listStatusInternal() applied the PathFilter.
[jira] [Updated] (HDFS-8234) DistributedFileSystem should override listStatus(Path f, PathFilter filter)
[ https://issues.apache.org/jira/browse/HDFS-8234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated HDFS-8234:
-------------------------------------

Labels: newbie (was: )

DistributedFileSystem should override listStatus(Path f, PathFilter filter)
---------------------------------------------------------------------------

Key: HDFS-8234
URL: https://issues.apache.org/jira/browse/HDFS-8234
Project: Hadoop HDFS
Issue Type: Improvement
Reporter: Rohini Palaniswamy
Labels: newbie

HDFS-985 added partial listing in listStatus to avoid listing all entries of a large directory in one go. When listStatus(Path p, PathFilter f) is called, the filter is applied only after all the entries have been fetched, so a big list is constructed on the client side. It would be more efficient if DistributedFileSystem.listStatusInternal() applied the PathFilter.
[jira] [Commented] (HDFS-4564) Webhdfs returns incorrect http response codes for denied operations
[ https://issues.apache.org/jira/browse/HDFS-4564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887245#comment-13887245 ]

Rohini Palaniswamy commented on HDFS-4564:
------------------------------------------

This broke Oozie. AuthenticationFilter.java in hadoop-auth uses a different Signer on every restart. Old auth tokens used to be rejected (SignerException) with 401 and Negotiate, making the client authenticate with Kerberos. Now old auth tokens are rejected with 403, causing the client to fail.

Webhdfs returns incorrect http response codes for denied operations
-------------------------------------------------------------------

Key: HDFS-4564
URL: https://issues.apache.org/jira/browse/HDFS-4564
Project: Hadoop HDFS
Issue Type: Sub-task
Components: webhdfs
Affects Versions: 0.23.0, 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Blocker
Attachments: HDFS-4564.branch-23.patch, HDFS-4564.branch-23.patch, HDFS-4564.patch

Webhdfs is returning 401 (Unauthorized) instead of 403 (Forbidden) when it's denying operations. Examples include rejecting invalid proxy user attempts and renew/cancel with an invalid user.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
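The regression described in the comment comes down to how a client reacts to the two status codes. The sketch below is only an illustration of that decision, assuming a client that re-authenticates when it sees 401 together with a Negotiate challenge and gives up on anything else; StaleTokenSketch, Action and onResponse are hypothetical names, not code from Oozie or hadoop-auth.

```java
import java.net.HttpURLConnection;
import java.util.Locale;

/** Hypothetical client-side decision logic; not code from Oozie or hadoop-auth. */
final class StaleTokenSketch {

    enum Action { PROCEED, REAUTHENTICATE, FAIL }

    /**
     * Decides what to do with a response, given its status code and the value
     * of the WWW-Authenticate header (null if the header is absent).
     */
    static Action onResponse(int status, String wwwAuthenticate) {
        if (status >= 200 && status < 300) {
            return Action.PROCEED;
        }
        boolean negotiate = wwwAuthenticate != null
                && wwwAuthenticate.trim().toLowerCase(Locale.ROOT).startsWith("negotiate");
        if (status == HttpURLConnection.HTTP_UNAUTHORIZED && negotiate) {
            // 401 plus a Negotiate challenge: the stale token can be replaced via Kerberos.
            return Action.REAUTHENTICATE;
        }
        // 403 (and everything else) carries no challenge, so the client cannot recover.
        return Action.FAIL;
    }

    public static void main(String[] args) {
        System.out.println(onResponse(HttpURLConnection.HTTP_UNAUTHORIZED, "Negotiate"));
        System.out.println(onResponse(HttpURLConnection.HTTP_FORBIDDEN, null));
    }
}
```

Under this assumption, a stale token answered with 401 and Negotiate leads to a fresh Kerberos handshake, while the same token answered with 403 ends the request, which matches the failure reported above.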
[jira] Commented: (HDFS-1317) HDFSProxy needs additional changes to work after changes to streamFile servlet in HDFS-1109
[ https://issues.apache.org/jira/browse/HDFS-1317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893727#action_12893727 ]

Rohini Palaniswamy commented on HDFS-1317:
------------------------------------------

No new test was added for this patch, as the changes were made to the servlet filter. Writing a unit test would involve stubbing out HttpServletRequest and other servlet-api classes, which is not worth the effort for the minor change of updating a regex pattern to be in sync with the changed contract. Manually verified by deploying hdfsproxy on Tomcat and testing that streamFile works.

HDFSProxy needs additional changes to work after changes to streamFile servlet in HDFS-1109
-------------------------------------------------------------------------------------------

Key: HDFS-1317
URL: https://issues.apache.org/jira/browse/HDFS-1317
Project: Hadoop HDFS
Issue Type: Bug
Components: contrib/hdfsproxy
Affects Versions: 0.22.0
Reporter: Rohini Palaniswamy
Assignee: Rohini Palaniswamy
Fix For: 0.22.0
Attachments: HDFS-1317-trunk.patch, HDFS-1317.patch

Before HDFS-1109, streamFile had the filename passed as a parameter in the request. With HDFS-1109, the filename is part of the request path, similar to the listPaths and data servlets. The AuthorizationFilter in HdfsProxy needs updating to pick up the path from the request path instead of looking for the filename parameter.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
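As a rough illustration of the kind of change described, a filter can pull the file path out of the request path with a single pattern that covers streamFile alongside listPaths and data. The pattern and the names RequestPathSketch and extractFilePath are assumptions made for this sketch; the actual regex in the HdfsProxy AuthorizationFilter is not shown in this thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical path extraction; the real AuthorizationFilter pattern may differ. */
final class RequestPathSketch {

    // Assumed layout: /<servlet>/<absolute file path>, with streamFile treated
    // the same way as listPaths and data.
    private static final Pattern SERVLET_PATH =
            Pattern.compile("^/(listPaths|data|streamFile)(/.*)$");

    /** Returns the file path carried in the request path, or null if it does not match. */
    static String extractFilePath(String requestPath) {
        if (requestPath == null) {
            return null;
        }
        Matcher m = SERVLET_PATH.matcher(requestPath);
        return m.matches() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        System.out.println(extractFilePath("/streamFile/user/alice/a+b.txt"));
        System.out.println(extractFilePath("/other/user/alice/a+b.txt"));
    }
}
```

A method like this needs only a string, so it can be checked without stubbing HttpServletRequest.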
[jira] Commented: (HDFS-1317) HDFSProxy needs additional changes to work after changes to streamFile servlet in HDFS-1109
[ https://issues.apache.org/jira/browse/HDFS-1317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893144#action_12893144 ]

Rohini Palaniswamy commented on HDFS-1317:
------------------------------------------

None of the failed tests are caused by this patch. The related JIRA issues are listed below.

TestBlockToken: https://issues.apache.org/jira/browse/HDFS-1284
TestFileAppend4.testRecoverFinalizedBlock: https://issues.apache.org/jira/browse/HDFS-1306
TestFileAppend4.testCompleteOtherLeaseHoldersFile: https://issues.apache.org/jira/browse/HDFS-1306
TestHdfsProxy: https://issues.apache.org/jira/browse/HDFS-1301

https://issues.apache.org/jira/browse/HDFS-1301 was not committed to trunk and it needs to be done. But even with that patch, TestHdfsProxy is failing for me with unauthorized for user x via x. Investigating it.

HDFSProxy needs additional changes to work after changes to streamFile servlet in HDFS-1109
-------------------------------------------------------------------------------------------

Key: HDFS-1317
URL: https://issues.apache.org/jira/browse/HDFS-1317
Project: Hadoop HDFS
Issue Type: Bug
Components: contrib/hdfsproxy
Affects Versions: 0.22.0
Reporter: Rohini Palaniswamy
Assignee: Rohini Palaniswamy
Fix For: 0.22.0
Attachments: HDFS-1317-trunk.patch, HDFS-1317.patch

Before HDFS-1109, streamFile had the filename passed as a parameter in the request. With HDFS-1109, the filename is part of the request path, similar to the listPaths and data servlets. The AuthorizationFilter in HdfsProxy needs updating to pick up the path from the request path instead of looking for the filename parameter.
[jira] Updated: (HDFS-1317) HDFSProxy needs additional changes to work after changes to streamFile servlet in HDFS-1109
[ https://issues.apache.org/jira/browse/HDFS-1317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated HDFS-1317:
-------------------------------------

Attachment: HDFS-1317-trunk.patch

Patch for apache trunk.

HDFSProxy needs additional changes to work after changes to streamFile servlet in HDFS-1109
-------------------------------------------------------------------------------------------

Key: HDFS-1317
URL: https://issues.apache.org/jira/browse/HDFS-1317
Project: Hadoop HDFS
Issue Type: Bug
Components: contrib/hdfsproxy
Affects Versions: 0.21.0
Reporter: Rohini Palaniswamy
Assignee: Rohini Palaniswamy
Fix For: 0.22.0
Attachments: HDFS-1317-trunk.patch, HDFS-1317.patch

Before HDFS-1109, streamFile had the filename passed as a parameter in the request. With HDFS-1109, the filename is part of the request path, similar to the listPaths and data servlets. The AuthorizationFilter in HdfsProxy needs updating to pick up the path from the request path instead of looking for the filename parameter.
[jira] Created: (HDFS-1317) HDFSProxy needs additional changes to work after changes to streamFile servlet in HDFS-1109
HDFSProxy needs additional changes to work after changes to streamFile servlet in HDFS-1109
-------------------------------------------------------------------------------------------

Key: HDFS-1317
URL: https://issues.apache.org/jira/browse/HDFS-1317
Project: Hadoop HDFS
Issue Type: Bug
Components: contrib/hdfsproxy
Affects Versions: 0.21.0
Reporter: Rohini Palaniswamy
Fix For: 0.21.0

Before HDFS-1109, streamFile had the filename passed as a parameter in the request. With HDFS-1109, the filename is part of the request path, similar to the listPaths and data servlets. The AuthorizationFilter in HdfsProxy needs updating to pick up the path from the request path instead of looking for the filename parameter.
[jira] Commented: (HDFS-1109) HFTP and URL Encoding
[ https://issues.apache.org/jira/browse/HDFS-1109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891504#action_12891504 ]

Rohini Palaniswamy commented on HDFS-1109:
------------------------------------------

Thanks. Created issue https://issues.apache.org/jira/browse/HDFS-1317 for the fix in HDFSProxy.

HFTP and URL Encoding
---------------------

Key: HDFS-1109
URL: https://issues.apache.org/jira/browse/HDFS-1109
Project: Hadoop HDFS
Issue Type: Bug
Components: contrib/hdfsproxy, data-node
Affects Versions: 0.20.1, 0.20.2, 0.20.3, 0.21.0, 0.22.0
Reporter: Dmytro Molkov
Assignee: Dmytro Molkov
Fix For: 0.22.0
Attachments: HDFS-1109.2.patch, HDFS-1109.2_y0.20.1xx.patch, HDFS-1109.2_y0.20.1xx_incremental.patch, HDFS-1109.patch

We just saw this error happen in our cluster. If a file has a + sign in its name, it is not readable through the HFTP protocol. The problem is that when we read a file with HFTP, we pass the name of the file as a parameter in the request, and the + gets decoded into a space on the server side. So the datanode receiving the streamFile request tries to access a file with a space instead of + in the name and doesn't find that file.

The proposed solution is to pass the filename as part of the URL, as with all the other HFTP commands, since this is the only place where it is not treated this way. Are there any objections to this?
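The behaviour described in the issue can be reproduced with the JDK alone. The sketch below assumes that the server decodes query parameters with form-style URL decoding (which is what java.net.URLDecoder implements) and that a file name placed in the URL path is read back through java.net.URI; PlusSignSketch and the example file names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLDecoder;

/** Demonstrates why a '+' survives in a URL path but not in a form-decoded parameter. */
final class PlusSignSketch {

    /** Form-style decoding, as applied to query parameters: '+' becomes a space. */
    static String decodeAsQueryParam(String raw) {
        try {
            return URLDecoder.decode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    /** Path decoding: only percent-escapes are decoded, so '+' is left untouched. */
    static String decodeAsPath(String rawPath) {
        return URI.create(rawPath).getPath();
    }

    public static void main(String[] args) {
        System.out.println(decodeAsQueryParam("/user/alice/a+b.txt"));
        System.out.println(decodeAsPath("/streamFile/user/alice/a+b.txt"));
    }
}
```

The first call yields a name containing a space, which is the file the datanode then fails to find; the second keeps the + intact, which is why moving the filename into the path avoids the problem.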
[jira] Updated: (HDFS-1317) HDFSProxy needs additional changes to work after changes to streamFile servlet in HDFS-1109
[ https://issues.apache.org/jira/browse/HDFS-1317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated HDFS-1317:
-------------------------------------

Fix Version/s: 0.22.0 (was: 0.21.0)

HDFSProxy needs additional changes to work after changes to streamFile servlet in HDFS-1109
-------------------------------------------------------------------------------------------

Key: HDFS-1317
URL: https://issues.apache.org/jira/browse/HDFS-1317
Project: Hadoop HDFS
Issue Type: Bug
Components: contrib/hdfsproxy
Affects Versions: 0.21.0
Reporter: Rohini Palaniswamy
Assignee: Rohini Palaniswamy
Fix For: 0.22.0
Attachments: HDFS-1317.patch

Before HDFS-1109, streamFile had the filename passed as a parameter in the request. With HDFS-1109, the filename is part of the request path, similar to the listPaths and data servlets. The AuthorizationFilter in HdfsProxy needs updating to pick up the path from the request path instead of looking for the filename parameter.
[jira] Created: (HDFS-1313) HdfsProxy changes from HDFS-481 missed in y20.1xx
HdfsProxy changes from HDFS-481 missed in y20.1xx
-------------------------------------------------

Key: HDFS-1313
URL: https://issues.apache.org/jira/browse/HDFS-1313
Project: Hadoop HDFS
Issue Type: Bug
Components: contrib/hdfsproxy
Affects Versions: 0.21.0
Reporter: Rohini Palaniswamy
Fix For: 0.21.0

Some changes went missing between https://issues.apache.org/jira/secure/attachment/12441028/HDFS-481-bp-y20s.patch and https://issues.apache.org/jira/secure/attachment/12442210/HDFS-481-NEW.patch. The missing changes went into y20.100 but not into y20.1xx. This patch fixes it. Without it, hdfsproxy does not work.
[jira] Updated: (HDFS-1313) HdfsProxy changes from HDFS-481 missed in y20.1xx
[ https://issues.apache.org/jira/browse/HDFS-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated HDFS-1313:
-------------------------------------

Attachment: HDFS-1313.y0.20.1xx.patch

Patch for y20.1xx.

HdfsProxy changes from HDFS-481 missed in y20.1xx
-------------------------------------------------

Key: HDFS-1313
URL: https://issues.apache.org/jira/browse/HDFS-1313
Project: Hadoop HDFS
Issue Type: Bug
Components: contrib/hdfsproxy
Affects Versions: 0.21.0
Reporter: Rohini Palaniswamy
Fix For: 0.21.0
Attachments: HDFS-1313.y0.20.1xx.patch

Some changes went missing between https://issues.apache.org/jira/secure/attachment/12441028/HDFS-481-bp-y20s.patch and https://issues.apache.org/jira/secure/attachment/12442210/HDFS-481-NEW.patch. The missing changes went into y20.100 but not into y20.1xx. This patch fixes it. Without it, hdfsproxy does not work.