zhihai xu created HADOOP-12443:
----------------------------------

             Summary: LocalDirAllocator shouldn't accept pathStr parameter with scheme or authority.
                 Key: HADOOP-12443
                 URL: https://issues.apache.org/jira/browse/HADOOP-12443
             Project: Hadoop Common
          Issue Type: Improvement
          Components: fs
            Reporter: zhihai xu
            Assignee: zhihai xu

{{LocalDirAllocator}} shouldn't accept a {{pathStr}} parameter with a scheme or authority.
Currently {{LocalDirAllocator}} accepts such a {{pathStr}}. When a {{pathStr}} with a scheme or authority is passed to {{getLocalPathForWrite}}, it bypasses {{localDirs}} and is used directly, so the returned Path is independent of {{localDirs}}.
The reason is the following:
{{LocalDirAllocator}} uses {{new Path(new Path(localDirs[dirNumLastAccessed]), pathStr)}} as the returned Path. The constructor code for {{Path}} is
{code}
  public Path(Path parent, Path child) {
    // Add a slash to parent's path so resolution is compatible with URI's
    URI parentUri = parent.uri;
    String parentPath = parentUri.getPath();
    if (!(parentPath.equals("/") || parentPath.isEmpty())) {
      try {
        parentUri = new URI(parentUri.getScheme(), parentUri.getAuthority(),
                      parentUri.getPath()+"/", null, parentUri.getFragment());
      } catch (URISyntaxException e) {
        throw new IllegalArgumentException(e);
      }
    }
    URI resolved = parentUri.resolve(child.uri);
    initialize(resolved.getScheme(), resolved.getAuthority(),
               resolved.getPath(), resolved.getFragment());
  }
{code}
The above {{Path}} constructor calls {{URI#resolve}} to merge the parent path with the child path. {{URI#resolve}} delegates to the following private method:
{code}
    private static URI resolve(URI base, URI child) {
        // check if child if opaque first so that NPE is thrown
        // if child is null.
        if (child.isOpaque() || base.isOpaque())
            return child;

        // 5.2 (2): Reference to current document (lone fragment)
        if ((child.scheme == null) && (child.authority == null)
            && child.path.equals("") && (child.fragment != null)
            && (child.query == null)) {
            if ((base.fragment != null)
                && child.fragment.equals(base.fragment)) {
                return base;
            }
            URI ru = new URI();
            ru.scheme = base.scheme;
            ru.authority = base.authority;
            ru.userInfo = base.userInfo;
            ru.host = base.host;
            ru.port = base.port;
            ru.path = base.path;
            ru.fragment = child.fragment;
            ru.query = base.query;
            return ru;
        }

        // 5.2 (3): Child is absolute
        if (child.scheme != null)
            return child;

        URI ru = new URI();             // Resolved URI
        ru.scheme = base.scheme;
        ru.query = child.query;
        ru.fragment = child.fragment;

        // 5.2 (4): Authority
        if (child.authority == null) {
            ru.authority = base.authority;
            ru.host = base.host;
            ru.userInfo = base.userInfo;
            ru.port = base.port;

            String cp = (child.path == null) ? "" : child.path;
            if ((cp.length() > 0) && (cp.charAt(0) == '/')) {
                // 5.2 (5): Child path is absolute
                ru.path = child.path;
            } else {
                // 5.2 (6): Resolve relative path
                ru.path = resolvePath(base.path, cp, base.isAbsolute());
            }
        } else {
            ru.authority = child.authority;
            ru.host = child.host;
            ru.userInfo = child.userInfo;
            ru.host = child.host;
            ru.port = child.port;
            ru.path = child.path;
        }

        // 5.2 (7): Recombine (nothing to do here)
        return ru;
    }
{code}
You can see that if the child's URI has a scheme, the child is returned unchanged (step 5.2 (3)), and if it has an authority, the child's authority and path are used as they are (the else branch of step 5.2 (4)). In both cases the parent's path is ignored.
This hides the problem from the user. For example, suppose the user passes file:///build/test/temp as the {{pathStr}} parameter to {{getLocalPathForWrite}}. Later on the user may run into a very strange problem: the /build/test/temp directory is full, because the returned path does not come from {{localDirs}}. This makes the issue very difficult for the user to debug.
So it would be better to reject a {{pathStr}} parameter that has a scheme or authority.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
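
The bypass described in this issue can be reproduced with plain {{java.net.URI}}, without Hadoop on the classpath. The following is only a minimal sketch: {{PathStrCheck}}, {{resolveUnderParent}}, {{requireNoSchemeOrAuthority}} and the directory {{/data/local1}} are hypothetical names made up for illustration, {{java.net.URI}} stands in for Hadoop's {{Path}} (whose string parsing differs in some details), and the guard is one possible shape of the proposed check, not the committed fix.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class PathStrCheck {

    // Mirrors the parent handling in the quoted Path constructor: append "/"
    // to the parent path, then resolve the child URI against it.
    static URI resolveUnderParent(String parentDir, String pathStr) {
        try {
            URI parent = new URI(null, null, parentDir + "/", null, null);
            return parent.resolve(URI.create(pathStr));
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Hypothetical guard (an assumption, not LocalDirAllocator's actual code):
    // fail fast when pathStr carries a scheme or an authority.
    static String requireNoSchemeOrAuthority(String pathStr) {
        URI uri = URI.create(pathStr);
        if (uri.getScheme() != null || uri.getAuthority() != null) {
            throw new IllegalArgumentException(
                "pathStr must not contain a scheme or authority: " + pathStr);
        }
        return pathStr;
    }

    public static void main(String[] args) {
        String localDir = "/data/local1"; // made-up entry of localDirs

        // Relative pathStr: the result stays under the local dir.
        System.out.println(resolveUnderParent(localDir, "job/output").getPath());

        // pathStr with a scheme: step 5.2 (3) returns the child unchanged.
        System.out.println(
            resolveUnderParent(localDir, "file:///build/test/temp").getPath());

        // pathStr with an authority only: the child's authority and path win.
        System.out.println(
            resolveUnderParent(localDir, "//host/build/test/temp").getPath());

        // With the guard in place, the bad input is rejected up front.
        try {
            requireNoSchemeOrAuthority("file:///build/test/temp");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Only the first call yields a path under {{/data/local1}}; the other two resolve to {{/build/test/temp}}, which is the silent bypass this issue wants to turn into an explicit error.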