[
https://issues.apache.org/jira/browse/DRILL-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16052475#comment-16052475
]
ASF GitHub Bot commented on DRILL-3867:
---------------------------------------
Github user vdiravka commented on a diff in the pull request:
https://github.com/apache/drill/pull/824#discussion_r122511871
--- Diff:
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/Metadata.java
---
@@ -526,6 +534,48 @@ private void writeFile(ParquetTableMetadataDirs parquetTableMetadataDirs, Path p
    }

    /**
+   * Serializer for ParquetPath. Writes the path relative to the root path
+   */
+  private static class ParquetPathSerializer extends StdSerializer<ParquetPath> {
+    private final String rootPath;
+
+    ParquetPathSerializer(String rootPath) {
+      super(ParquetPath.class);
+      this.rootPath = rootPath;
+    }
+
+    @Override
+    public void serialize(ParquetPath parquetPath, JsonGenerator jsonGenerator, SerializerProvider serializerProvider) throws IOException, JsonGenerationException {
+      Preconditions.checkState(parquetPath.getFullPath().startsWith(rootPath), String.format("Path %s is not a subpath of %s", parquetPath.getFullPath(), rootPath));
+      String relativePath = parquetPath.getFullPath().replaceFirst(rootPath, "");
--- End diff ---
Hadoop's `Path` doesn't provide a similar way, but it is possible to use the
`relativize()` method from `URI`.
Anyway, in the new approach in `Metadata.createMetaFilesRecursively()`
I've implemented recursive collecting of the inner subdirectories' names to
construct a relative path for every file and directory.
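The `URI.relativize()` idea mentioned above can be sketched roughly as follows. This is an illustration only, not Drill code; the helper class and method names are made up, and it assumes the root ends up with a trailing slash so `relativize()` strips the common prefix:

```java
import java.net.URI;

public class RelativizeSketch {

  // Hypothetical helper: compute fullPath relative to root using
  // java.net.URI.relativize(). Appends a trailing "/" to the root so
  // that relativize() treats it as a directory prefix.
  static String relativize(String root, String fullPath) {
    URI rootUri = URI.create(root.endsWith("/") ? root : root + "/");
    // If fullPath is not under root, relativize() returns fullPath unchanged.
    return rootUri.relativize(URI.create(fullPath)).getPath();
  }

  public static void main(String[] args) {
    // Using the paths from the issue below: the stored value would be
    // the suffix "2006/1" rather than the absolute path.
    System.out.println(relativize(
        "/drill/testdata/metadata_caching/lineitem",
        "/drill/testdata/metadata_caching/lineitem/2006/1"));
  }
}
```

Unlike `String.replaceFirst()`, which interprets its first argument as a regex, `relativize()` compares path segments, so root directories containing regex metacharacters are handled safely.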
> Store relative paths in metadata file
> -------------------------------------
>
> Key: DRILL-3867
> URL: https://issues.apache.org/jira/browse/DRILL-3867
> Project: Apache Drill
> Issue Type: Bug
> Components: Metadata
> Affects Versions: 1.2.0
> Reporter: Rahul Challapalli
> Assignee: Vitalii Diravka
> Fix For: Future
>
>
> git.commit.id.abbrev=cf4f745
> git.commit.time=29.09.2015 @ 23\:19\:52 UTC
> The below sequence of steps reproduces the issue
> 1. Create the cache file
> {code}
> 0: jdbc:drill:zk=10.10.103.60:5181> refresh table metadata dfs.`/drill/testdata/metadata_caching/lineitem`;
> +-------+-------------------------------------------------------------------------------------+
> |  ok   | summary                                                                             |
> +-------+-------------------------------------------------------------------------------------+
> | true  | Successfully updated metadata for table /drill/testdata/metadata_caching/lineitem. |
> +-------+-------------------------------------------------------------------------------------+
> 1 row selected (1.558 seconds)
> {code}
> 2. Move the directory
> {code}
> hadoop fs -mv /drill/testdata/metadata_caching/lineitem /drill/
> {code}
> 3. Now run a query on top of it
> {code}
> 0: jdbc:drill:zk=10.10.103.60:5181> select * from dfs.`/drill/lineitem` limit 1;
> Error: SYSTEM ERROR: FileNotFoundException: Requested file maprfs:///drill/testdata/metadata_caching/lineitem/2006/1 does not exist.
> [Error Id: b456d912-57a0-4690-a44b-140d4964903e on pssc-66.qa.lab:31010] (state=,code=0)
> {code}
> This is obvious given the fact that we are storing absolute file paths in the cache file.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)