[GitHub] flink pull request: [FLINK-1980] allowing users to decorate input ...
Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/658#issuecomment-103819433 I think the PR is good to merge. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-1980] allowing users to decorate input ...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/658#issuecomment-103883348 Looks good. Will merge this in my next batch... --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-1980] allowing users to decorate input ...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/658#issuecomment-101185563 That is a good point. It is a bit tricky, though, since the skip method does not guarantee efficient skipping. It also reserves the right to skip less (and return the number of bytes skipped, iirc). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-1980] allowing users to decorate input ...
Github user sekruse commented on the pull request: https://github.com/apache/flink/pull/658#issuecomment-101215521 I see your point and double-checked it with the Java doc. I will adjust the seek method accordingly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-1980] allowing users to decorate input ...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/658#issuecomment-101024456 Looks good. One comment, though: - The position tracking in the stream wrapper is a nice idea, but since the `skip()` method is anyways only a hint (it may skip less if it wants) and seeking works only in one direction, why not skip seek support completely? Also simplifies the code and makes it even more lightweight. As a followup: Should we add more built-in decompressors for other file endings, like `*.gz` ? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-1980] allowing users to decorate input ...
Github user sekruse commented on the pull request: https://github.com/apache/flink/pull/658#issuecomment-101064479 I wanted to integrate the gz support with the decorateStream method, therefore I was waiting for this PR to be merged. I can of course skip the seek, but I thought that forward seeking would be useful as it enables splittable file formats. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-1980] allowing users to decorate input ...
Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/658#issuecomment-101026234 I think Sebastian has already filed a JIRA for adding gz read support. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-1980] allowing users to decorate input ...
Github user rmetzger commented on a diff in the pull request: https://github.com/apache/flink/pull/658#discussion_r29920729 --- Diff: flink-core/src/main/java/org/apache/flink/api/common/io/InputStreamFSInputWrapper.java --- @@ -0,0 +1,60 @@ +package org.apache.flink.api.common.io; --- End diff -- Missing license header, that's also why the build fails ;) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-1980] allowing users to decorate input ...
Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/658#issuecomment-100129770 Thank you for this pull request. I'll review it now. The first thing I saw is that the build has failed: ``` [INFO] [INFO] BUILD FAILURE [INFO] [INFO] Total time: 10.728 s [INFO] Finished at: 2015-05-07T12:08:03+00:00 [INFO] Final Memory: 25M/448M [INFO] [ERROR] Failed to execute goal org.apache.rat:apache-rat-plugin:0.11:check (default) on project flink-parent: Too many files with unapproved license: 1 See RAT report in: /home/travis/build/apache/flink/target/rat.txt - [Help 1] ``` https://travis-ci.org/apache/flink/jobs/61608003 You can also do a local `mvn clean install -DskipTests` to run the rat plugin. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-1980] allowing users to decorate input ...
Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/658#issuecomment-100132056 The change looks good, its not completely consistent with our coding guidelines. But a few runs of mvn install -DskipTests will resolve this ;) (also, the indentation is done using spaces, we use tabs). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-1980] allowing users to decorate input ...
Github user rmetzger commented on a diff in the pull request: https://github.com/apache/flink/pull/658#discussion_r29920868 --- Diff: flink-core/src/test/java/org/apache/flink/api/common/io/FileInputFormatTest.java --- @@ -18,24 +18,20 @@ package org.apache.flink.api.common.io; -import java.io.BufferedOutputStream; -import java.io.BufferedWriter; -import java.io.File; -import java.io.FileOutputStream; -import java.io.FileWriter; -import java.io.IOException; +import java.io.*; --- End diff -- As per our coding guidelines, star imports are not allowed: Do not use wildcard imports in the core files. They can cause problems when adding to the code and in some cases even during refactoring. Exceptions are the Tuple classes, Tuple-related utilities, and Flink user programs, when importing operators/functions. Tests are a special case of the user programs. http://flink.apache.org/coding_guidelines.html --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-1980] allowing users to decorate input ...
Github user rmetzger commented on a diff in the pull request: https://github.com/apache/flink/pull/658#discussion_r29920838 --- Diff: flink-core/src/main/java/org/apache/flink/api/common/io/InputStreamFSInputWrapper.java --- @@ -0,0 +1,60 @@ +package org.apache.flink.api.common.io; + +import org.apache.flink.core.fs.FSDataInputStream; + +import java.io.IOException; +import java.io.InputStream; + +/** + * This class wraps an {@link java.io.InputStream} and exposes it as {@link org.apache.flink.core.fs.FSDataInputStream}. + * br + * iNB: {@link #seek(long)} and {@link #getPos()} are currently not supported./i + */ +public class InputStreamFSInputWrapper extends FSDataInputStream { + +private final InputStream inStream; + +private long pos = 0; + +public InputStreamFSInputWrapper(InputStream inStream) { +this.inStream = inStream; +} + +@Override +public void seek(long desired) throws IOException { +if (desired this.pos) { +throw new IllegalArgumentException(Wrapped InputStream: cannot search backwards.); +} else if (desired == pos) { +return; +} + +this.inStream.skip(desired - pos); +this.pos = desired; +} + +@Override +public long getPos() throws IOException { +return this.pos; +} + +@Override +public int read() throws IOException { +int read = inStream.read(); +if (read != -1) { +this.pos++; +} +return read; +} + +@Override +public int read(byte[] b, int off, int len) throws IOException { +int numReadBytes = inStream.read(b, off, len); --- End diff -- This method will return `-1` when the stream is over. Will this corrupt the `pos` ? (Above, you're checking if `read != -1` ?) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-1980] allowing users to decorate input ...
Github user sekruse commented on the pull request: https://github.com/apache/flink/pull/658#issuecomment-100240346 Alright, I incorporated all the feedback. The build seems to be passing now. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-1980] allowing users to decorate input ...
Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/658#issuecomment-100245298 +1 to merge. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---