[GitHub] flink issue #6290: [Flink-9691] [Kinesis Connector] Attempt to call getRecor...
Github user jgrier commented on the issue: https://github.com/apache/flink/pull/6290 👍 ---
[GitHub] flink pull request #6290: [Flink-9691] [Kinesis Connector] Attempt to call g...
GitHub user jgrier opened a pull request: https://github.com/apache/flink/pull/6290 [Flink-9691] [Kinesis Connector] Attempt to call getRecords() at correct frequency.

## What is the purpose of the change
The purpose of this change is to make the Kinesis connector call getRecords() on the interval specified by the SHARD_GETRECORDS_INTERVAL_MILLIS configuration parameter. Without this change the Kinesis connector sleep()s for the given interval regardless of how long it takes to process the records. This hurts badly when trying to read through a backlog of data, for example. With this change we instead time the loop and sleep() for the specified interval MINUS any time spent processing. This is more in line with what users expect and makes the connector perform much better when there is a backlog of data to read through.

## Brief change log
Time the run loop and subtract that time from the time spent sleeping.

## Verifying this change
This change is a trivial rework / code cleanup without any test coverage.

## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
- The serializers: no
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: no
- The S3 file system connector: no

## Documentation
- Does this pull request introduce a new feature?
no You can merge this pull request into a Git repository by running: $ git pull https://github.com/lyft/flink flink-9691-fix-runloop Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/6290.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #6290 commit 93028c19707776bc6af636e308f9b4763e2a727b Author: Jamie Grier Date: 2018-07-09T21:20:47Z Modify runloop to try to track a particular getRecords() frequency. commit 580c4001bc60cf4d92815047697a47bc6de96cf8 Author: Jamie Grier Date: 2018-07-09T21:41:59Z Fix indentation. ---
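The timing change described in the PR can be sketched in plain Java. This is an illustrative sketch only, not the connector's actual code; the class and method names (`RunLoopSketch`, `sleepTimeMillis`, `pollLoop`) are hypothetical stand-ins for the real shard-consumer run loop.

```java
// Sketch of the adjusted run loop: sleep for the configured interval
// MINUS the time spent fetching/processing, instead of a fixed sleep.
// All names here are illustrative, not Flink's actual implementation.
public class RunLoopSketch {

    /** How long to sleep so iterations start on a fixed cadence. */
    static long sleepTimeMillis(long intervalMillis, long elapsedMillis) {
        // If processing took longer than the interval, don't sleep at all.
        return Math.max(0L, intervalMillis - elapsedMillis);
    }

    /** The timed loop: measure each iteration, then sleep the remainder. */
    static void pollLoop(int iterations, long intervalMillis) throws InterruptedException {
        for (int i = 0; i < iterations; i++) {
            long start = System.currentTimeMillis();
            // ... call getRecords() and deserialize/emit the records here ...
            long elapsed = System.currentTimeMillis() - start;
            Thread.sleep(sleepTimeMillis(intervalMillis, elapsed));
        }
    }
}
```

With this scheme, a backlog read that takes longer than the interval per batch sleeps zero milliseconds and polls back-to-back, which is the behavior the PR description argues users expect.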
[GitHub] flink issue #2982: [FLINK-4460] Side Outputs in Flink
Github user jgrier commented on the issue: https://github.com/apache/flink/pull/2982 Nice :) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request #3204: [FLINK-5634] Flink should not always redirect stdo...
Github user jgrier closed the pull request at: https://github.com/apache/flink/pull/3204 ---
[GitHub] flink pull request #3494: [FLINK-5635] [docker] Improve Docker tooling
Github user jgrier commented on a diff in the pull request: https://github.com/apache/flink/pull/3494#discussion_r105065007 --- Diff: flink-contrib/docker-flink/Dockerfile --- @@ -36,22 +31,24 @@ ENV PATH $PATH:$FLINK_HOME/bin EXPOSE 8081 EXPOSE 6123 +# flink-dist can point to a directory or a tarball on the local system +ARG flink_dist=NOT_SET + # Install build dependencies and flink +ADD $flink_dist $FLINK_INSTALL_PATH RUN set -x && \ - mkdir -p $FLINK_INSTALL_PATH && \ - apk --update add --virtual build-dependencies curl && \ - curl -s $(curl -s https://www.apache.org/dyn/closer.cgi\?preferred\=true)flink/flink-${FLINK_VERSION}/flink-${FLINK_VERSION}-bin-hadoop${HADOOP_VERSION}-scala_${SCALA_VERSION}.tgz | \ --- End diff -- Hi guys, thanks for picking up this PR and running with it. I think we should separate the "official" Docker images issue from this PR. This PR was simply about improving the existing Docker tooling and making it easier for Flink developers to build their own images, maybe with their own tweaks, or from specific Flink releases. On the "official" images topic I'm totally fine with either approach: dA could do this and maintain those official images, or we could do it as part of Flink and the community can maintain them. The main thing for me is seeing that it's super easy for people to get Flink running with Docker and that there is some accepted "official" image for people to grab, rather than wading through a bunch of Flink images generated by various people. That's bad for the user experience. Regardless, let's figure out separately how best to get an official image on DockerHub, and get this change finished up, because I think the changes here are useful as is. ---
[GitHub] flink issue #3314: [FLINK-3679] DeserializationSchema should handle zero or ...
Github user jgrier commented on the issue: https://github.com/apache/flink/pull/3314 I think it would be just fine if we allowed a null return, given the tradeoffs discussed here. The main thing was to allow users a way to deal with bad data with minimal effort, without throwing an exception and causing their job to restart. ---
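The "return null instead of throwing" contract argued for above can be sketched without any Flink types. The class below is a hypothetical stand-in for a `DeserializationSchema` implementation; a real one would implement `deserialize()` the same way, and the source would then skip null records.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of a tolerant deserializer: bad records yield null
// (a "skip me" signal) rather than an exception that restarts the job.
public class TolerantIntDeserializer {

    /** Returns the parsed value, or null if the bytes are not a valid int. */
    static Integer deserialize(byte[] message) {
        try {
            return Integer.valueOf(new String(message, StandardCharsets.UTF_8));
        } catch (NumberFormatException e) {
            // Corrupt record: swallow the error and return null so the
            // consuming source can drop it and keep running.
            return null;
        }
    }
}
```

The tradeoff discussed in the thread is that null then becomes a reserved value: a schema can no longer use it to mean anything else.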
[GitHub] flink issue #3205: [FLINK-5635] Improve Docker tooling to make it easier to ...
Github user jgrier commented on the issue: https://github.com/apache/flink/pull/3205 @ex00 The `--from-local-dist` flag assumes you've already built Flink locally. If you run `mvn clean package -DskipTests`, does this error persist? ---
[GitHub] flink issue #3204: [FLINK-5634] Flink should not always redirect stdout to a...
Github user jgrier commented on the issue: https://github.com/apache/flink/pull/3204 Agreed. If https://issues.apache.org/jira/browse/FLINK-4326 were merged this would also work fine. Whether Flink runs in the background or foreground is orthogonal to where the logs go; as long as Flink can be configured to log to stdout, we're good. ---
[GitHub] flink issue #3205: [FLINK-5635] Improve Docker tooling to make it easier to ...
Github user jgrier commented on the issue: https://github.com/apache/flink/pull/3205 Thanks @ex00, I missed that. Actually, there is a new Flink-on-Docker stub in the top-level docs and I was planning to document these scripts there; however, we may want to keep this here and just link to it. Anyway, I will update the docs. Thanks. ---
[GitHub] flink pull request #3205: [FLINK-5635] Improve Docker tooling to make it eas...
GitHub user jgrier opened a pull request: https://github.com/apache/flink/pull/3205 [FLINK-5635] Improve Docker tooling to make it easier to build images and launch Flink via Docker tools

Improvements for the Docker on Flink experience. This PR depends on this other one: https://github.com/apache/flink/pull/3204
- Modifying Dockerfile to allow building from local flink-dist as well as release URLs
- Making the image name configurable so it's easier for users to build and publish their own images
- Logging to stdout rather than just files
- Adding scripts to deploy seamlessly on Docker Swarm without host networking
- Updating Docker Compose scripts to work correctly for deploying locally
- Generally parameterizing things so these Docker scripts are more generally useful and self-documenting
- Provides options so that you can deploy multiple Flink services with unique names on Docker Swarm
- This should all work well with the new Docker "stack" stuff as well.

Example usage:
```
# Build a Docker image from your local flink build
./build.sh --from-local-dist --image-name "dataartisans/flink"
```
```
# Build a Docker image from an official release
./build.sh --from-release --flink-version 1.2.0 --hadoop-version 2.7 --scala-version 2.11 --image-name "dataartisans/flink"
```
```
# Run with Docker Compose
docker-compose up -d
docker-compose scale taskmanager=4
```
```
# Run on Docker Swarm
./create-docker-swarm-service.sh --image-name "dataartisans/flink" my-flink-service my-flink-port
```
```
# Shut down
./remove-docker-swarm-service.sh --image-name "dataartisans/flink" my-flink-service
```

You can merge this pull request into a Git repository by running: $ git pull https://github.com/jgrier/flink release-1.2 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3205.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3205
commit d380e98b397c770ee4b7eb9e4abb2387e7672d4c Author: Jamie Grier <jamie.gr...@gmail.com> Date: 2017-01-21T00:12:17Z Improvements for Docker on Flink experience. Modifying Dockerfile to build from local flink-dist as well as release URLs. Logging to stdout. Adding scripts to deploy seamlessly on Docker Swarm. Updating Docker Compose scripts to work correctly. Parameterizing things so these Docker scripts are more generally useful. commit 962290ab66d993cac4c4e233a017feb8a6eb1cc5 Author: Jamie Grier <jamie.gr...@gmail.com> Date: 2017-01-21T00:14:27Z Adding flag to flink-daemon.sh and related scripts so that we can actually log to the console -- which is better for Docker environments. commit 66c2133ede0ed9914baba2fe7b93f1b9d585ede2 Author: Jamie Grier <jamie.gr...@gmail.com> Date: 2017-01-25T05:45:23Z Adding Apache license to new files. ---
[GitHub] flink issue #2772: [FLINK-5012] Expose Timestamp in Timely FlatMap Functions
Github user jgrier commented on the issue: https://github.com/apache/flink/pull/2772 LGTM ---
[GitHub] flink issue #2387: [FLINK-4317, FLIP-3] [docs] Restructure docs
Github user jgrier commented on the issue: https://github.com/apache/flink/pull/2387 This looks great! I really like the left-side navigation; I think this is a big improvement. One suggestion: can we change the name of the "Testing & Debugging" section to something like "Monitoring and Debugging"? Most of what's in that section is about monitoring. ---
[GitHub] flink issue #2153: [FLINK-4110] Add testing skeleton to quickstart
Github user jgrier commented on the issue: https://github.com/apache/flink/pull/2153 What I would like to see here is that the tests in the quickstart actually test the code from the job in the quickstart. For example, if the job in the quickstart is WordCount, it would be nice to show tests that actually verify it counts words correctly. ---
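The kind of test suggested above can be sketched in plain Java. The `countWords` helper is hypothetical, standing in for the quickstart job's tokenizing/counting logic; the point is that the test asserts on actual word counts for known input, rather than only checking that the pipeline compiles.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a quickstart-style test target: countWords() is a hypothetical
// stand-in for the WordCount job's Tokenizer + sum logic, factored out so
// it can be exercised directly on known input.
public class WordCountTestSketch {

    /** Lower-cases the text, splits on non-word characters, counts tokens. */
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

A quickstart test could then assert, e.g., that `countWords("to be or not to be")` maps both "to" and "be" to 2, which is what the comment above asks for: tests that exercise the job's real logic.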