[GitHub] [samza-hello-samza] james-deee opened a new pull request, #87: Upgrade hello-samza to latest deps and prepare for Java 11
james-deee opened a new pull request, #87: URL: https://github.com/apache/samza-hello-samza/pull/87 This prepares the app to work with Java 11. Once this Samza app PR is merged and released, we can update this to use that version in the pom.xml file. I am also deleting the Gradle artifacts because this whole project actually uses Maven (not Gradle, which was version 2 or something). Verified that these version bumps (including the use of Scala 2.12 from Samza) all work. Note that this also moves Hadoop to a Java 11 compatible version. I have also tested with my local build from the Samza PR above and confirmed that all of this works on Java 11 as well. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@samza.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [samza-hello-samza] dependabot[bot] opened a new pull request, #85: Bump hadoop-common from 2.6.1 to 3.2.3
dependabot[bot] opened a new pull request, #85: URL: https://github.com/apache/samza-hello-samza/pull/85
Bumps hadoop-common from 2.6.1 to 3.2.3.
[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=org.apache.hadoop:hadoop-common=maven=2.6.1=3.2.3)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.

Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
- `@dependabot use these labels` will set the current labels as the default for future PRs for this repo and language
- `@dependabot use these reviewers` will set the current reviewers as the default for future PRs for this repo and language
- `@dependabot use these assignees` will set the current assignees as the default for future PRs for this repo and language
- `@dependabot use this milestone` will set the current milestone as the default for future PRs for this repo and language

You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/apache/samza-hello-samza/network/alerts).
[GitHub] [samza-hello-samza] naleon opened a new pull request #84: [FIX] version jetty transitive dependency.
naleon opened a new pull request #84: URL: https://github.com/apache/samza-hello-samza/pull/84
I got "java.lang.NoClassDefFoundError: org/eclipse/jetty/http/HttpCookie$SetCookieHttpField" when I tried to deploy hello-samza-1.6.0-dist.tar.gz. This dependency exclusion fixes it. You can see it in the dependency tree.
```
[INFO]
[INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ hello-samza ---
[INFO] org.apache.samza:hello-samza:jar:1.6.0
[INFO] +- junit:junit:jar:4.12:compile
[INFO] |  \- org.hamcrest:hamcrest-core:jar:1.3:compile
[INFO] +- org.apache.samza:samza-api:jar:1.6.0:compile
[INFO] |  +- org.apache.commons:commons-lang3:jar:3.4:compile
[INFO] |  +- org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13:compile
[INFO] |  +- com.google.code.gson:gson:jar:2.8.5:compile
[INFO] |  \- io.dropwizard.metrics:metrics-core:jar:3.1.2:compile
[INFO] +- org.apache.samza:samza-azure_2.11:jar:1.6.0:compile
[INFO] |  +- com.azure:azure-storage-blob:jar:12.0.1:compile
[INFO] |  |  +- com.azure:azure-core:jar:1.0.0:compile
[INFO] |  |  |  +- com.fasterxml.jackson.datatype:jackson-datatype-jsr310:jar:2.10.0:compile
[INFO] |  |  |  +- com.fasterxml.jackson.dataformat:jackson-dataformat-xml:jar:2.10.0:compile
[INFO] |  |  |  |  +- com.fasterxml.jackson.module:jackson-module-jaxb-annotations:jar:2.10.0:compile
[INFO] |  |  |  |  |  +- jakarta.xml.bind:jakarta.xml.bind-api:jar:2.3.2:compile
[INFO] |  |  |  |  |  \- jakarta.activation:jakarta.activation-api:jar:1.2.1:compile
[INFO] |  |  |  |  +- org.codehaus.woodstox:stax2-api:jar:4.2:compile
[INFO] |  |  |  |  \- com.fasterxml.woodstox:woodstox-core:jar:6.0.1:compile
[INFO] |  |  |  \- io.projectreactor:reactor-core:jar:3.3.0.RELEASE:compile
[INFO] |  |  |     \- org.reactivestreams:reactive-streams:jar:1.0.3:compile
[INFO] |  |  \- com.azure:azure-storage-common:jar:12.0.1:compile
[INFO] |  |     \- com.azure:azure-core-http-netty:jar:1.0.0:compile
[INFO] |  |        +- io.netty:netty-handler:jar:4.1.42.Final:compile
[INFO] |  |        |  +- io.netty:netty-common:jar:4.1.42.Final:compile
[INFO] |  |        |  +- io.netty:netty-transport:jar:4.1.42.Final:compile
[INFO] |  |        |  |  \- io.netty:netty-resolver:jar:4.1.42.Final:compile
[INFO] |  |        |  \- io.netty:netty-codec:jar:4.1.42.Final:compile
[INFO] |  |        +- io.netty:netty-handler-proxy:jar:4.1.42.Final:compile
[INFO] |  |        |  \- io.netty:netty-codec-socks:jar:4.1.42.Final:compile
[INFO] |  |        +- io.netty:netty-buffer:jar:4.1.42.Final:compile
[INFO] |  |        +- io.netty:netty-codec-http:jar:4.1.42.Final:compile
[INFO] |  |        +- io.projectreactor.netty:reactor-netty:jar:0.9.0.RELEASE:compile
[INFO] |  |        |  +- io.netty:netty-codec-http2:jar:4.1.39.Final:compile
[INFO] |  |        |  +- io.netty:netty-transport-native-epoll:jar:linux-x86_64:4.1.39.Final:compile
[INFO] |  |        |  |  \- io.netty:netty-transport-native-unix-common:jar:4.1.39.Final:compile
[INFO] |  |        |  \- io.projectreactor.addons:reactor-pool:jar:0.1.0.RELEASE:compile
[INFO] |  |        \- com.azure:azure-core-test:jar:1.0.0:compile
[INFO] |  |           \- io.projectreactor:reactor-test:jar:3.3.0.RELEASE:compile
[INFO] |  +- com.microsoft.azure:azure-storage:jar:5.3.1:compile
[INFO] |  |  \- com.microsoft.azure:azure-keyvault-core:jar:0.8.0:compile
[INFO] |  +- com.microsoft.azure:azure-eventhubs:jar:1.0.1:compile
[INFO] |  |  \- org.apache.qpid:proton-j:jar:0.25.0:compile
[INFO] |  +- com.fasterxml.jackson.core:jackson-core:jar:2.10.0:compile
[INFO] |  \- org.apache.avro:avro:jar:1.7.7:compile
[INFO] |     +- com.thoughtworks.paranamer:paranamer:jar:2.3:compile
[INFO] |     \- org.xerial.snappy:snappy-java:jar:1.0.5:compile
[INFO] +- org.apache.samza:samza-core_2.11:jar:1.6.0:compile
[INFO] |  +- com.101tec:zkclient:jar:0.8:compile
[INFO] |  +- net.sf.jopt-simple:jopt-simple:jar:5.0.4:compile
[INFO] |  +- org.apache.commons:commons-collections4:jar:4.0:compile
[INFO] |  +- commons-io:commons-io:jar:2.6:compile
[INFO] |  +- org.eclipse.jetty:jetty-webapp:jar:9.4.20.v20190813:compile
[INFO] |  |  +- org.eclipse.jetty:jetty-xml:jar:9.4.20.v20190813:compile
[INFO] |  |  |  \- org.eclipse.jetty:jetty-util:jar:9.4.20.v20190813:compile
[INFO] |  |  \- org.eclipse.jetty:jetty-servlet:jar:9.4.20.v20190813:compile
[INFO] |  |     \- org.eclipse.jetty:jetty-security:jar:9.4.20.v20190813:compile
[INFO] |  |        \- org.eclipse.jetty:jetty-server:jar:9.4.20.v20190813:compile
[INFO] |  |           \- org.eclipse.jetty:jetty-io:jar:9.4.20.v20190813:compile
[INFO] |  +- org.scala-lang:scala-library:jar:2.11.8:compile
[INFO] |  +- net.jodah:failsafe:jar:1.1.0:compile
[INFO] |  \- com.linkedin.cytodynamics:cytodynamics-nucleus:jar:0.2.0:compile
[INFO] +-
```
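When chasing a `NoClassDefFoundError` like the one reported in this PR, it can help to list which Jetty versions the dependency tree actually resolves to. The following is only a minimal sketch, assuming the tree output has been saved to a text file; the helper name `jetty_versions` and the sample file are hypothetical and not part of hello-samza.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the distinct Jetty versions found in saved
# `mvn dependency:tree` output (one "[INFO] ..." line per artifact).
jetty_versions() {
  grep -o 'org\.eclipse\.jetty:[^ ]*' "$1" |
    awk -F: '{ print $4 }' |   # coordinates are group:artifact:type:version:scope
    sort -u
}

# Sample input copied from the tree in this message; a real run could use
#   mvn dependency:tree > tree.txt
tree_file="$(mktemp)"
cat > "$tree_file" <<'EOF'
[INFO] |  +- org.eclipse.jetty:jetty-webapp:jar:9.4.20.v20190813:compile
[INFO] |  |  +- org.eclipse.jetty:jetty-xml:jar:9.4.20.v20190813:compile
[INFO] |  |  \- org.eclipse.jetty:jetty-servlet:jar:9.4.20.v20190813:compile
EOF

versions="$(jetty_versions "$tree_file")"
echo "$versions"
```

If the helper printed more than one version for a full tree, that would point to mixed Jetty artifacts on the classpath, which is one possible cause of errors of this kind.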
[GitHub] [samza-hello-samza] mynameborat merged pull request #83: remove comited merge conflict.
mynameborat merged pull request #83: URL: https://github.com/apache/samza-hello-samza/pull/83
[GitHub] [samza-hello-samza] li-afaris opened a new pull request #83: remove comited merge conflict.
li-afaris opened a new pull request #83: URL: https://github.com/apache/samza-hello-samza/pull/83 A merge conflict was checked in by mistake. This change removes the conflict & changes the hello-samza version to 1.6
[GitHub] [samza-beam-examples] xinyuiscool opened a new pull request #3: Update Beam to 2.27 and samza to 1.3
xinyuiscool opened a new pull request #3: URL: https://github.com/apache/samza-beam-examples/pull/3 Update the Beam examples on the Samza runner to use the latest Beam version. Note that we found problems when dealing with splittable ParDo, so we use the flag --experiments=use_deprecated_read to disable it for now.
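As a rough illustration of where the flag from this PR would go, the sketch below assembles a launch command and prints it instead of running it. Only `--experiments=use_deprecated_read` comes from the message above; the Maven invocation and the main class `org.apache.beam.examples.KafkaWordCount` are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Pipeline options. Only --experiments=use_deprecated_read is taken from the PR;
# --runner=SamzaRunner selects Beam's Samza runner.
pipeline_args="--runner=SamzaRunner --experiments=use_deprecated_read"

# Assumed main class and Maven invocation, shown for illustration only.
main_class="org.apache.beam.examples.KafkaWordCount"
launch_cmd="mvn compile exec:java -Dexec.mainClass=${main_class} -Dexec.args=\"${pipeline_args}\""

# Print the command rather than executing it, so the sketch has no side effects.
echo "$launch_cmd"
```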
[GitHub] [samza-hello-samza] mynameborat merged pull request #82: SAMZA-2586: Update samza-hello-samza#latest to use 1.6.0-SNAPSHOT
mynameborat merged pull request #82: URL: https://github.com/apache/samza-hello-samza/pull/82
[GitHub] [samza-hello-samza] kw2542 opened a new pull request #82: SAMZA-2586: Update samza-hello-samza#latest to use 1.6.0-SNAPSHOT
kw2542 opened a new pull request #82: URL: https://github.com/apache/samza-hello-samza/pull/82
Issues: 1.5.0-SNAPSHOT is used in the latest branch, which does not work with the instructions at http://samza.apache.org/startup/hello-samza/latest/
Changes: Update to use 1.6.0-SNAPSHOT
Tests: Deployed a job following the instructions at http://samza.apache.org/startup/hello-samza/latest/
[GitHub] [samza-hello-samza] mpfeiffer00 commented on pull request #81: Local topic from wikipedia
mpfeiffer00 commented on pull request #81: URL: https://github.com/apache/samza-hello-samza/pull/81#issuecomment-675805177 Oh boy, looks pretty cool. I'm so glad it works this time.
[GitHub] [samza-hello-samza] mpfeiffer00 closed pull request #81: Local topic from wikipedia
mpfeiffer00 closed pull request #81: URL: https://github.com/apache/samza-hello-samza/pull/81
[GitHub] [samza-hello-samza] mpfeiffer00 opened a new pull request #81: Local topic from wikipedia
mpfeiffer00 opened a new pull request #81: URL: https://github.com/apache/samza-hello-samza/pull/81 Updated scripts and documentation to run a local kafka topic streamed from wikipedia.
[GitHub] [samza-hello-samza] mynameborat merged pull request #80: Merge latest branch into master and set version to 1.5.0
mynameborat merged pull request #80: URL: https://github.com/apache/samza-hello-samza/pull/80
[GitHub] [samza-hello-samza] mynameborat opened a new pull request #80: Merge latest branch into master and set version to 1.5.0
mynameborat opened a new pull request #80: URL: https://github.com/apache/samza-hello-samza/pull/80
- Changes related to job runner configurations
- Version bump to 1.5.0
[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader
cameronlee314 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader
URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r404985923
File path: gradle/wrapper/gradle-wrapper.properties
@@ -1,6 +1,6 @@
-#Mon Mar 23 14:55:28 PDT 2015
+#Fri Mar 27 16:28:33 PDT 2020
+distributionUrl=https\://services.gradle.org/distributions/gradle-2.3-all.zip
Review comment: It's pretty minor. It's ok to leave it in this PR.
[GitHub] [samza-hello-samza] kw2542 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader
kw2542 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader
URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r404502243
File path: README.md
@@ -61,7 +61,7 @@ Package [samza.examples.cookbook](https://github.com/apache/samza-hello-samza/tr
 Package [samza.examples.wikipedia.application](https://github.com/apache/samza-hello-samza/tree/master/src/main/java/samza/examples/wikipedia/application) contains a small Samza application which consumes the real-time feeds from Wikipedia, extracts the metadata of the events, and calculates statistics of all edits in a 10-second window. You can start the app on the grid using the run-app.sh script: ```
-./deploy/samza/bin/run-app.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file://$PWD/deploy/samza/config/wikipedia-application.properties
+./deploy/samza/bin/run-app.sh --config-path=$PWD/deploy/samza/config/wikipedia-application.properties
Review comment: With `LocalApplicationRunner`, `--config-path` still points to the submission config; the full config is not needed.
[GitHub] [samza-hello-samza] kw2542 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader
kw2542 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader
URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r404501952
File path: gradle/wrapper/gradle-wrapper.properties
@@ -1,6 +1,6 @@
-#Mon Mar 23 14:55:28 PDT 2015
+#Fri Mar 27 16:28:33 PDT 2020
+distributionUrl=https\://services.gradle.org/distributions/gradle-2.3-all.zip
Review comment: It is auto-updated by the build tool. I think I can revert this to keep it separate.
[GitHub] [samza-hello-samza] kw2542 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader
kw2542 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader
URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r404501360
File path: src/main/config/azure-blob-application.properties
@@ -30,10 +30,14 @@ yarn.package.path=file://${basedir}/target/${project.artifactId}-${pom.version}-
 # StreamApplication class
 app.class=samza.examples.azure.AzureBlobApplication
-#Azure blob essential configs
+# Azure blob essential configs
 systems.azure-blob-container.samza.factory=org.apache.samza.system.azureblob.AzureBlobSystemFactory
 sensitive.systems.azure-blob-container.azureblob.account.name=your-azure-storage-account-name
 sensitive.systems.azure-blob-container.azureblob.account.key=your-azure-storage-account-key
+# Config Loader
+job.config.loader.factory=org.apache.samza.config.loaders.PropertiesConfigLoaderFactory
+job.config.loader.properties.path=./__package/config/azure-blob-application.properties
Review comment: Yes, "__package" is an internal Samza implementation detail: it is the directory where the localized job package is unzipped. Hiding it programmatically is challenging, because clients may or may not include __package in the path, and different programs may put their configs in different locations.
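Since the loader path above is resolved relative to where the job package is unpacked, one way to sanity-check it is to confirm that the referenced file is actually inside the tarball. The following is a minimal sketch under the assumption that the tarball root corresponds to `./__package`; the helper name `package_has_config` and the temporary tarball are hypothetical and exist only for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the file that a "./__package/..." path
# refers to is present inside the given job tarball.
# Assumption: the tarball root is what ends up under ./__package.
package_has_config() {
  local tarball="$1" loader_path="$2"
  local entry="${loader_path#./__package/}"   # path relative to the tarball root
  tar -tzf "$tarball" | sed 's|^\./||' | grep -qxF "$entry"
}

# Build a tiny stand-in tarball that only contains a config directory.
workdir="$(mktemp -d)"
mkdir -p "$workdir/pkg/config"
touch "$workdir/pkg/config/azure-blob-application.properties"
tar -czf "$workdir/job.tar.gz" -C "$workdir/pkg" config

if package_has_config "$workdir/job.tar.gz" ./__package/config/azure-blob-application.properties; then
  echo "config found in package"
fi
```

Against a real build, the same helper could be pointed at the dist tarball under `target/` instead of the stand-in archive.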
[GitHub] [samza-hello-samza] kw2542 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader
kw2542 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader
URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r404500446
File path: src/main/config/azure-blob-application.properties
@@ -30,10 +30,14 @@ yarn.package.path=file://${basedir}/target/${project.artifactId}-${pom.version}-
 # StreamApplication class
 app.class=samza.examples.azure.AzureBlobApplication
-#Azure blob essential configs
+# Azure blob essential configs
 systems.azure-blob-container.samza.factory=org.apache.samza.system.azureblob.AzureBlobSystemFactory
 sensitive.systems.azure-blob-container.azureblob.account.name=your-azure-storage-account-name
 sensitive.systems.azure-blob-container.azureblob.account.key=your-azure-storage-account-key
+# Config Loader
+job.config.loader.factory=org.apache.samza.config.loaders.PropertiesConfigLoaderFactory
+job.config.loader.properties.path=./__package/config/azure-blob-application.properties
Review comment: Yes, the `samza-hello-samza` build tarball already includes the config directory. Agreed, the migration guide for 1.5 needs detailed migration instructions on this.
[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader
cameronlee314 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader
URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r404293260
File path: README.md
@@ -61,7 +61,7 @@ Package [samza.examples.cookbook](https://github.com/apache/samza-hello-samza/tr
 Package [samza.examples.wikipedia.application](https://github.com/apache/samza-hello-samza/tree/master/src/main/java/samza/examples/wikipedia/application) contains a small Samza application which consumes the real-time feeds from Wikipedia, extracts the metadata of the events, and calculates statistics of all edits in a 10-second window. You can start the app on the grid using the run-app.sh script: ```
-./deploy/samza/bin/run-app.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file://$PWD/deploy/samza/config/wikipedia-application.properties
+./deploy/samza/bin/run-app.sh --config-path=$PWD/deploy/samza/config/wikipedia-application.properties
Review comment: This `--config-path` here is for submission configs only, right? Can you think of a way to help clarify that? It looks like the example is suggesting to pass full job configs as submission configs. However, that could end up being a problem if the full job configs are too large (IIRC, there is a limit for the size of the submission configs env variable when passing to YARN).
[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader
cameronlee314 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader
URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r404303771
File path: src/main/config/azure-blob-application.properties
@@ -30,10 +30,14 @@ yarn.package.path=file://${basedir}/target/${project.artifactId}-${pom.version}-
 # StreamApplication class
 app.class=samza.examples.azure.AzureBlobApplication
-#Azure blob essential configs
+# Azure blob essential configs
 systems.azure-blob-container.samza.factory=org.apache.samza.system.azureblob.AzureBlobSystemFactory
 sensitive.systems.azure-blob-container.azureblob.account.name=your-azure-storage-account-name
 sensitive.systems.azure-blob-container.azureblob.account.key=your-azure-storage-account-key
+# Config Loader
+job.config.loader.factory=org.apache.samza.config.loaders.PropertiesConfigLoaderFactory
+job.config.loader.properties.path=./__package/config/azure-blob-application.properties
Review comment: If I understand correctly, in the past, it was not necessary to have this config file in the application package on the YARN containers. Was the `samza-hello-samza` build already set up to include the properties file into the application package? This may also be something you would need to call out in the migration documentation.
[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader
cameronlee314 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader
URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r404294708
File path: gradle/wrapper/gradle-wrapper.properties
@@ -1,6 +1,6 @@
-#Mon Mar 23 14:55:28 PDT 2015
+#Fri Mar 27 16:28:33 PDT 2020
+distributionUrl=https\://services.gradle.org/distributions/gradle-2.3-all.zip
Review comment: This change is intentional for this PR, right? If so, it's ok to keep it; just double checking since it isn't quite related.
[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader
cameronlee314 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader
URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r404307632
File path: README.md
@@ -61,7 +61,7 @@ Package [samza.examples.cookbook](https://github.com/apache/samza-hello-samza/tr
 Package [samza.examples.wikipedia.application](https://github.com/apache/samza-hello-samza/tree/master/src/main/java/samza/examples/wikipedia/application) contains a small Samza application which consumes the real-time feeds from Wikipedia, extracts the metadata of the events, and calculates statistics of all edits in a 10-second window. You can start the app on the grid using the run-app.sh script: ```
-./deploy/samza/bin/run-app.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file://$PWD/deploy/samza/config/wikipedia-application.properties
+./deploy/samza/bin/run-app.sh --config-path=$PWD/deploy/samza/config/wikipedia-application.properties
Review comment: On the other hand, if someone switched this to use `LocalApplicationRunner`, then would the `--config-path` need to be the full configs? Would this overloading of the `--config-path` argument be confusing?
[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader
cameronlee314 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader
URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r404301906
File path: src/main/config/azure-blob-application.properties
@@ -30,10 +30,14 @@ yarn.package.path=file://${basedir}/target/${project.artifactId}-${pom.version}-
 # StreamApplication class
 app.class=samza.examples.azure.AzureBlobApplication
-#Azure blob essential configs
+# Azure blob essential configs
 systems.azure-blob-container.samza.factory=org.apache.samza.system.azureblob.AzureBlobSystemFactory
 sensitive.systems.azure-blob-container.azureblob.account.name=your-azure-storage-account-name
 sensitive.systems.azure-blob-container.azureblob.account.key=your-azure-storage-account-key
+# Config Loader
+job.config.loader.factory=org.apache.samza.config.loaders.PropertiesConfigLoaderFactory
+job.config.loader.properties.path=./__package/config/azure-blob-application.properties
Review comment: Do you think it might be confusing to users what `__package` refers to? That is kind of a YARN implementation detail. Maybe documentation can help clarify, but if you can find a programmatic way to hide that implementation detail, that would be nice too.
[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader
cameronlee314 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader
URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r400365471
File path: README.md
@@ -61,13 +61,19 @@ Package [samza.examples.cookbook](https://github.com/apache/samza-hello-samza/tr
 Package [samza.examples.wikipedia.application](https://github.com/apache/samza-hello-samza/tree/master/src/main/java/samza/examples/wikipedia/application) contains a small Samza application which consumes the real-time feeds from Wikipedia, extracts the metadata of the events, and calculates statistics of all edits in a 10-second window. You can start the app on the grid using the run-app.sh script: ```
-./deploy/samza/bin/run-app.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file://$PWD/deploy/samza/config/wikipedia-application.properties
+./deploy/samza/bin/run-app.sh \
+ --config app.class=samza.examples.wikipedia.application.WikipediaApplication \
+ --config yarn.package.path=file:///Users/kwu/workspace/hello-samza/target/hello-samza-1.5.0-SNAPSHOT-dist.tar.gz \
+ --config job.name=wikipedia-application \
+ --config job.factory.class=org.apache.samza.job.yarn.YarnJobFactory \
Review comment: Do you have a clean way to describe how to specify submission configs in general? Based on your comments, it seems like standalone (`LocalApplicationRunner`) specifies a different set of submission configs than YARN (`RemoteApplicationRunner`), even though some of those configs are general to Samza (e.g. `app.class`). It would be good to have as few runner-specific steps as possible. `ApplicationRunner` is the interface, so it would be nice to not have to worry about the specific `ApplicationRunner` being used when trying to start the app. I admit that Samza already has some environment-specific configs (e.g. YARN-specific configs are needed when using `YarnJobFactory`), but we should generally minimize that. I'm not sure if this works, but could we recommend that standalone also pass the larger set of submission configs (similar to YARN)? Then there would be more consistency. It would be easier to describe what submission configs are and how to specify them in general.
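To make the idea of a single, runner-independent way of passing submission configs a little more concrete, here is a minimal sketch that turns a small properties file into repeated `--config key=value` arguments, matching the form used in the diff above. The helper name `submission_args` and the temporary file are hypothetical; the keys and values are copied from the quoted command, and the final command is only printed, not executed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn "key=value" lines into repeated --config arguments.
submission_args() {
  local line
  while IFS= read -r line; do
    # Skip blank lines and comments.
    case "$line" in ''|'#'*) continue ;; esac
    printf -- '--config %s\n' "$line"
  done < "$1"
}

# Stand-in submission config file; keys are taken from the command in this thread.
submission_file="$(mktemp)"
cat > "$submission_file" <<'EOF'
# Submission configs
app.class=samza.examples.wikipedia.application.WikipediaApplication
job.name=wikipedia-application
job.factory.class=org.apache.samza.job.yarn.YarnJobFactory
EOF

# Print the resulting command on one line instead of running it.
args="$(submission_args "$submission_file")"
echo ./deploy/samza/bin/run-app.sh $args
```

Keeping the submission configs in one small file like this would also make it easy to see how large the submitted set is, which is the concern raised earlier in the review.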
[GitHub] [samza-hello-samza] kw2542 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader
kw2542 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader
URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r399599247
File path: README.md
@@ -61,13 +61,19 @@ Package [samza.examples.cookbook](https://github.com/apache/samza-hello-samza/tr
 Package [samza.examples.wikipedia.application](https://github.com/apache/samza-hello-samza/tree/master/src/main/java/samza/examples/wikipedia/application) contains a small Samza application which consumes the real-time feeds from Wikipedia, extracts the metadata of the events, and calculates statistics of all edits in a 10-second window. You can start the app on the grid using the run-app.sh script: ```
-./deploy/samza/bin/run-app.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file://$PWD/deploy/samza/config/wikipedia-application.properties
+./deploy/samza/bin/run-app.sh \
+ --config app.class=samza.examples.wikipedia.application.WikipediaApplication \
+ --config yarn.package.path=file:///Users/kwu/workspace/hello-samza/target/hello-samza-1.5.0-SNAPSHOT-dist.tar.gz \
+ --config job.name=wikipedia-application \
+ --config job.factory.class=org.apache.samza.job.yarn.YarnJobFactory \
+ --config job.config.loader.factory=org.apache.samza.config.loaders.PropertiesConfigLoaderFactory \
+ --config job.config.loader.properties.path=$PWD/deploy/samza/config/wikipedia-application.properties
Review comment: For standalone applications we do not need to, but for YARN ones, yes.
[GitHub] [samza-hello-samza] kw2542 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader
kw2542 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r399599247 ## File path: README.md ## @@ -61,13 +61,19 @@ Package [samza.examples.cookbook](https://github.com/apache/samza-hello-samza/tr Package [samza.examples.wikipedia.application](https://github.com/apache/samza-hello-samza/tree/master/src/main/java/samza/examples/wikipedia/application) contains a small Samza application which consumes the real-time feeds from Wikipedia, extracts the metadata of the events, and calculates statistics of all edits in a 10-second window. You can start the app on the grid using the run-app.sh script: ``` -./deploy/samza/bin/run-app.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file://$PWD/deploy/samza/config/wikipedia-application.properties +./deploy/samza/bin/run-app.sh \ + --config app.class=samza.examples.wikipedia.application.WikipediaApplication \ + --config yarn.package.path=file:///Users/kwu/workspace/hello-samza/target/hello-samza-1.5.0-SNAPSHOT-dist.tar.gz \ + --config job.name=wikipedia-application \ + --config job.factory.class=org.apache.samza.job.yarn.YarnJobFactory \ + --config job.config.loader.factory=org.apache.samza.config.loaders.PropertiesConfigLoaderFactory \ + --config job.config.loader.properties.path=$PWD/deploy/samza/config/wikipedia-application.properties Review comment: Yes.
[GitHub] [samza-hello-samza] kw2542 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader
kw2542 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r399599309 ## File path: README.md ## @@ -61,13 +61,19 @@ Package [samza.examples.cookbook](https://github.com/apache/samza-hello-samza/tr Package [samza.examples.wikipedia.application](https://github.com/apache/samza-hello-samza/tree/master/src/main/java/samza/examples/wikipedia/application) contains a small Samza application which consumes the real-time feeds from Wikipedia, extracts the metadata of the events, and calculates statistics of all edits in a 10-second window. You can start the app on the grid using the run-app.sh script: ``` -./deploy/samza/bin/run-app.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file://$PWD/deploy/samza/config/wikipedia-application.properties +./deploy/samza/bin/run-app.sh \ + --config app.class=samza.examples.wikipedia.application.WikipediaApplication \ + --config yarn.package.path=file:///Users/kwu/workspace/hello-samza/target/hello-samza-1.5.0-SNAPSHOT-dist.tar.gz \ + --config job.name=wikipedia-application \ + --config job.factory.class=org.apache.samza.job.yarn.YarnJobFactory \ Review comment: The properties files cannot be simplified, because they may also be used for standalone deployment, where we do not need to pass the submission configs.
[GitHub] [samza-hello-samza] kw2542 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader
kw2542 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r399576572 ## File path: README.md ## @@ -61,13 +61,19 @@ Package [samza.examples.cookbook](https://github.com/apache/samza-hello-samza/tr Package [samza.examples.wikipedia.application](https://github.com/apache/samza-hello-samza/tree/master/src/main/java/samza/examples/wikipedia/application) contains a small Samza application which consumes the real-time feeds from Wikipedia, extracts the metadata of the events, and calculates statistics of all edits in a 10-second window. You can start the app on the grid using the run-app.sh script: ``` -./deploy/samza/bin/run-app.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file://$PWD/deploy/samza/config/wikipedia-application.properties +./deploy/samza/bin/run-app.sh \ + --config app.class=samza.examples.wikipedia.application.WikipediaApplication \ + --config yarn.package.path=file:///Users/kwu/workspace/hello-samza/target/hello-samza-1.5.0-SNAPSHOT-dist.tar.gz \ + --config job.name=wikipedia-application \ + --config job.factory.class=org.apache.samza.job.yarn.YarnJobFactory \ Review comment: Submission-related configs, such as job.name, app.class, job.factory.class and yarn.package.path, need to be passed in explicitly during submission, since we no longer read config files at that point. I am cleaning up the properties file at the moment as well. I believe the previous PR missed the instruction that all submission-related configs need to be provided explicitly.
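The point in the comment above, that every submission-related config has to be passed on the command line, can be sketched as a small shell helper. This is only an illustration, not part of the PR: the function name `build_submit_args` is made up, and the values are the ones shown in the README diff. The command is printed rather than executed.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the PR): turn key=value pairs into the
# repeated "--config key=value" flags that run-app.sh takes in the diff above.
build_submit_args() {
  local pair
  for pair in "$@"; do
    printf -- '--config %s ' "$pair"
  done
}

# Submission-related configs named in the review comment. The values are
# copied from the README diff; adjust them to your own setup.
SUBMIT_ARGS=$(build_submit_args \
  "app.class=samza.examples.wikipedia.application.WikipediaApplication" \
  "job.name=wikipedia-application" \
  "job.factory.class=org.apache.samza.job.yarn.YarnJobFactory")

# Print the command instead of running it, so the sketch has no side effects.
echo "./deploy/samza/bin/run-app.sh ${SUBMIT_ARGS}"
```

Keeping the pairs in one list makes it easier to compare them against the properties file, which is the consistency concern raised elsewhere in this review.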
[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader
cameronlee314 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r399573818 ## File path: README.md ## @@ -61,13 +61,19 @@ Package [samza.examples.cookbook](https://github.com/apache/samza-hello-samza/tr Package [samza.examples.wikipedia.application](https://github.com/apache/samza-hello-samza/tree/master/src/main/java/samza/examples/wikipedia/application) contains a small Samza application which consumes the real-time feeds from Wikipedia, extracts the metadata of the events, and calculates statistics of all edits in a 10-second window. You can start the app on the grid using the run-app.sh script: ``` -./deploy/samza/bin/run-app.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file://$PWD/deploy/samza/config/wikipedia-application.properties +./deploy/samza/bin/run-app.sh \ + --config app.class=samza.examples.wikipedia.application.WikipediaApplication \ + --config yarn.package.path=file:///Users/kwu/workspace/hello-samza/target/hello-samza-1.5.0-SNAPSHOT-dist.tar.gz \ + --config job.name=wikipedia-application \ + --config job.factory.class=org.apache.samza.job.yarn.YarnJobFactory \ + --config job.config.loader.factory=org.apache.samza.config.loaders.PropertiesConfigLoaderFactory \ + --config job.config.loader.properties.path=$PWD/deploy/samza/config/wikipedia-application.properties Review comment: There are several other places which did not have these additional configs (e.g. `run-event-hubs-zk-application.sh`, `CouchbaseTableExample.java`). Do those places need to be updated also?
[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader
cameronlee314 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r399574730 ## File path: README.md ## @@ -61,13 +61,19 @@ Package [samza.examples.cookbook](https://github.com/apache/samza-hello-samza/tr Package [samza.examples.wikipedia.application](https://github.com/apache/samza-hello-samza/tree/master/src/main/java/samza/examples/wikipedia/application) contains a small Samza application which consumes the real-time feeds from Wikipedia, extracts the metadata of the events, and calculates statistics of all edits in a 10-second window. You can start the app on the grid using the run-app.sh script: ``` -./deploy/samza/bin/run-app.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file://$PWD/deploy/samza/config/wikipedia-application.properties +./deploy/samza/bin/run-app.sh \ + --config app.class=samza.examples.wikipedia.application.WikipediaApplication \ + --config yarn.package.path=file:///Users/kwu/workspace/hello-samza/target/hello-samza-1.5.0-SNAPSHOT-dist.tar.gz \ + --config job.name=wikipedia-application \ + --config job.factory.class=org.apache.samza.job.yarn.YarnJobFactory \ Review comment: These all seem to be copied from the properties file. It seems like it would be non-trivial to keep the properties file and this list of configs consistent, since they are in different places. https://github.com/apache/samza/pull/1256 indicates that you only need the config loader and properties path, but here you are adding several other configs.
[GitHub] [samza-hello-samza] kw2542 opened a new pull request #79: Update doc and javadoc from config factory to config loader
kw2542 opened a new pull request #79: Update doc and javadoc from config factory to config loader URL: https://github.com/apache/samza-hello-samza/pull/79 Update doc and javadoc from config factory to config loader
[GitHub] [samza-hello-samza] cameronlee314 merged pull request #78: [Minor] updating latest branch to use 1.5.0-SNAPSHOT for samza version
cameronlee314 merged pull request #78: [Minor] updating latest branch to use 1.5.0-SNAPSHOT for samza version URL: https://github.com/apache/samza-hello-samza/pull/78
[GitHub] [samza-hello-samza] cameronlee314 merged pull request #77: Merge latest branch into master and set version to 1.4.0
cameronlee314 merged pull request #77: Merge latest branch into master and set version to 1.4.0 URL: https://github.com/apache/samza-hello-samza/pull/77
[GitHub] [samza-hello-samza] mynameborat commented on a change in pull request #77: Merge latest branch into master and set version to 1.4.0
mynameborat commented on a change in pull request #77: Merge latest branch into master and set version to 1.4.0 URL: https://github.com/apache/samza-hello-samza/pull/77#discussion_r395190224 ## File path: bin/deploy.sh ## @@ -23,4 +23,4 @@ base_dir=`pwd` mvn clean package mkdir -p $base_dir/deploy/samza -tar -xvf $base_dir/target/hello-samza-1.2.0-dist.tar.gz -C $base_dir/deploy/samza +tar -xvf $base_dir/target/hello-samza-1.1.0-dist.tar.gz -C $base_dir/deploy/samza Review comment: Should this be 1.4.0?
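One way to avoid the kind of version mismatch questioned above is to keep the version in a single variable and derive the tarball path from it. The following is a minimal sketch under assumptions: `dist_tarball` is a hypothetical helper, the `1.4.0` value is taken from the PR title, and none of this is the actual `bin/deploy.sh`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in bin/deploy.sh): build the dist tarball path
# from one version value, so the version is written in exactly one place.
dist_tarball() {
  local base_dir="$1" version="$2"
  echo "${base_dir}/target/hello-samza-${version}-dist.tar.gz"
}

base_dir=$(pwd)
version="1.4.0"   # assumption: the version this PR sets
tarball=$(dist_tarball "$base_dir" "$version")

# Show the extraction command rather than running it.
echo "tar -xvf ${tarball} -C ${base_dir}/deploy/samza"
```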
[GitHub] [samza-hello-samza] cameronlee314 opened a new pull request #78: [Minor] updating latest branch to use 1.5.0-SNAPSHOT for samza version
cameronlee314 opened a new pull request #78: [Minor] updating latest branch to use 1.5.0-SNAPSHOT for samza version URL: https://github.com/apache/samza-hello-samza/pull/78
[GitHub] [samza-hello-samza] cameronlee314 opened a new pull request #77: Merge latest branch into master and set version to 1.4.0
cameronlee314 opened a new pull request #77: Merge latest branch into master and set version to 1.4.0 URL: https://github.com/apache/samza-hello-samza/pull/77 The Samza 1.4.0 release was just completed, so I am updating master to be up-to-date with latest. Most of this diff comes from running `git merge latest`, but there were a few minor inconsistencies that I also cleaned up. I also removed the `-SNAPSHOT` part from the Samza version.
[GitHub] [samza-hello-samza] cameronlee314 commented on issue #76: Sync latest branch with master
cameronlee314 commented on issue #76: Sync latest branch with master URL: https://github.com/apache/samza-hello-samza/pull/76#issuecomment-600907533 > Thanks for the changes. Was this a one off miss on our end or is there a lack of explicit guideline on sync between latest & master and when to cherry pick commits? I think there is a lack of explicit guidelines on how we want to manage `master` and `latest`.
[GitHub] [samza-hello-samza] cameronlee314 merged pull request #76: Sync latest branch with master
cameronlee314 merged pull request #76: Sync latest branch with master URL: https://github.com/apache/samza-hello-samza/pull/76
[GitHub] [samza-hello-samza] cameronlee314 merged pull request #74: [minor] remove unused files that seem to be left over from an improper merge/cleanup
cameronlee314 merged pull request #74: [minor] remove unused files that seem to be left over from an improper merge/cleanup URL: https://github.com/apache/samza-hello-samza/pull/74
[GitHub] [samza-hello-samza] cameronlee314 opened a new pull request #76: Sync latest branch with master
cameronlee314 opened a new pull request #76: Sync latest branch with master URL: https://github.com/apache/samza-hello-samza/pull/76 When updating samza-hello-samza to use Samza 1.4, I noticed that the `latest` branch and the `master` branch were out-of-sync. I cherry-picked some commits that were only checked in to `master`. I also made some other minor changes which were on `master` but not on `latest`.
Cherry-picks:
- f4cd658b751bac46cbf42b2c612c68ea23de2d47 (Adding hello-samza example for kinesis)
- 674d842e4b75d9003eaff9722596bb4c61db9fe2 (Adding Samza SQL Examples)
- 1c16cf03897b02178b695bb5c01ff90a9ca16406 (fix sql's join and aggregate notes typo)
- 67c989219353c7dfa60584ca0bc5a31ba302ab28 (Fixed rat issues on master)
Other minor changes:
- kafka-console-consumer command in `README.md`
- add config to `conf/yarn-site.xml`
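Finding commits that exist on one branch but not the other, as described above, can be sketched with a plain `git log` range. This is an illustration only: the helper name `commits_missing_from` is an assumption and does not come from the PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list commits reachable from one branch but not from
# another, using the standard "target..source" revision range.
# Arguments: <branch-with-commits> <branch-missing-them>
commits_missing_from() {
  local source_branch="$1" target_branch="$2"
  git log --oneline "${target_branch}..${source_branch}"
}

# Example (run inside a clone that has both branches):
#   commits_missing_from master latest
```

Note that a range like this compares commit ancestry, so a commit that was already cherry-picked (and therefore has a new hash) still shows up; `git cherry -v latest master` compares by patch content instead and may be the better check after a round of cherry-picks.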
[GitHub] [samza-hello-samza] cameronlee314 merged pull request #75: add license for PageViewAvroRecord
cameronlee314 merged pull request #75: add license for PageViewAvroRecord URL: https://github.com/apache/samza-hello-samza/pull/75
[GitHub] [samza-hello-samza] lakshmi-manasa-g opened a new pull request #75: add license for PageViewAvroRecord
lakshmi-manasa-g opened a new pull request #75: add license for PageViewAvroRecord URL: https://github.com/apache/samza-hello-samza/pull/75
[GitHub] [samza-hello-samza] cameronlee314 opened a new pull request #74: [minor] remove unused files that seem to be left over from an improper merge/cleanup
cameronlee314 opened a new pull request #74: [minor] remove unused files that seem to be left over from an improper merge/cleanup URL: https://github.com/apache/samza-hello-samza/pull/74 I was trying to sync `master` with `latest`, and these files were on `master` but not on `latest`. It looks like `AdClick` and `PageView` had gotten moved to a different package, but the old versions never got deleted. `AvroSerDeFactory` was committed with https://github.com/apache/samza-hello-samza/pull/41, but the other parts of that PR no longer exist (seems like it is replaced by https://github.com/apache/samza-hello-samza/pull/46).
[GitHub] [samza-hello-samza] cameronlee314 merged pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage
cameronlee314 merged pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage URL: https://github.com/apache/samza-hello-samza/pull/71
[GitHub] [samza-hello-samza] lakshmi-manasa-g commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage
lakshmi-manasa-g commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage URL: https://github.com/apache/samza-hello-samza/pull/71#discussion_r379603189 ## File path: pom.xml ## @@ -206,6 +206,26 @@ under the License. guava 23.0 + Review comment: I am updating jackson-core to 2.10.0 in samza-azure (https://github.com/apache/samza/pull/1277). After pulling in recent commits from the 'latest' branch and updating jackson-core in samza-azure, these dependencies are not needed here.
[GitHub] [samza-hello-samza] JLLeitschuh opened a new pull request #73: [SECURITY] Use HTTPS to resolve dependencies in Maven Build
JLLeitschuh opened a new pull request #73: [SECURITY] Use HTTPS to resolve dependencies in Maven Build URL: https://github.com/apache/samza-hello-samza/pull/73 [![mitm_build](https://user-images.githubusercontent.com/1323708/59226671-90645200-8ba1-11e9-8ab3-39292bef99e9.jpeg)](https://medium.com/@jonathan.leitschuh/want-to-take-over-the-java-ecosystem-all-you-need-is-a-mitm-1fc329d898fb?source=friends_link=3c99970c55a899ad9ef41f126efcde0e) - [Want to take over the Java ecosystem? All you need is a MITM!](https://medium.com/@jonathan.leitschuh/want-to-take-over-the-java-ecosystem-all-you-need-is-a-mitm-1fc329d898fb?source=friends_link=3c99970c55a899ad9ef41f126efcde0e) - [Update: Want to take over the Java ecosystem? All you need is a MITM!](https://medium.com/bugbountywriteup/update-want-to-take-over-the-java-ecosystem-all-you-need-is-a-mitm-d069d253fe23?source=friends_link=8c8e52a7d57b98d0b7e541665688b454) --- This is a security fix for a vulnerability in your [Apache Maven](https://maven.apache.org/) `pom.xml` file(s). The build files indicate that this project is resolving dependencies over HTTP instead of HTTPS. This leaves your build vulnerable to allowing [Man in the Middle](https://en.wikipedia.org/wiki/Man-in-the-middle_attack) (MITM) attackers to execute arbitrary code on your computer or CI/CD system. This vulnerability has a CVSS v3.0 Base Score of [8.1/10](https://nvd.nist.gov/vuln-metrics/cvss/v3-calculator?vector=AV:N/AC:H/PR:N/UI:N/S:U/C:H/I:H/A:H). [POC code](https://max.computer/blog/how-to-take-over-the-computer-of-any-java-or-clojure-or-scala-developer/) has existed since 2014 to maliciously compromise a JAR file in-flight. MITM attacks against HTTP are [increasingly common](https://security.stackexchange.com/a/12050), for example [Comcast is known to have done it to their own users](https://thenextweb.com/insights/2017/12/11/comcast-continues-to-inject-its-own-code-into-websites-you-visit/#).
This contribution is a part of a submission to the [GitHub Security Lab](https://securitylab.github.com/) Bug Bounty program. ## Detecting this and Future Vulnerabilities This vulnerability was automatically detected by [LGTM.com](https://lgtm.com) using this [CodeQL Query](https://lgtm.com/rules/155648721/). As of September 2019 LGTM.com and Semmle are [officially a part of GitHub](https://github.blog/2019-09-18-github-welcomes-semmle/). You can automatically detect future vulnerabilities like this by enabling the free (for open-source) [LGTM App](https://github.com/marketplace/lgtm). I'm not an employee of GitHub nor of Semmle, I'm simply a user of [LGTM.com](https://lgtm.com) and an open-source security researcher. ## Source Yes, this contribution was automatically generated, however, the code to generate this PR was lovingly hand crafted to bring this security fix to your repository. The source code that generated and submitted this PR can be found here: [JLLeitschuh/bulk-security-pr-generator](https://github.com/JLLeitschuh/bulk-security-pr-generator) ## Opting-Out If you'd like to opt-out of future automated security vulnerability fixes like this, please consider adding a file called `.github/GH-ROBOTS.txt` to your repository with the line: ``` User-agent: JLLeitschuh/bulk-security-pr-generator Disallow: * ``` This bot will respect the [ROBOTS.txt](https://moz.com/learn/seo/robotstxt) format for future contributions. Alternatively, if this project is no longer actively maintained, consider [archiving](https://help.github.com/en/github/creating-cloning-and-archiving-repositories/about-archiving-repositories) the repository. ## CLA Requirements _This section is only relevant if your project requires contributors to sign a Contributor License Agreement (CLA) for external contributions._ It is unlikely that I'll be able to directly sign CLAs. However, all contributed commits are already automatically signed-off. 
> The meaning of a signoff depends on the project, but it typically certifies that committer has the rights to submit this work under the same license and agrees to a Developer Certificate of Origin > (see [https://developercertificate.org/](https://developercertificate.org/) for more information). > > \- [Git Commit Signoff documentation](https://developercertificate.org/) If signing your organization's CLA is a strict requirement for merging this contribution, please feel free to close this PR. ## Tracking All PRs generated as part of this fix are tracked here: https://github.com/JLLeitschuh/bulk-security-pr-generator/issues/2
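A rough local check for the issue this PR describes can be sketched as follows. It is not the CodeQL query mentioned in the PR: the function name `find_insecure_repo_urls` is made up for illustration, and the sketch assumes that repository locations appear in `<url>` elements of `pom.xml`.

```shell
#!/usr/bin/env bash
# Hypothetical check: print the lines of a POM whose <url> element uses
# plain HTTP. Matching on "<url>" skips XML namespace declarations, but it
# can still flag non-repository URLs such as license or project links.
find_insecure_repo_urls() {
  local pom_file="$1"
  grep -n '<url>[[:space:]]*http://' "$pom_file"
}

# Usage: report matches if any, otherwise say that nothing was found.
if [ -f pom.xml ]; then
  find_insecure_repo_urls pom.xml || echo "no plain-HTTP <url> entries found"
fi
```

Any hit still needs a manual look to confirm it is a repository or plugin repository entry before switching it to `https://`.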
[GitHub] [samza-hello-samza] cameronlee314 opened a new pull request #72: SAMZA-2449: Create an example job in samza-hello-samza for job coordinator split deployment
cameronlee314 opened a new pull request #72: SAMZA-2449: Create an example job in samza-hello-samza for job coordinator split deployment URL: https://github.com/apache/samza-hello-samza/pull/72 Feature: Adding an example for how to set up job coordinator dependency isolation in samza-hello-samza Changes: 1. Updated build.gradle to include tasks to build framework artifacts. 2. Added a new config for a job to be deployed in a job coordinator dependency isolation mode. Usage instructions: 1. Normally, the samza-hello-samza artifact would be built by running `./gradlew distTar`. For dependency isolation, run `./gradlew distTar frameworkApiDistTar frameworkInfrastructureDistTar`, which will build the framework API and infrastructure packages. 2. The `wikipedia-application-with-framework.properties` config is set up to run in dependency isolation mode. Use that as the argument to the `--config-path` option when running with `run-app.sh`.
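The two usage steps above can be sketched as a dry-run script. This is an illustration under assumptions: the location of the properties file under `deploy/samza/config` and the `--config-factory` value are borrowed from other `run-app.sh` invocations in this thread, not stated in this PR, and the `run` wrapper is hypothetical. It only prints the commands.

```shell
#!/usr/bin/env bash
# Dry-run sketch of the two usage steps from the PR description.
# "run" only prints its arguments; call the commands directly to execute them.
run() { echo "+ $*"; }

# Assumption: the config is deployed next to the other example configs.
CONFIG_PATH="file://$PWD/deploy/samza/config/wikipedia-application-with-framework.properties"

# Step 1: build the application package plus the two framework packages.
run ./gradlew distTar frameworkApiDistTar frameworkInfrastructureDistTar

# Step 2: submit the job with the dependency-isolation config.
run ./deploy/samza/bin/run-app.sh \
  --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory \
  --config-path="$CONFIG_PATH"
```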
[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage
cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage URL: https://github.com/apache/samza-hello-samza/pull/71#discussion_r371968280 ## File path: pom.xml ## @@ -206,6 +206,26 @@ under the License. guava 23.0 + Review comment: Does that mean that the azure library needs Jackson, but it does not pull it in transitively? That seems odd. Or maybe check if some other dependency is using a version of Jackson that is incompatible with the azure client. In any case, can you please add a comment about why this is necessary? It might be something you need to note on a user guide if there is one.
[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage
cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage URL: https://github.com/apache/samza-hello-samza/pull/71#discussion_r371968359 ## File path: pom.xml ## @@ -206,6 +206,26 @@ under the License. guava 23.0 + + + com.fasterxml.jackson.core + jackson-core + 2.10.0 + + Review comment: minor: whitespace
[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage
cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage URL: https://github.com/apache/samza-hello-samza/pull/71#discussion_r371968434 ## File path: pom.xml ## @@ -206,6 +206,26 @@ under the License. guava 23.0 + + + com.fasterxml.jackson.core + jackson-core + 2.10.0 + + + + com.fasterxml.jackson.core + jackson-databind + 2.10.0 + + + + Review comment: minor: whitespace
[GitHub] [samza-hello-samza] lakshmi-manasa-g commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage
lakshmi-manasa-g commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage URL: https://github.com/apache/samza-hello-samza/pull/71#discussion_r371926907 ## File path: src/main/java/samza/examples/azure/AzureBlobApplication.java ## @@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package samza.examples.azure;
+
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import java.util.List;
+import java.util.Map;
+import org.apache.samza.application.StreamApplication;
+import org.apache.samza.application.descriptors.StreamApplicationDescriptor;
+import org.apache.samza.operators.MessageStream;
+import org.apache.samza.operators.OutputStream;
+import org.apache.samza.serializers.JsonSerdeV2;
+import org.apache.samza.serializers.NoOpSerde;
+import org.apache.samza.system.descriptors.GenericOutputDescriptor;
+import org.apache.samza.system.descriptors.GenericSystemDescriptor;
+import org.apache.samza.system.kafka.descriptors.KafkaInputDescriptor;
+import org.apache.samza.system.kafka.descriptors.KafkaSystemDescriptor;
+import samza.examples.azure.data.PageViewAvroRecord;
+import samza.examples.cookbook.data.PageView;
+
+/**
+ * In this example, we demonstrate sending blobs to Azure Blob Storage.
+ * This Samza job reads from Kafka topic "page-view-azure-blob-input" and produces blobs to Azure-Container "oss-testcontainer" in your Azure Storage account.
+ *
+ * Currently, Samza supports sending Avro files as blobs.
+ * Hence the incoming messages into the Samza job have to be converted to an Avro record.
+ * For this job, we use input message as {@link samza.examples.cookbook.data.PageView} and
+ * convert it to an Avro record defined as {@link samza.examples.azure.data.PageViewAvroRecord}.
+ *
+ * To run the below example:
+ *
+ *
+ *
+ * Replace your-azure-storage-account-name and your-azure-storage-account-key with details of your Azure Storage Account.
+ *
+ *
+ * Ensure that the topic "page-view-azure-blob-input" is created
+ * ./deploy/kafka/bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic page-view-azure-blob-input --partitions 1 --replication-factor 1
+ *
+ *
+ * Run the application using the run-app.sh script
+ * ./deploy/samza/bin/run-app.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file://$PWD/deploy/samza/config/azure-blob-application.properties
+ *
+ *
+ * Produce some messages to the "page-view-azure-blob-input" topic
+ * ./deploy/kafka/bin/kafka-console-producer.sh --topic page-view-azure-blob-input --broker-list localhost:9092
+ * {"userId": "user1", "country": "india", "pageId":"google.com"}
+ * {"userId": "user2", "country": "france", "pageId":"facebook.com"}
+ * {"userId": "user3", "country": "china", "pageId":"yahoo.com"}
+ * {"userId": "user4", "country": "italy", "pageId":"linkedin.com"}
+ * {"userId": "user5", "country": "germany", "pageId":"amazon.com"}
+ * {"userId": "user6", "country": "denmark", "pageId":"apple.com"}
+ *
+ *
+ *Seeing Output:
+ *
+ *
+ * See blobs in your Azure portal at https://.blob.core.windows.net/oss-testcontainer/PageViewEventStream/.avro
+ *
+ *
+ * system-name "oss-testcontainer" in configs and code below maps to Azure-Container in Azure Storage account.
+ *
+ *
+ *is of the format /MM/dd/HH/mm-ss-randomString.avro. Hence navigate through the virtual folders on the portal to see your blobs.
+ *
+ *
+ * Due to network calls, allow a few minutes for blobs to appear on the portal.
+ *
+ *
+ * Config "maxMessagesPerBlob=2" ensures that a blob is created per 2 input messages. Adjust input or config accordingly.
+ *
+ *
+ *
+ *
+ */
+public class AzureBlobApplication implements StreamApplication {
+  private static final List<String> KAFKA_CONSUMER_ZK_CONNECT = ImmutableList.of("localhost:2181");
+  private static final List<String> KAFKA_PRODUCER_BOOTSTRAP_SERVERS = ImmutableList.of("localhost:9092");
+  private
[GitHub] [samza-hello-samza] lakshmi-manasa-g commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage
lakshmi-manasa-g commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage URL: https://github.com/apache/samza-hello-samza/pull/71#discussion_r371915951 ## File path: pom.xml ## @@ -206,6 +206,26 @@ under the License. guava 23.0 + Review comment: No, the app does not need them directly. But azure-storage-blob depends on azure-core, which in turn needs Jackson. Removing these dependencies throws `java.lang.NoClassDefFoundError: com/fasterxml/jackson/core/TSFBuilder at com.fasterxml.jackson.dataformat.xml.XmlMapper.(XmlMapper.java:122) ~[jackson-dataformat-xml-2.10.0.jar:2.10.0] at com.azure.core.implementation.serializer.jackson.JacksonAdapter.(JacksonAdapter.java:75) ~[azure-core-1.0.0.jar:?] at com.azure.core.implementation.serializer.jackson.JacksonAdapter.createDefaultSerializerAdapter(JacksonAdapter.java:108) ~[azure-core-1.0.0.jar:?] at com.azure.core.implementation.RestProxy.createDefaultSerializer(RestProxy.java:629) ~[azure-core-1.0.0.jar:?] at com.azure.core.implementation.RestProxy.create(RestProxy.java:691) ~[azure-core-1.0.0.jar:?] at com.azure.storage.blob.implementation.ServicesImpl.(ServicesImpl.java:58) ~[azure-storage-blob-12.0.1.jar:?] at com.azure.storage.blob.implementation.AzureBlobStorageImpl.(AzureBlobStorageImpl.java:213) ~[azure-storage-blob-12.0.1.jar:?] at com.azure.storage.blob.implementation.AzureBlobStorageBuilder.build(AzureBlobStorageBuilder.java:90) ~[azure-storage-blob-12.0.1.jar:?] at com.azure.storage.blob.BlobServiceAsyncClient.(BlobServiceAsyncClient.java:90) ~[azure-storage-blob-12.0.1.jar:?] at com.azure.storage.blob.BlobServiceClientBuilder.buildAsyncClient(BlobServiceClientBuilder.java:103) ~[azure-storage-blob-12.0.1.jar:?] at org.apache.samza.system.azureblob.producer.AzureBlobSystemProducer.setupAzureContainer(AzureBlobSystemProducer.java:370) ~[samza-azure_2.11-1.4.903-SNAPSHOT.jar:?]` This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
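The failure described in this comment is a missing transitive class at runtime. A minimal sketch of how one might probe for it before starting the job is shown here; `ClasspathProbe` is a hypothetical helper that is not part of hello-samza or Samza, and the only name taken from the thread is the class reported in the `NoClassDefFoundError`.

```java
// Illustrative sketch only: ClasspathProbe is a hypothetical helper, not part of hello-samza.
final class ClasspathProbe {

  /** Returns true if the named class can be located on the current classpath. */
  static boolean isOnClasspath(String className) {
    try {
      // 'false' asks the JVM to locate the class without running its static initializers.
      Class.forName(className, false, ClasspathProbe.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // Class name taken from the NoClassDefFoundError quoted in the review comment.
    String jacksonClass = "com.fasterxml.jackson.core.TSFBuilder";
    System.out.println(jacksonClass + " on classpath: " + isOnClasspath(jacksonClass));
  }
}
```

Run inside the packaged job's classpath, a `false` result would suggest that the Jackson artifacts still need to be declared explicitly, which matches the reasoning in the comment.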
[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage
cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage URL: https://github.com/apache/samza-hello-samza/pull/71#discussion_r371586350 ## File path: src/main/java/samza/examples/azure/AzureBlobApplication.java ## @@ -0,0 +1,136 @@
[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage
cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage URL: https://github.com/apache/samza-hello-samza/pull/71#discussion_r371586933 ## File path: src/main/java/samza/examples/azure/AzureBlobApplication.java ## @@ -0,0 +1,136 @@
[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage
cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage URL: https://github.com/apache/samza-hello-samza/pull/71#discussion_r371585315 ## File path: src/main/config/azure-blob-application.properties ## @@ -0,0 +1,37 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. 
+ +# Job +job.factory.class=org.apache.samza.job.yarn.YarnJobFactory +job.name=azure-blob + +# YARN package path +yarn.package.path=file://${basedir}/target/${project.artifactId}-${pom.version}-dist.tar.gz + +# StreamApplication class +app.class=samza.examples.azure.AzureBlobApplication + +#Azure blob essential configs +systems.oss-testcontainer.samza.factory=org.apache.samza.system.azureblob.AzureBlobSystemFactory +sensitive.systems.oss-testcontainer.azureblob.account.name=your-azure-storage-account-name +sensitive.systems.oss-testcontainer.azureblob.account.key=your-azure-storage-account-key + +#Azure blob config - to create a blob per 2 input kafka messages +systems.oss-testcontainer.azureblob.maxMessagesPerBlob=2 + +# Add configuration to disable checkpointing for this job once it is available in the Coordinator Stream model +# See https://issues.apache.org/jira/browse/SAMZA-465?focusedCommentId=14533346=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14533346 for more details Review comment: Is this necessary?
[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage
cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage URL: https://github.com/apache/samza-hello-samza/pull/71#discussion_r371586679 ## File path: src/main/java/samza/examples/azure/data/PageViewAvroRecord.java ## @@ -0,0 +1,48 @@ +package samza.examples.azure.data; + +import java.io.Serializable; +import org.apache.avro.AvroRuntimeException; +import samza.examples.cookbook.data.PageView; + +public class PageViewAvroRecord extends org.apache.avro.specific.SpecificRecordBase +implements org.apache.avro.specific.SpecificRecord, Serializable { + public final org.apache.avro.Schema SCHEMA = org.apache.avro.Schema.parse( + "{\"type\":\"record\",\"name\":\"PageViewAvroRecord\",\"namespace\":\"org.apache.samza.examples.events\", \"fields\":[{\"name\": \"userId\", \"type\": \"string\"}, {\"name\": \"country\", \"type\": \"string\"}, {\"name\": \"pageId\", \"type\": \"string\"}]}"); + + private String userId; + private String country; + private String pageId; + + public static PageViewAvroRecord buildPageViewRecord(PageView pageView) { +PageViewAvroRecord record = new PageViewAvroRecord(); +record.put(0, pageView.userId); Review comment: Would it be cleaner to do `record.userId = pageView.userId`? Same for the other fields below.
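The two styles contrasted in this review comment can be sketched side by side. The classes below are hypothetical stand-ins for `PageView` and `PageViewAvroRecord`, written without the Avro dependency so the sketch is self-contained; the real record extends `SpecificRecordBase`, and the index-to-field mapping here is an assumption based on the field order in the quoted schema.

```java
// Illustrative sketch only. PageViewSketch and PageViewRecordSketch are hypothetical
// stand-ins for PageView and PageViewAvroRecord, written without the Avro dependency.
final class PageViewSketch {
  final String userId;
  final String country;
  final String pageId;

  PageViewSketch(String userId, String country, String pageId) {
    this.userId = userId;
    this.country = country;
    this.pageId = pageId;
  }
}

final class PageViewRecordSketch {
  private String userId;
  private String country;
  private String pageId;

  // Index-based setter in the style of the put(int, Object) call quoted from the PR.
  void put(int index, Object value) {
    switch (index) {
      case 0: userId = (String) value; break;
      case 1: country = (String) value; break;
      case 2: pageId = (String) value; break;
      default: throw new IllegalArgumentException("Bad index: " + index);
    }
  }

  // Style used in the PR: the reader must know which index maps to which field.
  static PageViewRecordSketch buildByIndex(PageViewSketch pageView) {
    PageViewRecordSketch record = new PageViewRecordSketch();
    record.put(0, pageView.userId);
    record.put(1, pageView.country);
    record.put(2, pageView.pageId);
    return record;
  }

  // Style suggested in the review: assign the private fields directly, which is
  // legal because the static factory lives inside the same class.
  static PageViewRecordSketch buildByField(PageViewSketch pageView) {
    PageViewRecordSketch record = new PageViewRecordSketch();
    record.userId = pageView.userId;
    record.country = pageView.country;
    record.pageId = pageView.pageId;
    return record;
  }

  String describe() {
    return userId + "/" + country + "/" + pageId;
  }

  public static void main(String[] args) {
    PageViewSketch pageView = new PageViewSketch("user1", "india", "google.com");
    System.out.println(buildByIndex(pageView).describe());
    System.out.println(buildByField(pageView).describe());
  }
}
```

Both factories build the same record; the field-assignment version avoids the casts and the magic indices, which appears to be the point of the suggestion.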
[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage
cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage URL: https://github.com/apache/samza-hello-samza/pull/71#discussion_r371585166 ## File path: src/main/config/azure-blob-application.properties ## @@ -0,0 +1,37 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +# Job +job.factory.class=org.apache.samza.job.yarn.YarnJobFactory +job.name=azure-blob + +# YARN package path +yarn.package.path=file://${basedir}/target/${project.artifactId}-${pom.version}-dist.tar.gz + +# StreamApplication class +app.class=samza.examples.azure.AzureBlobApplication + +#Azure blob essential configs +systems.oss-testcontainer.samza.factory=org.apache.samza.system.azureblob.AzureBlobSystemFactory +sensitive.systems.oss-testcontainer.azureblob.account.name=your-azure-storage-account-name +sensitive.systems.oss-testcontainer.azureblob.account.key=your-azure-storage-account-key + +#Azure blob config - to create a blob per 2 input kafka messages +systems.oss-testcontainer.azureblob.maxMessagesPerBlob=2 Review comment: "oss-testcontainer" is not really a descriptive term here. Can you just use "azure-blob" (or something like that)? 
If it needs to be a specific format, then please explain that in the comments for this file. It looks like you put instructions in `AzureBlobApplication`. Maybe put a note in this file that there are usage instructions in that class.
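The naming question in this comment comes down to how the system name is embedded in every config key. The sketch below shows the `systems.<system-name>.<suffix>` pattern visible in the quoted properties file; `SystemConfigSketch` and its methods are hypothetical and use only `java.util.Properties`, not Samza's own config classes. Note that the javadoc quoted in the PR says the system name maps to the Azure container, so renaming it would presumably also change the container that blobs are written to.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Illustrative sketch only: SystemConfigSketch is a hypothetical helper, not a Samza API.
final class SystemConfigSketch {

  /** Builds a key of the form "systems.<systemName>.<suffix>", as seen in the quoted config. */
  static String systemKey(String systemName, String suffix) {
    return "systems." + systemName + "." + suffix;
  }

  /** Parses properties-file text into a Properties object. */
  static Properties parse(String text) {
    Properties properties = new Properties();
    try {
      properties.load(new StringReader(text));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return properties;
  }

  public static void main(String[] args) {
    // Same setting as in the PR, but under the system name suggested in the review.
    Properties config = parse("systems.azure-blob.azureblob.maxMessagesPerBlob=2\n");
    String key = systemKey("azure-blob", "azureblob.maxMessagesPerBlob");
    System.out.println(key + " = " + config.getProperty(key));
  }
}
```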
[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage
cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage URL: https://github.com/apache/samza-hello-samza/pull/71#discussion_r371584567 ## File path: pom.xml ## @@ -206,6 +206,26 @@ under the License. guava 23.0 + Review comment: Does the app depend directly on Jackson? If not, then you shouldn't need these dependencies.
[GitHub] [samza-hello-samza] cameronlee314 merged pull request #70: SAMZA-2433: Use log4j2 in samza-hello-samza
cameronlee314 merged pull request #70: SAMZA-2433: Use log4j2 in samza-hello-samza URL: https://github.com/apache/samza-hello-samza/pull/70
[GitHub] [samza-hello-samza] cameronlee314 commented on issue #70: SAMZA-2433: Use log4j2 in samza-hello-samza
cameronlee314 commented on issue #70: SAMZA-2433: Use log4j2 in samza-hello-samza URL: https://github.com/apache/samza-hello-samza/pull/70#issuecomment-577936234 To clarify a bit further in the split deployment case: If a job is set up to use log4j2 in the non-split-deployment case, then it will still work in the split deployment case if the split deployment framework is set up for log4j2. The impact of split deployment is that a particular framework package can only support one of log4j v1 or v2 as the slf4j binding.
[GitHub] [samza-hello-samza] cameronlee314 commented on issue #70: SAMZA-2433: Use log4j2 in samza-hello-samza
cameronlee314 commented on issue #70: SAMZA-2433: Use log4j2 in samza-hello-samza URL: https://github.com/apache/samza-hello-samza/pull/70#issuecomment-577935029
> @cameronlee314 does this mean that we'd require jobs to upgrade to log4j2 as well?
No, it is not required for jobs to upgrade to log4j2. If not using split deployment, then both log4j v1 and v2 would continue to work as is. If using split deployment, then the split deployment framework package determines whether log4j v1 or v2 needs to be used by a job (although I have currently only prototyped supporting log4j2).
[GitHub] [samza-hello-samza] prateekm commented on issue #70: SAMZA-2433: Use log4j2 in samza-hello-samza
prateekm commented on issue #70: SAMZA-2433: Use log4j2 in samza-hello-samza URL: https://github.com/apache/samza-hello-samza/pull/70#issuecomment-577891882 @cameronlee314 does this mean that we'd require jobs to upgrade to log4j2 as well?
[GitHub] [samza-hello-samza] prateekm commented on issue #70: SAMZA-2433: Use log4j2 in samza-hello-samza
prateekm commented on issue #70: SAMZA-2433: Use log4j2 in samza-hello-samza URL: https://github.com/apache/samza-hello-samza/pull/70#issuecomment-577829358 @PawasChhokra Can you take a look as well?
[GitHub] [samza-hello-samza] lakshmi-manasa-g opened a new pull request #71: Sample for producing to Azure Blob Storage
lakshmi-manasa-g opened a new pull request #71: Sample for producing to Azure Blob Storage URL: https://github.com/apache/samza-hello-samza/pull/71
Feature: Sample Samza job related to SEP-26: Azure Blob System Producer
Changes: New high-level Yarn job "AzureBlobApplication" added to the samples
Tests: Sample job successfully produces blobs for the configured Azure Storage Account. Confirmed on the Azure portal.
Upgrade instructions: Backwards-compatible change, as it's a completely new job; no upgrade needed.
Usage instructions: Add Azure Storage Account details to the configs of the Samza job. Step-by-step instructions to run the job are provided in the javadocs of the job.
[GitHub] [samza-hello-samza] cameronlee314 opened a new pull request #70: SAMZA-2433: Use log4j2 in samza-hello-samza
cameronlee314 opened a new pull request #70: SAMZA-2433: Use log4j2 in samza-hello-samza URL: https://github.com/apache/samza-hello-samza/pull/70
Issues: Log4j v1 is EOL and log4j2 is generally more performant and has a better module structure. Also, it is easier to handle split deployment functionality when using log4j2.
Changes:
Removed log4j v1 dependencies and added log4j2 dependencies (for both Gradle and Maven)
Added log4j2.xml and removed log4j.xml
Cleaned up some of the dependency specifications in order to make it easier to properly exclude log4j1 and include log4j2
Tests: Ran WikipediaApplication for a Gradle build and for a Maven build; verified that the logs showed up properly and that the job had output data.
API Changes: None
Upgrade Instructions: None
Usage Instructions: No changes to existing build/deployment flows
[GitHub] [samza-hello-samza] shanthoosh commented on issue #69: Update POM to 1.4.0-SNAPSHOT as samza 1.3 has been published
shanthoosh commented on issue #69: Update POM to 1.4.0-SNAPSHOT as samza 1.3 has been published URL: https://github.com/apache/samza-hello-samza/pull/69#issuecomment-574340389 @kw2542 Thanks for the changes. Merged the patch to trunk.
[GitHub] [samza-hello-samza] shanthoosh merged pull request #69: Update POM to 1.4.0-SNAPSHOT as samza 1.3 has been published
shanthoosh merged pull request #69: Update POM to 1.4.0-SNAPSHOT as samza 1.3 has been published URL: https://github.com/apache/samza-hello-samza/pull/69
[GitHub] [samza-hello-samza] kw2542 commented on issue #69: Update POM to 1.4.0-SNAPSHOT as samza 1.3 has been published
kw2542 commented on issue #69: Update POM to 1.4.0-SNAPSHOT as samza 1.3 has been published URL: https://github.com/apache/samza-hello-samza/pull/69#issuecomment-574309641 Updated gradle.properties.
[GitHub] [samza-hello-samza] shanthoosh commented on issue #69: Update POM to 1.4.0-SNAPSHOT as samza 1.3 has been published
shanthoosh commented on issue #69: Update POM to 1.4.0-SNAPSHOT as samza 1.3 has been published URL: https://github.com/apache/samza-hello-samza/pull/69#issuecomment-573990991 The Hadoop version used in OSS Samza is 2.7.1, but here it's set to 2.6.1 in gradle.properties. It would be better to bump up the kafka-version to match the recent Samza release.
[GitHub] [samza-hello-samza] kw2542 opened a new pull request #69: Update POM to 1.4.0-SNAPSHOT as samza 1.3 has been published
kw2542 opened a new pull request #69: Update POM to 1.4.0-SNAPSHOT as samza 1.3 has been published URL: https://github.com/apache/samza-hello-samza/pull/69 To be compatible with the Hello Samza documentation, the POM needs to be updated to 1.4.0-SNAPSHOT for the latest branch.
[GitHub] [samza-hello-samza] xinyuiscool merged pull request #68: Update dependencies to match OSS documentation
xinyuiscool merged pull request #68: Update dependencies to match OSS documentation URL: https://github.com/apache/samza-hello-samza/pull/68
[GitHub] [samza-hello-samza] kw2542 opened a new pull request #68: Update dependencies to match OSS documentation
kw2542 opened a new pull request #68: Update dependencies to match OSS documentation URL: https://github.com/apache/samza-hello-samza/pull/68
1. Per https://samza.apache.org/startup/hello-samza/latest/, we are supposed to pull 1.3.0-SNAPSHOT of Samza to create a hello-samza 1.3.0-SNAPSHOT; updating pom.xml to match.
2. Use 2.9.2 of Hadoop Yarn, which supports HttpFileSystem.
[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #67: SAMZA-2285 - Sample async application for hello-samza
cameronlee314 commented on a change in pull request #67: SAMZA-2285 - Sample async application for hello-samza URL: https://github.com/apache/samza-hello-samza/pull/67#discussion_r313183986 ## File path: bin/run-wikipedia-async-application.sh ## @@ -0,0 +1,30 @@ +#!/bin/bash +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +home_dir=`pwd` Review comment: Do you need this script? It looks like most of the other examples don't have this, since they can be deployed using the `run-app.sh` script (see README). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #67: SAMZA-2285 - Sample async application for hello-samza
cameronlee314 commented on a change in pull request #67: SAMZA-2285 - Sample async application for hello-samza URL: https://github.com/apache/samza-hello-samza/pull/67#discussion_r313184179 ## File path: src/main/config/wikipedia-async-application.properties ## @@ -0,0 +1,58 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. 
+ +# Job +job.name=wikipedia-async-application +job.coordinator.factory=org.apache.samza.zk.ZkJobCoordinatorFactory +job.default.system=kafka +job.coordinator.zk.connect=localhost:2181 + +# Task/Application +task.name.grouper.factory=org.apache.samza.container.grouper.task.GroupByContainerIdsFactory + +# Serializers +serializers.registry.string.class=org.apache.samza.serializers.StringSerdeFactory +serializers.registry.integer.class=org.apache.samza.serializers.IntegerSerdeFactory + +# Wikipedia System +systems.wikipedia.samza.factory=samza.examples.wikipedia.system.WikipediaSystemFactory +systems.wikipedia.host=irc.wikimedia.org +systems.wikipedia.port=6667 + +# Kafka System +systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory +systems.kafka.consumer.zookeeper.connect=localhost:2181/ +systems.kafka.producer.bootstrap.servers=localhost:9092 +systems.kafka.default.stream.replication.factor=1 + +# Streams +streams.en-wikipedia.samza.system=wikipedia +streams.en-wikipedia.samza.physical.name=#en.wikipedia + +streams.en-wiktionary.samza.system=wikipedia +streams.en-wiktionary.samza.physical.name=#en.wiktionary + +streams.en-wikinews.samza.system=wikipedia +streams.en-wikinews.samza.physical.name=#en.wikinews + +task.max.concurrency=20 + +app.class=samza.examples.wikipedia.application.WikipediaAsyncApplication +job.factory.class=org.apache.samza.job.yarn.YarnJobFactory +job.container.count=1 + +yarn.package.path=file:///Users/bkumaras/workspace-common/hello-samza/target/hello-samza-1.0.1-SNAPSHOT-dist.tar.gz Review comment: Please don't include your specific user workspace. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
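One way to catch this kind of leftover before review is a small check over the config files. The sketch below is only an illustration: `has_user_specific_path` is a hypothetical helper, not part of hello-samza, and the `/Users` and `/home` prefixes are an assumption about where personal workspaces usually live.

```bash
# Hypothetical check (not in the PR): returns 0 if the given properties
# file sets yarn.package.path to a location under a user home directory.
function has_user_specific_path() {
  local properties_file=$1
  # -E: extended regex, -q: no output, only the exit status matters.
  grep -Eq '^yarn\.package\.path=file:///(Users|home)/' "$properties_file"
}
```

It could be run as `has_user_specific_path src/main/config/wikipedia-async-application.properties && echo "user-specific path found"`. Replacing the value with a path that is resolved at build time would make the check pass.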
[GitHub] [samza-hello-samza] mynameborat opened a new pull request #67: SAMZA-2285 - Sample async application for hello-samza
mynameborat opened a new pull request #67: SAMZA-2285 - Sample async application for hello-samza URL: https://github.com/apache/samza-hello-samza/pull/67
[GitHub] [samza-hello-samza] rmatharu closed pull request #60: Updating after Bharath's scala 2.11 change, upgrading YARN version to match samza's yarn version
rmatharu closed pull request #60: Updating after Bharath's scala 2.11 change, upgrading YARN version to match samza's yarn version URL: https://github.com/apache/samza-hello-samza/pull/60
[GitHub] [samza-hello-samza] rmatharu commented on issue #62: Adding support for lxc on yarn for Samza
rmatharu commented on issue #62: Adding support for lxc on yarn for Samza URL: https://github.com/apache/samza-hello-samza/pull/62#issuecomment-503661672 Addressed most comments. I have put the contents of this in a gist here: https://gist.github.com/rmatharu/5d09e942aa7c38c14c5ff79283afc06e. I can re-open this if it needs to be checked in.
[GitHub] [samza-hello-samza] rmatharu commented on a change in pull request #62: Adding support for lxc on yarn for Samza
rmatharu commented on a change in pull request #62: Adding support for lxc on yarn for Samza URL: https://github.com/apache/samza-hello-samza/pull/62#discussion_r295119154 ## File path: bin/setup-lxc ## @@ -0,0 +1,345 @@ +#!/bin/bash -e +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +# This script will download, setup, start, and stop servers for Kafka, YARN, and ZooKeeper, +# as well as downloading, building and locally publishing Samza + + +COMMAND=$1 +ARG0=$2 +ARG1=$3 + +SHARED_LXC_DIR=/lxc-shared +POSSIBLE_LXC_INTERFACES=( virbr0 lxcbr0) +YARN_SITE_XML=conf/yarn-site.xml +NM_LIVENESS_MS=1 #value of the yarn.nm.liveness-monitor.expiry-interval-ms variable +LXC_INSTANCE_TYPE="fedora" +LXC_ROOTFS_DIR=/var/lib/lxc +LXC_INSTANCE_START_NM_SCRIPT=startNodeManager + +RESOLV_CONF_FILE=/etc/resolv.conf + +# Helper function to test an IP address for validity: +# Usage: +# valid_ip IP_ADDRESS +# if [[ $? -eq 0 ]]; then echo good; else echo bad; fi +# OR +# if valid_ip IP_ADDRESS; then echo good; else echo bad; fi +# +function valid_ip() +{ +local ip=$1 +local stat=1 + +if [[ $ip =~ ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ ]]; then +OIFS=$IFS +IFS='.' 
+ip=($ip) +IFS=$OIFS +[[ ${ip[0]} -le 255 && ${ip[1]} -le 255 \ +&& ${ip[2]} -le 255 && ${ip[3]} -le 255 ]] +stat=$? +fi +return $stat +} + +function check_OS() +{ + #Check if OS is linux + if [[ "$OSTYPE" == "linux-gnu" ]]; then + echo "OS check passed." + else + echo "Only RHEL-Linux is currently supported for this setup. Exiting ..." + exit 0 + fi +} + + +function lxc_setup() +{ + + #Install LXC (and its dependencies) + echo "Beginning installation. Installing lxc on your machine" + sudo yum -y install epel-release + sudo yum -y install lxc lxc-templates libcap-devel libcgroup wget bridge-utils lxc-extra --skip-broken + echo "LXC installation complete." + + + lxcInterface="" + gatewayIP="" + +for interface in ${POSSIBLE_LXC_INTERFACES[@]} +do +echo "Checking if $interface is valid" +ip_address=`ip addr show $interface | grep "inet\b" | awk '{print $2}' | cut -d/ -f1` + +if valid_ip $ip_address; then +echo "Interface $interface is valid for using with LXC instances." +lxcInterface=$interface +gatewayIP=$ip_address +break; +else +echo "Interface $interface does not appear to be valid." +continue; +fi +done + + if [[ -z "$lxcInterface" ]]; then + echo "Did not find a valid network interface for use with LXC. Install LXC manually (https://linuxcontainers.org/lxc/getting-started/) and re-run." 
+ exit 0 + fi + + #Print the valid interface found + echo "Using interface "$lxcInterface "($gatewayIP) for use with LXC" + + # Create shared directory for sharing between base machine and LXC-instances + echo "Creating dir $SHARED_LXC_DIR to be shared between base machine and LXC-instances" + sudo mkdir -p $SHARED_LXC_DIR && sudo chmod 777 $SHARED_LXC_DIR + + # Setting gateway IP address in conf/yarn-site.xml + echo "Setting yarn.resourcemanager.hostname="$gatewayIP in $YARN_SITE_XML + sed -i "/yarn.resourcemanager.hostname<\/name>/!b;n;c$gatewayIP" $YARN_SITE_XML + + # Adding RM bind host in conf/yarn-site.xml + echo "Setting yarn.resourcemanager.bind-host=0.0.0.0" in $YARN_SITE_XML + if [[ ! -z $(grep "yarn.resourcemanager.bind-host" conf/yarn-site.xml) ]]; then + + # Setting yarn.resourcemanager.bind-host to 0.0.0.0 + sed -i "/yarn.resourcemanager.bind-host<\/name>/!b;n;c0.0.0.0" $YARN_SITE_XML + else + + # Appending RM bind host in conf/yarn-site.xml + sed -i
[GitHub] [samza-hello-samza] rmatharu commented on a change in pull request #62: Adding support for lxc on yarn for Samza
rmatharu commented on a change in pull request #62: Adding support for lxc on yarn for Samza URL: https://github.com/apache/samza-hello-samza/pull/62#discussion_r295118327 ## File path: bin/setup-lxc ## @@ -0,0 +1,345 @@ +#!/bin/bash -e +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +# This script will download, setup, start, and stop servers for Kafka, YARN, and ZooKeeper, +# as well as downloading, building and locally publishing Samza + + +COMMAND=$1 +ARG0=$2 +ARG1=$3 + +SHARED_LXC_DIR=/lxc-shared +POSSIBLE_LXC_INTERFACES=( virbr0 lxcbr0) +YARN_SITE_XML=conf/yarn-site.xml +NM_LIVENESS_MS=1 #value of the yarn.nm.liveness-monitor.expiry-interval-ms variable +LXC_INSTANCE_TYPE="fedora" +LXC_ROOTFS_DIR=/var/lib/lxc +LXC_INSTANCE_START_NM_SCRIPT=startNodeManager + +RESOLV_CONF_FILE=/etc/resolv.conf + +# Helper function to test an IP address for validity: +# Usage: +# valid_ip IP_ADDRESS +# if [[ $? -eq 0 ]]; then echo good; else echo bad; fi +# OR +# if valid_ip IP_ADDRESS; then echo good; else echo bad; fi +# +function valid_ip() Review comment: bash doesn't allow that
[GitHub] [samza-hello-samza] vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza
vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza URL: https://github.com/apache/samza-hello-samza/pull/62#discussion_r293135338 ## File path: bin/setup-lxc ## @@ -0,0 +1,345 @@ +#!/bin/bash -e +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +# This script will download, setup, start, and stop servers for Kafka, YARN, and ZooKeeper, +# as well as downloading, building and locally publishing Samza + + +COMMAND=$1 +ARG0=$2 +ARG1=$3 + +SHARED_LXC_DIR=/lxc-shared +POSSIBLE_LXC_INTERFACES=( virbr0 lxcbr0) +YARN_SITE_XML=conf/yarn-site.xml +NM_LIVENESS_MS=1 #value of the yarn.nm.liveness-monitor.expiry-interval-ms variable +LXC_INSTANCE_TYPE="fedora" +LXC_ROOTFS_DIR=/var/lib/lxc +LXC_INSTANCE_START_NM_SCRIPT=startNodeManager + +RESOLV_CONF_FILE=/etc/resolv.conf + +# Helper function to test an IP address for validity: +# Usage: +# valid_ip IP_ADDRESS +# if [[ $? -eq 0 ]]; then echo good; else echo bad; fi +# OR +# if valid_ip IP_ADDRESS; then echo good; else echo bad; fi +# +function valid_ip() +{ +local ip=$1 +local stat=1 + +if [[ $ip =~ ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ ]]; then +OIFS=$IFS +IFS='.' 
+ip=($ip) +IFS=$OIFS +[[ ${ip[0]} -le 255 && ${ip[1]} -le 255 \ +&& ${ip[2]} -le 255 && ${ip[3]} -le 255 ]] +stat=$? +fi +return $stat +} + +function check_OS() +{ + #Check if OS is linux + if [[ "$OSTYPE" == "linux-gnu" ]]; then + echo "OS check passed." + else + echo "Only RHEL-Linux is currently supported for this setup. Exiting ..." + exit 0 + fi +} + + +function lxc_setup() +{ + + #Install LXC (and its dependencies) + echo "Beginning installation. Installing lxc on your machine" + sudo yum -y install epel-release + sudo yum -y install lxc lxc-templates libcap-devel libcgroup wget bridge-utils lxc-extra --skip-broken + echo "LXC installation complete." + + + lxcInterface="" + gatewayIP="" + +for interface in ${POSSIBLE_LXC_INTERFACES[@]} +do +echo "Checking if $interface is valid" +ip_address=`ip addr show $interface | grep "inet\b" | awk '{print $2}' | cut -d/ -f1` + +if valid_ip $ip_address; then +echo "Interface $interface is valid for using with LXC instances." +lxcInterface=$interface +gatewayIP=$ip_address +break; +else +echo "Interface $interface does not appear to be valid." +continue; +fi +done + + if [[ -z "$lxcInterface" ]]; then + echo "Did not find a valid network interface for use with LXC. Install LXC manually (https://linuxcontainers.org/lxc/getting-started/) and re-run." 
+ exit 0 + fi + + #Print the valid interface found + echo "Using interface "$lxcInterface "($gatewayIP) for use with LXC" + + # Create shared directory for sharing between base machine and LXC-instances + echo "Creating dir $SHARED_LXC_DIR to be shared between base machine and LXC-instances" + sudo mkdir -p $SHARED_LXC_DIR && sudo chmod 777 $SHARED_LXC_DIR + + # Setting gateway IP address in conf/yarn-site.xml + echo "Setting yarn.resourcemanager.hostname="$gatewayIP in $YARN_SITE_XML + sed -i "/yarn.resourcemanager.hostname<\/name>/!b;n;c$gatewayIP" $YARN_SITE_XML + + # Adding RM bind host in conf/yarn-site.xml + echo "Setting yarn.resourcemanager.bind-host=0.0.0.0" in $YARN_SITE_XML + if [[ ! -z $(grep "yarn.resourcemanager.bind-host" conf/yarn-site.xml) ]]; then + + # Setting yarn.resourcemanager.bind-host to 0.0.0.0 + sed -i "/yarn.resourcemanager.bind-host<\/name>/!b;n;c0.0.0.0" $YARN_SITE_XML + else + + # Appending RM bind host in conf/yarn-site.xml + sed -i
[GitHub] [samza-hello-samza] vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza
vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza URL: https://github.com/apache/samza-hello-samza/pull/62#discussion_r293126356 ## File path: bin/setup-lxc ## @@ -0,0 +1,345 @@ +#!/bin/bash -e +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +# This script will download, setup, start, and stop servers for Kafka, YARN, and ZooKeeper, +# as well as downloading, building and locally publishing Samza + + +COMMAND=$1 +ARG0=$2 +ARG1=$3 + +SHARED_LXC_DIR=/lxc-shared +POSSIBLE_LXC_INTERFACES=( virbr0 lxcbr0) +YARN_SITE_XML=conf/yarn-site.xml +NM_LIVENESS_MS=1 #value of the yarn.nm.liveness-monitor.expiry-interval-ms variable +LXC_INSTANCE_TYPE="fedora" +LXC_ROOTFS_DIR=/var/lib/lxc +LXC_INSTANCE_START_NM_SCRIPT=startNodeManager + +RESOLV_CONF_FILE=/etc/resolv.conf + +# Helper function to test an IP address for validity: +# Usage: +# valid_ip IP_ADDRESS +# if [[ $? 
-eq 0 ]]; then echo good; else echo bad; fi +# OR +# if valid_ip IP_ADDRESS; then echo good; else echo bad; fi +# +function valid_ip() +{ Review comment: Format functions consistently with the rest of the files, e.g. opening brace on the same line as the declaration and 2-space indents everywhere.
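For illustration, the `valid_ip` helper from the diff could be laid out in that style roughly as follows. This is a sketch only: the logic is kept as it appears in the diff, and just the brace placement and indentation change.

```bash
# Sketch: valid_ip from bin/setup-lxc with the opening brace on the
# declaration line and 2-space indents. Returns 0 for a valid IPv4 address.
function valid_ip() {
  local ip=$1
  local stat=1

  if [[ $ip =~ ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ ]]; then
    # Temporarily split on dots so each octet becomes an array element.
    OIFS=$IFS
    IFS='.'
    ip=($ip)
    IFS=$OIFS
    # Every octet must be at most 255.
    [[ ${ip[0]} -le 255 && ${ip[1]} -le 255 \
      && ${ip[2]} -le 255 && ${ip[3]} -le 255 ]]
    stat=$?
  fi
  return $stat
}
```

Usage stays the same as in the header comment of the diff: `if valid_ip IP_ADDRESS; then echo good; else echo bad; fi`.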
[GitHub] [samza-hello-samza] vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza
vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza URL: https://github.com/apache/samza-hello-samza/pull/62#discussion_r293128829 ## File path: bin/setup-lxc ## @@ -0,0 +1,345 @@ +#!/bin/bash -e +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +# This script will download, setup, start, and stop servers for Kafka, YARN, and ZooKeeper, +# as well as downloading, building and locally publishing Samza + + +COMMAND=$1 +ARG0=$2 +ARG1=$3 + +SHARED_LXC_DIR=/lxc-shared +POSSIBLE_LXC_INTERFACES=( virbr0 lxcbr0) +YARN_SITE_XML=conf/yarn-site.xml +NM_LIVENESS_MS=1 #value of the yarn.nm.liveness-monitor.expiry-interval-ms variable +LXC_INSTANCE_TYPE="fedora" +LXC_ROOTFS_DIR=/var/lib/lxc +LXC_INSTANCE_START_NM_SCRIPT=startNodeManager + +RESOLV_CONF_FILE=/etc/resolv.conf + +# Helper function to test an IP address for validity: +# Usage: +# valid_ip IP_ADDRESS +# if [[ $? -eq 0 ]]; then echo good; else echo bad; fi +# OR +# if valid_ip IP_ADDRESS; then echo good; else echo bad; fi +# +function valid_ip() +{ +local ip=$1 +local stat=1 + +if [[ $ip =~ ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ ]]; then +OIFS=$IFS +IFS='.' 
+ip=($ip) +IFS=$OIFS +[[ ${ip[0]} -le 255 && ${ip[1]} -le 255 \ +&& ${ip[2]} -le 255 && ${ip[3]} -le 255 ]] +stat=$? +fi +return $stat +} + +function check_OS() +{ + #Check if OS is linux + if [[ "$OSTYPE" == "linux-gnu" ]]; then + echo "OS check passed." + else + echo "Only RHEL-Linux is currently supported for this setup. Exiting ..." + exit 0 + fi +} + + +function lxc_setup() +{ + + #Install LXC (and its dependencies) + echo "Beginning installation. Installing lxc on your machine" + sudo yum -y install epel-release + sudo yum -y install lxc lxc-templates libcap-devel libcgroup wget bridge-utils lxc-extra --skip-broken + echo "LXC installation complete." + + + lxcInterface="" + gatewayIP="" + +for interface in ${POSSIBLE_LXC_INTERFACES[@]} +do +echo "Checking if $interface is valid" +ip_address=`ip addr show $interface | grep "inet\b" | awk '{print $2}' | cut -d/ -f1` + +if valid_ip $ip_address; then +echo "Interface $interface is valid for using with LXC instances." +lxcInterface=$interface +gatewayIP=$ip_address +break; +else +echo "Interface $interface does not appear to be valid." +continue; +fi +done + + if [[ -z "$lxcInterface" ]]; then + echo "Did not find a valid network interface for use with LXC. Install LXC manually (https://linuxcontainers.org/lxc/getting-started/) and re-run." + exit 0 Review comment: exit with non-zero code since this is an unexpected result This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
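A minimal sketch of what a non-zero exit could look like here. `require_lxc_interface` is a hypothetical wrapper introduced only for this example; the PR itself performs the check inline in `lxc_setup`.

```bash
# Hypothetical helper (not in the PR): fail with a non-zero status and a
# message on stderr when no usable LXC interface was found.
function require_lxc_interface() {
  local lxcInterface=$1
  if [[ -z "$lxcInterface" ]]; then
    echo "Did not find a valid network interface for use with LXC." >&2
    return 1
  fi
  echo "Using interface $lxcInterface for use with LXC"
}
```

In the script this might be called as `require_lxc_interface "$lxcInterface" || exit 1`, so that callers and wrapper scripts can tell the setup did not complete.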
[GitHub] [samza-hello-samza] vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza
vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza URL: https://github.com/apache/samza-hello-samza/pull/62#discussion_r293128724 ## File path: bin/setup-lxc ## @@ -0,0 +1,345 @@ +#!/bin/bash -e +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +# This script will download, setup, start, and stop servers for Kafka, YARN, and ZooKeeper, +# as well as downloading, building and locally publishing Samza + + +COMMAND=$1 +ARG0=$2 +ARG1=$3 + +SHARED_LXC_DIR=/lxc-shared +POSSIBLE_LXC_INTERFACES=( virbr0 lxcbr0) +YARN_SITE_XML=conf/yarn-site.xml +NM_LIVENESS_MS=1 #value of the yarn.nm.liveness-monitor.expiry-interval-ms variable +LXC_INSTANCE_TYPE="fedora" +LXC_ROOTFS_DIR=/var/lib/lxc +LXC_INSTANCE_START_NM_SCRIPT=startNodeManager + +RESOLV_CONF_FILE=/etc/resolv.conf + +# Helper function to test an IP address for validity: +# Usage: +# valid_ip IP_ADDRESS +# if [[ $? -eq 0 ]]; then echo good; else echo bad; fi +# OR +# if valid_ip IP_ADDRESS; then echo good; else echo bad; fi +# +function valid_ip() +{ +local ip=$1 +local stat=1 + +if [[ $ip =~ ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ ]]; then +OIFS=$IFS +IFS='.' 
+ip=($ip) +IFS=$OIFS +[[ ${ip[0]} -le 255 && ${ip[1]} -le 255 \ +&& ${ip[2]} -le 255 && ${ip[3]} -le 255 ]] +stat=$? +fi +return $stat +} + +function check_OS() +{ + #Check if OS is linux + if [[ "$OSTYPE" == "linux-gnu" ]]; then + echo "OS check passed." + else + echo "Only RHEL-Linux is currently supported for this setup. Exiting ..." + exit 0 + fi +} + + +function lxc_setup() +{ + + #Install LXC (and its dependencies) + echo "Beginning installation. Installing lxc on your machine" + sudo yum -y install epel-release + sudo yum -y install lxc lxc-templates libcap-devel libcgroup wget bridge-utils lxc-extra --skip-broken + echo "LXC installation complete." + + + lxcInterface="" + gatewayIP="" + +for interface in ${POSSIBLE_LXC_INTERFACES[@]} +do +echo "Checking if $interface is valid" +ip_address=`ip addr show $interface | grep "inet\b" | awk '{print $2}' | cut -d/ -f1` + +if valid_ip $ip_address; then +echo "Interface $interface is valid for using with LXC instances." +lxcInterface=$interface +gatewayIP=$ip_address +break; +else +echo "Interface $interface does not appear to be valid." +continue; Review comment: this loop will continue anyways to check for the next interface. don't think you need a continue here This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
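The point about `continue` can be sketched with a reduced version of the loop. Both function names below are hypothetical, and `is_usable` is a stand-in for the `ip addr show` plus `valid_ip` check in the diff, hard-coded here so the example is self-contained.

```bash
# Hypothetical stand-in for the interface check in the diff; here it
# simply accepts lxcbr0 so the sketch does not depend on real interfaces.
function is_usable() {
  [[ $1 == lxcbr0 ]]
}

# Prints the first usable candidate and returns 0, or returns 1 if none fits.
function pick_interface() {
  local interface
  for interface in "$@"; do
    if is_usable "$interface"; then
      echo "$interface"
      return 0
    fi
    # No explicit continue: the for loop moves on to the next candidate.
  done
  return 1
}
```

For example, `pick_interface virbr0 lxcbr0` prints `lxcbr0` with this stand-in check.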
[GitHub] [samza-hello-samza] vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza
vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza URL: https://github.com/apache/samza-hello-samza/pull/62#discussion_r293127785 ## File path: bin/setup-lxc ## @@ -0,0 +1,345 @@ +#!/bin/bash -e +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +# This script will download, setup, start, and stop servers for Kafka, YARN, and ZooKeeper, +# as well as downloading, building and locally publishing Samza + + +COMMAND=$1 +ARG0=$2 +ARG1=$3 + +SHARED_LXC_DIR=/lxc-shared +POSSIBLE_LXC_INTERFACES=( virbr0 lxcbr0) +YARN_SITE_XML=conf/yarn-site.xml +NM_LIVENESS_MS=1 #value of the yarn.nm.liveness-monitor.expiry-interval-ms variable +LXC_INSTANCE_TYPE="fedora" +LXC_ROOTFS_DIR=/var/lib/lxc +LXC_INSTANCE_START_NM_SCRIPT=startNodeManager + +RESOLV_CONF_FILE=/etc/resolv.conf + +# Helper function to test an IP address for validity: +# Usage: +# valid_ip IP_ADDRESS +# if [[ $? -eq 0 ]]; then echo good; else echo bad; fi +# OR +# if valid_ip IP_ADDRESS; then echo good; else echo bad; fi +# +function valid_ip() +{ +local ip=$1 +local stat=1 + +if [[ $ip =~ ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ ]]; then +OIFS=$IFS +IFS='.' 
+ip=($ip) +IFS=$OIFS +[[ ${ip[0]} -le 255 && ${ip[1]} -le 255 \ +&& ${ip[2]} -le 255 && ${ip[3]} -le 255 ]] +stat=$? +fi +return $stat +} + +function check_OS() +{ + #Check if OS is linux + if [[ "$OSTYPE" == "linux-gnu" ]]; then + echo "OS check passed." + else + echo "Only RHEL-Linux is currently supported for this setup. Exiting ..." + exit 0 + fi +} + + +function lxc_setup() +{ + + #Install LXC (and its dependencies) + echo "Beginning installation. Installing lxc on your machine" + sudo yum -y install epel-release + sudo yum -y install lxc lxc-templates libcap-devel libcgroup wget bridge-utils lxc-extra --skip-broken + echo "LXC installation complete." Review comment: capitalize consistently lxc vs LXC This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [samza-hello-samza] vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza
vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza
URL: https://github.com/apache/samza-hello-samza/pull/62#discussion_r293127513

## File path: bin/setup-lxc ##
@@ -0,0 +1,345 @@
+#!/bin/bash -e
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# This script will download, setup, start, and stop servers for Kafka, YARN, and ZooKeeper,
+# as well as downloading, building and locally publishing Samza
+
+
+COMMAND=$1
+ARG0=$2
+ARG1=$3
+
+SHARED_LXC_DIR=/lxc-shared
+POSSIBLE_LXC_INTERFACES=( virbr0 lxcbr0)
+YARN_SITE_XML=conf/yarn-site.xml
+NM_LIVENESS_MS=1 #value of the yarn.nm.liveness-monitor.expiry-interval-ms variable
+LXC_INSTANCE_TYPE="fedora"
+LXC_ROOTFS_DIR=/var/lib/lxc
+LXC_INSTANCE_START_NM_SCRIPT=startNodeManager
+
+RESOLV_CONF_FILE=/etc/resolv.conf
+
+# Helper function to test an IP address for validity:
+# Usage:
+# valid_ip IP_ADDRESS
+# if [[ $? -eq 0 ]]; then echo good; else echo bad; fi
+# OR
+# if valid_ip IP_ADDRESS; then echo good; else echo bad; fi
+#
+function valid_ip()
+{
+local ip=$1
+local stat=1
+
+if [[ $ip =~ ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ ]]; then
+OIFS=$IFS
+IFS='.'
+ip=($ip)
+IFS=$OIFS
+[[ ${ip[0]} -le 255 && ${ip[1]} -le 255 \
+&& ${ip[2]} -le 255 && ${ip[3]} -le 255 ]]
+stat=$?
+fi
+return $stat
+}
+
+function check_OS()
+{
+  #Check if OS is linux
+  if [[ "$OSTYPE" == "linux-gnu" ]]; then
+    echo "OS check passed."
+  else
+    echo "Only RHEL-Linux is currently supported for this setup. Exiting ..."
+    exit 0
+  fi
+}
+
+
+function lxc_setup()

Review comment:
Prefer function names that start with a verb, e.g. setup_lxc.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services
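The naming suggestion above can be sketched roughly as follows. This is only an illustration, not the PR's code: the function bodies are placeholders, and the `setup` and `start` subcommand names and the `run_command` dispatcher are hypothetical, since the quoted hunk does not show how COMMAND is dispatched.

```shell
#!/bin/bash
# Sketch only: bodies are placeholders, not the real logic of bin/setup-lxc.

# Verb-first names read as actions at the call site (setup_lxc, not lxc_setup).
function setup_lxc() {
  echo "setting up lxc"
}

function start_instances() {
  echo "starting ${1:-1} instance(s)"
}

# Hypothetical dispatcher: maps a subcommand to a verb-first function.
function run_command() {
  case "$1" in
    setup) setup_lxc ;;
    start) start_instances "$2" ;;
    *) echo "Usage: $0 setup|start [count]" >&2; return 1 ;;
  esac
}
```

With names like these, a dispatch line such as `run_command "$COMMAND" "$ARG0"` reads as a sequence of actions.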
[GitHub] [samza-hello-samza] vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza
vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza
URL: https://github.com/apache/samza-hello-samza/pull/62#discussion_r293128114

## File path: bin/setup-lxc ##
@@ -0,0 +1,345 @@
+#!/bin/bash -e
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# This script will download, setup, start, and stop servers for Kafka, YARN, and ZooKeeper,
+# as well as downloading, building and locally publishing Samza
+
+
+COMMAND=$1
+ARG0=$2
+ARG1=$3
+
+SHARED_LXC_DIR=/lxc-shared
+POSSIBLE_LXC_INTERFACES=( virbr0 lxcbr0)
+YARN_SITE_XML=conf/yarn-site.xml
+NM_LIVENESS_MS=1 #value of the yarn.nm.liveness-monitor.expiry-interval-ms variable
+LXC_INSTANCE_TYPE="fedora"
+LXC_ROOTFS_DIR=/var/lib/lxc
+LXC_INSTANCE_START_NM_SCRIPT=startNodeManager
+
+RESOLV_CONF_FILE=/etc/resolv.conf
+
+# Helper function to test an IP address for validity:
+# Usage:
+# valid_ip IP_ADDRESS
+# if [[ $? -eq 0 ]]; then echo good; else echo bad; fi
+# OR
+# if valid_ip IP_ADDRESS; then echo good; else echo bad; fi
+#
+function valid_ip()
+{
+local ip=$1
+local stat=1
+
+if [[ $ip =~ ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ ]]; then
+OIFS=$IFS
+IFS='.'
+ip=($ip)
+IFS=$OIFS
+[[ ${ip[0]} -le 255 && ${ip[1]} -le 255 \
+&& ${ip[2]} -le 255 && ${ip[3]} -le 255 ]]
+stat=$?
+fi
+return $stat
+}
+
+function check_OS()
+{
+  #Check if OS is linux
+  if [[ "$OSTYPE" == "linux-gnu" ]]; then
+    echo "OS check passed."
+  else
+    echo "Only RHEL-Linux is currently supported for this setup. Exiting ..."
+    exit 0
+  fi
+}
+
+
+function lxc_setup()
+{
+
+  #Install LXC (and its dependencies)
+  echo "Beginning installation. Installing lxc on your machine"
+  sudo yum -y install epel-release
+  sudo yum -y install lxc lxc-templates libcap-devel libcgroup wget bridge-utils lxc-extra --skip-broken
+  echo "LXC installation complete."
+
+
+  lxcInterface=""

Review comment:
IIRC, the camel-case convention for variable names is Java-only. Use underscores here, e.g. lxc_interface.
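As a rough sketch of the underscore convention suggested above, the interface-detection loop could look something like this. The `lookup_ip` stub and the `find_lxc_interface` wrapper are hypothetical and exist only so the sketch runs without real network interfaces; the actual script parses `ip addr show` output instead.

```shell
#!/bin/bash
# Constants stay UPPER_CASE; mutable and local variables use lower_snake_case.
POSSIBLE_LXC_INTERFACES=(virbr0 lxcbr0)

# Hypothetical stand-in for parsing `ip addr show "$1"`, so the sketch runs anywhere.
function lookup_ip() {
  case "$1" in
    lxcbr0) echo "10.0.3.1" ;;
    *) echo "" ;;
  esac
}

# Prints "<interface> <gateway ip>" for the first interface that has an address.
function find_lxc_interface() {
  local lxc_interface=""
  local gateway_ip=""
  local interface ip_address
  for interface in "${POSSIBLE_LXC_INTERFACES[@]}"; do
    ip_address=$(lookup_ip "$interface")
    if [[ -n "$ip_address" ]]; then
      lxc_interface=$interface
      gateway_ip=$ip_address
      break
    fi
  done
  # Signal failure to the caller when no candidate interface has an address.
  [[ -n "$lxc_interface" ]] || return 1
  echo "$lxc_interface $gateway_ip"
}
```

Declaring the variables `local` also keeps them out of the global namespace, which the original top-level assignments do not.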
[GitHub] [samza-hello-samza] vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza
vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza URL: https://github.com/apache/samza-hello-samza/pull/62#discussion_r293131208 ## File path: bin/setup-lxc ## @@ -0,0 +1,345 @@ +#!/bin/bash -e +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +# This script will download, setup, start, and stop servers for Kafka, YARN, and ZooKeeper, +# as well as downloading, building and locally publishing Samza + + +COMMAND=$1 +ARG0=$2 +ARG1=$3 + +SHARED_LXC_DIR=/lxc-shared +POSSIBLE_LXC_INTERFACES=( virbr0 lxcbr0) +YARN_SITE_XML=conf/yarn-site.xml +NM_LIVENESS_MS=1 #value of the yarn.nm.liveness-monitor.expiry-interval-ms variable +LXC_INSTANCE_TYPE="fedora" +LXC_ROOTFS_DIR=/var/lib/lxc +LXC_INSTANCE_START_NM_SCRIPT=startNodeManager + +RESOLV_CONF_FILE=/etc/resolv.conf + +# Helper function to test an IP address for validity: +# Usage: +# valid_ip IP_ADDRESS +# if [[ $? -eq 0 ]]; then echo good; else echo bad; fi +# OR +# if valid_ip IP_ADDRESS; then echo good; else echo bad; fi +# +function valid_ip() +{ +local ip=$1 +local stat=1 + +if [[ $ip =~ ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ ]]; then +OIFS=$IFS +IFS='.' 
+ip=($ip) +IFS=$OIFS +[[ ${ip[0]} -le 255 && ${ip[1]} -le 255 \ +&& ${ip[2]} -le 255 && ${ip[3]} -le 255 ]] +stat=$? +fi +return $stat +} + +function check_OS() +{ + #Check if OS is linux + if [[ "$OSTYPE" == "linux-gnu" ]]; then + echo "OS check passed." + else + echo "Only RHEL-Linux is currently supported for this setup. Exiting ..." + exit 0 + fi +} + + +function lxc_setup() +{ + + #Install LXC (and its dependencies) + echo "Beginning installation. Installing lxc on your machine" + sudo yum -y install epel-release + sudo yum -y install lxc lxc-templates libcap-devel libcgroup wget bridge-utils lxc-extra --skip-broken + echo "LXC installation complete." + + + lxcInterface="" + gatewayIP="" + +for interface in ${POSSIBLE_LXC_INTERFACES[@]} +do +echo "Checking if $interface is valid" +ip_address=`ip addr show $interface | grep "inet\b" | awk '{print $2}' | cut -d/ -f1` + +if valid_ip $ip_address; then +echo "Interface $interface is valid for using with LXC instances." +lxcInterface=$interface +gatewayIP=$ip_address +break; +else +echo "Interface $interface does not appear to be valid." +continue; +fi +done + + if [[ -z "$lxcInterface" ]]; then + echo "Did not find a valid network interface for use with LXC. Install LXC manually (https://linuxcontainers.org/lxc/getting-started/) and re-run." 
+ exit 0 + fi + + #Print the valid interface found + echo "Using interface "$lxcInterface "($gatewayIP) for use with LXC" + + # Create shared directory for sharing between base machine and LXC-instances + echo "Creating dir $SHARED_LXC_DIR to be shared between base machine and LXC-instances" + sudo mkdir -p $SHARED_LXC_DIR && sudo chmod 777 $SHARED_LXC_DIR + + # Setting gateway IP address in conf/yarn-site.xml + echo "Setting yarn.resourcemanager.hostname="$gatewayIP in $YARN_SITE_XML + sed -i "/yarn.resourcemanager.hostname<\/name>/!b;n;c$gatewayIP" $YARN_SITE_XML + + # Adding RM bind host in conf/yarn-site.xml + echo "Setting yarn.resourcemanager.bind-host=0.0.0.0" in $YARN_SITE_XML + if [[ ! -z $(grep "yarn.resourcemanager.bind-host" conf/yarn-site.xml) ]]; then + + # Setting yarn.resourcemanager.bind-host to 0.0.0.0 + sed -i "/yarn.resourcemanager.bind-host<\/name>/!b;n;c0.0.0.0" $YARN_SITE_XML + else + + # Appending RM bind host in conf/yarn-site.xml + sed -i
[GitHub] [samza-hello-samza] vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza
vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza
URL: https://github.com/apache/samza-hello-samza/pull/62#discussion_r293127099

## File path: bin/setup-lxc ##
@@ -0,0 +1,345 @@
+#!/bin/bash -e
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# This script will download, setup, start, and stop servers for Kafka, YARN, and ZooKeeper,
+# as well as downloading, building and locally publishing Samza
+
+
+COMMAND=$1
+ARG0=$2
+ARG1=$3
+
+SHARED_LXC_DIR=/lxc-shared
+POSSIBLE_LXC_INTERFACES=( virbr0 lxcbr0)
+YARN_SITE_XML=conf/yarn-site.xml
+NM_LIVENESS_MS=1 #value of the yarn.nm.liveness-monitor.expiry-interval-ms variable
+LXC_INSTANCE_TYPE="fedora"
+LXC_ROOTFS_DIR=/var/lib/lxc
+LXC_INSTANCE_START_NM_SCRIPT=startNodeManager
+
+RESOLV_CONF_FILE=/etc/resolv.conf
+
+# Helper function to test an IP address for validity:
+# Usage:
+# valid_ip IP_ADDRESS
+# if [[ $? -eq 0 ]]; then echo good; else echo bad; fi
+# OR
+# if valid_ip IP_ADDRESS; then echo good; else echo bad; fi
+#
+function valid_ip()
+{
+local ip=$1
+local stat=1
+
+if [[ $ip =~ ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ ]]; then
+OIFS=$IFS
+IFS='.'
+ip=($ip)
+IFS=$OIFS
+[[ ${ip[0]} -le 255 && ${ip[1]} -le 255 \
+&& ${ip[2]} -le 255 && ${ip[3]} -le 255 ]]
+stat=$?
+fi
+return $stat
+}
+
+function check_OS()
+{
+  #Check if OS is linux
+  if [[ "$OSTYPE" == "linux-gnu" ]]; then
+    echo "OS check passed."
+  else
+    echo "Only RHEL-Linux is currently supported for this setup. Exiting ..."
+    exit 0

Review comment:
Should this exit with a non-zero code?
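One way the question above could be addressed, as a minimal sketch: have the check return a non-zero status and let the top level decide to exit. The `check_os` name and its optional argument are assumptions made for illustration and testability; the reviewed script reads `$OSTYPE` directly and is named `check_OS`.

```shell
#!/bin/bash
# Sketch: report an unsupported OS on stderr and signal failure to the caller.
# The optional first argument (an assumption, for testing) overrides $OSTYPE.
function check_os() {
  local os_type=${1:-$OSTYPE}
  if [[ "$os_type" == "linux-gnu" ]]; then
    echo "OS check passed."
  else
    echo "Only RHEL-Linux is currently supported for this setup. Exiting ..." >&2
    return 1
  fi
}

# At the top level the script would then stop with a failure status:
#   check_os || exit 1
```

Exiting with 0 tells a calling script or CI job that setup succeeded, so a wrapper would carry on even though nothing was installed; a non-zero status lets callers stop early.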
[GitHub] [samza-hello-samza] vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza
vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza URL: https://github.com/apache/samza-hello-samza/pull/62#discussion_r293130994 ## File path: bin/setup-lxc ## @@ -0,0 +1,345 @@ +#!/bin/bash -e +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +# This script will download, setup, start, and stop servers for Kafka, YARN, and ZooKeeper, +# as well as downloading, building and locally publishing Samza + + +COMMAND=$1 +ARG0=$2 +ARG1=$3 + +SHARED_LXC_DIR=/lxc-shared +POSSIBLE_LXC_INTERFACES=( virbr0 lxcbr0) +YARN_SITE_XML=conf/yarn-site.xml +NM_LIVENESS_MS=1 #value of the yarn.nm.liveness-monitor.expiry-interval-ms variable +LXC_INSTANCE_TYPE="fedora" +LXC_ROOTFS_DIR=/var/lib/lxc +LXC_INSTANCE_START_NM_SCRIPT=startNodeManager + +RESOLV_CONF_FILE=/etc/resolv.conf + +# Helper function to test an IP address for validity: +# Usage: +# valid_ip IP_ADDRESS +# if [[ $? -eq 0 ]]; then echo good; else echo bad; fi +# OR +# if valid_ip IP_ADDRESS; then echo good; else echo bad; fi +# +function valid_ip() +{ +local ip=$1 +local stat=1 + +if [[ $ip =~ ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ ]]; then +OIFS=$IFS +IFS='.' 
+ip=($ip) +IFS=$OIFS +[[ ${ip[0]} -le 255 && ${ip[1]} -le 255 \ +&& ${ip[2]} -le 255 && ${ip[3]} -le 255 ]] +stat=$? +fi +return $stat +} + +function check_OS() +{ + #Check if OS is linux + if [[ "$OSTYPE" == "linux-gnu" ]]; then + echo "OS check passed." + else + echo "Only RHEL-Linux is currently supported for this setup. Exiting ..." + exit 0 + fi +} + + +function lxc_setup() +{ + + #Install LXC (and its dependencies) + echo "Beginning installation. Installing lxc on your machine" + sudo yum -y install epel-release + sudo yum -y install lxc lxc-templates libcap-devel libcgroup wget bridge-utils lxc-extra --skip-broken + echo "LXC installation complete." + + + lxcInterface="" + gatewayIP="" + +for interface in ${POSSIBLE_LXC_INTERFACES[@]} +do +echo "Checking if $interface is valid" +ip_address=`ip addr show $interface | grep "inet\b" | awk '{print $2}' | cut -d/ -f1` + +if valid_ip $ip_address; then +echo "Interface $interface is valid for using with LXC instances." +lxcInterface=$interface +gatewayIP=$ip_address +break; +else +echo "Interface $interface does not appear to be valid." +continue; +fi +done + + if [[ -z "$lxcInterface" ]]; then + echo "Did not find a valid network interface for use with LXC. Install LXC manually (https://linuxcontainers.org/lxc/getting-started/) and re-run." 
+ exit 0 + fi + + #Print the valid interface found + echo "Using interface "$lxcInterface "($gatewayIP) for use with LXC" + + # Create shared directory for sharing between base machine and LXC-instances + echo "Creating dir $SHARED_LXC_DIR to be shared between base machine and LXC-instances" + sudo mkdir -p $SHARED_LXC_DIR && sudo chmod 777 $SHARED_LXC_DIR + + # Setting gateway IP address in conf/yarn-site.xml + echo "Setting yarn.resourcemanager.hostname="$gatewayIP in $YARN_SITE_XML + sed -i "/yarn.resourcemanager.hostname<\/name>/!b;n;c$gatewayIP" $YARN_SITE_XML + + # Adding RM bind host in conf/yarn-site.xml + echo "Setting yarn.resourcemanager.bind-host=0.0.0.0" in $YARN_SITE_XML + if [[ ! -z $(grep "yarn.resourcemanager.bind-host" conf/yarn-site.xml) ]]; then + + # Setting yarn.resourcemanager.bind-host to 0.0.0.0 + sed -i "/yarn.resourcemanager.bind-host<\/name>/!b;n;c0.0.0.0" $YARN_SITE_XML + else + + # Appending RM bind host in conf/yarn-site.xml + sed -i
[GitHub] [samza-hello-samza] vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza
vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza
URL: https://github.com/apache/samza-hello-samza/pull/62#discussion_r293127151

## File path: bin/setup-lxc ##
@@ -0,0 +1,345 @@
+#!/bin/bash -e
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# This script will download, setup, start, and stop servers for Kafka, YARN, and ZooKeeper,
+# as well as downloading, building and locally publishing Samza
+
+
+COMMAND=$1
+ARG0=$2
+ARG1=$3
+
+SHARED_LXC_DIR=/lxc-shared
+POSSIBLE_LXC_INTERFACES=( virbr0 lxcbr0)
+YARN_SITE_XML=conf/yarn-site.xml
+NM_LIVENESS_MS=1 #value of the yarn.nm.liveness-monitor.expiry-interval-ms variable
+LXC_INSTANCE_TYPE="fedora"
+LXC_ROOTFS_DIR=/var/lib/lxc
+LXC_INSTANCE_START_NM_SCRIPT=startNodeManager
+
+RESOLV_CONF_FILE=/etc/resolv.conf
+
+# Helper function to test an IP address for validity:
+# Usage:
+# valid_ip IP_ADDRESS
+# if [[ $? -eq 0 ]]; then echo good; else echo bad; fi
+# OR
+# if valid_ip IP_ADDRESS; then echo good; else echo bad; fi
+#
+function valid_ip()

Review comment:
Move this helper to the end of the file.
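Moving a helper to the end of the file works in bash only if nothing calls it before its definition has been read. A minimal sketch of one common layout follows, assuming the script's command handling can be wrapped in a hypothetical `main` function that is invoked on the last line. The `valid_ip` body here is a simplified variant written for this illustration, not the PR's implementation.

```shell
#!/bin/bash
# Sketch: bash looks up a function when the call runs, not when the caller is
# defined, so main may reference valid_ip even though it is defined further down.

function main() {
  if valid_ip "$1"; then
    echo "good"
  else
    echo "bad"
  fi
}

# --- helpers live at the end of the file ---

# Returns 0 if $1 looks like a dotted-quad IPv4 address with octets <= 255.
function valid_ip() {
  local ip=$1
  local octet
  local -a octets
  [[ $ip =~ ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ ]] || return 1
  IFS='.' read -r -a octets <<< "$ip"
  for octet in "${octets[@]}"; do
    # 10# forces base 10 so octets such as 08 are not parsed as octal.
    (( 10#$octet <= 255 )) || return 1
  done
}

# The entry point comes last, after every definition has been read.
main "$@"
```

If the top-level code stayed at the top of the file instead, a call to `valid_ip` there would fail with "command not found", which is the main thing to watch for when applying this suggestion.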
[GitHub] [samza-hello-samza] sborya merged pull request #66: update to samza 1.2.0
sborya merged pull request #66: update to samza 1.2.0 URL: https://github.com/apache/samza-hello-samza/pull/66
[GitHub] [samza-hello-samza] sborya opened a new pull request #66: update to samza 1.2.0
sborya opened a new pull request #66: update to samza 1.2.0 URL: https://github.com/apache/samza-hello-samza/pull/66
[GitHub] [samza-hello-samza] weisong44 merged pull request #65: SAMZA-2223: Update Couchbase example to use NoOpTableReadFunction
weisong44 merged pull request #65: SAMZA-2223: Update Couchbase example to use NoOpTableReadFunction URL: https://github.com/apache/samza-hello-samza/pull/65
[GitHub] [samza-hello-samza] weisong44 opened a new pull request #65: SAMZA-2223: Update Couchbase example to use NoOpTableReadFunction
weisong44 opened a new pull request #65: SAMZA-2223: Update Couchbase example to use NoOpTableReadFunction URL: https://github.com/apache/samza-hello-samza/pull/65
[GitHub] [samza-hello-samza] weisong44 merged pull request #64: SAMZA-2218: Add a Couchbase example to samza-hello-samza
weisong44 merged pull request #64: SAMZA-2218: Add a Couchbase example to samza-hello-samza URL: https://github.com/apache/samza-hello-samza/pull/64
[GitHub] [samza-hello-samza] weisong44 commented on a change in pull request #64: SAMZA-2218: Add a Couchbase example to samza-hello-samza
weisong44 commented on a change in pull request #64: SAMZA-2218: Add a Couchbase example to samza-hello-samza URL: https://github.com/apache/samza-hello-samza/pull/64#discussion_r287457502 ## File path: src/main/java/samza/examples/cookbook/CouchbaseTableExample.java ## @@ -0,0 +1,259 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package samza.examples.cookbook; + +import com.couchbase.client.java.document.json.JsonObject; +import com.google.common.base.Preconditions; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.ImmutableMap; +import java.text.SimpleDateFormat; +import java.time.Duration; +import java.util.Arrays; +import java.util.Date; +import java.util.List; +import java.util.Map; +import java.util.concurrent.CompletableFuture; +import java.util.concurrent.TimeUnit; +import org.apache.samza.SamzaException; +import org.apache.samza.application.StreamApplication; +import org.apache.samza.application.descriptors.StreamApplicationDescriptor; +import org.apache.samza.context.Context; +import org.apache.samza.operators.MessageStream; +import org.apache.samza.operators.OutputStream; +import org.apache.samza.operators.functions.MapFunction; +import org.apache.samza.serializers.StringSerde; +import org.apache.samza.system.kafka.descriptors.KafkaInputDescriptor; +import org.apache.samza.system.kafka.descriptors.KafkaOutputDescriptor; +import org.apache.samza.system.kafka.descriptors.KafkaSystemDescriptor; +import org.apache.samza.table.descriptors.RemoteTableDescriptor; +import org.apache.samza.table.remote.BaseTableFunction; +import org.apache.samza.table.remote.RemoteTable; +import org.apache.samza.table.remote.TableReadFunction; +import org.apache.samza.table.remote.couchbase.CouchbaseTableWriteFunction; +import org.apache.samza.table.retry.TableRetryPolicy; + + +/** + * This is a simple word count example using a remote store. + * + * In this example, we use Couchbase to demonstrate how to invoke API's on a remote store other than get, put or delete + * as defined in {@link org.apache.samza.table.remote.AsyncRemoteTable}. Input messages are collected from user through + * a Kafka console producer, and tokenized using space. For each word, we increment a counter for this word + * as well as a counter for all words on Couchbase. 
We also output the current value of both counters to Kafka console + * consumer. + * + * A rate limit of 4 requests/second to Couchbase is set of the entire job, internally Samza uses an embedded + * rate limiter, which evenly distributes the total rate limit among tasks. As we invoke 2 calls on Couchbase + * for each word, you should see roughly 2 messages per second in the Kafka console consumer + * window. + * + * A retry policy with 1 second fixed backoff time and max 3 retries is attached to the remote table. + * + * Concepts covered: remote table, rate limiter, retry, arbitrary operation on remote store. + * + * To run the below example: + * + * + * + * Create a Couchbase instance using docker; Log into the admin UI at http://localhost:8091 (Administrator/password) + * create a bucket called "my-bucket" + * Under Security tab, create a user with the same name, set 123456 as the password, and give it "Data Reader" + * and "Data Writer" privilege for this bucket. + * More information can be found at https://docs.couchbase.com/server/current/getting-started/do-a-quick-install.html + * + * + * Create Kafka topics "word-input" and "count-output" + * ./deploy/kafka/bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic word-input --partitions 2 --replication-factor 1 + * ./deploy/kafka/bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic count-output --partitions 2 --replication-factor 1 + * + * + * Run the application using the run-app.sh script + * ./deploy/samza/bin/run-app.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file://$PWD/deploy/samza/config/couchbase-table-example.properties + * + * + * Consume messages from the output topic + * ./deploy/kafka/bin/kafka-console-consumer.sh
[GitHub] [samza-hello-samza] dengpanyin commented on a change in pull request #64: SAMZA-2218: Add a Couchbase example to samza-hello-samza
dengpanyin commented on a change in pull request #64: SAMZA-2218: Add a Couchbase example to samza-hello-samza URL: https://github.com/apache/samza-hello-samza/pull/64#discussion_r287431784 ## File path: src/main/java/samza/examples/cookbook/CouchbaseTableExample.java ## @@ -0,0 +1,259 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package samza.examples.cookbook; + +import com.couchbase.client.java.document.json.JsonObject; +import com.google.common.base.Preconditions; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.ImmutableMap; +import java.text.SimpleDateFormat; +import java.time.Duration; +import java.util.Arrays; +import java.util.Date; +import java.util.List; +import java.util.Map; +import java.util.concurrent.CompletableFuture; +import java.util.concurrent.TimeUnit; +import org.apache.samza.SamzaException; +import org.apache.samza.application.StreamApplication; +import org.apache.samza.application.descriptors.StreamApplicationDescriptor; +import org.apache.samza.context.Context; +import org.apache.samza.operators.MessageStream; +import org.apache.samza.operators.OutputStream; +import org.apache.samza.operators.functions.MapFunction; +import org.apache.samza.serializers.StringSerde; +import org.apache.samza.system.kafka.descriptors.KafkaInputDescriptor; +import org.apache.samza.system.kafka.descriptors.KafkaOutputDescriptor; +import org.apache.samza.system.kafka.descriptors.KafkaSystemDescriptor; +import org.apache.samza.table.descriptors.RemoteTableDescriptor; +import org.apache.samza.table.remote.BaseTableFunction; +import org.apache.samza.table.remote.RemoteTable; +import org.apache.samza.table.remote.TableReadFunction; +import org.apache.samza.table.remote.couchbase.CouchbaseTableWriteFunction; +import org.apache.samza.table.retry.TableRetryPolicy; + + +/** + * This is a simple word count example using a remote store. + * + * In this example, we use Couchbase to demonstrate how to invoke API's on a remote store other than get, put or delete + * as defined in {@link org.apache.samza.table.remote.AsyncRemoteTable}. Input messages are collected from user through + * a Kafka console producer, and tokenized using space. For each word, we increment a counter for this word + * as well as a counter for all words on Couchbase. 
We also output the current value of both counters to Kafka console + * consumer. + * + * A rate limit of 4 requests/second to Couchbase is set of the entire job, internally Samza uses an embedded + * rate limiter, which evenly distributes the total rate limit among tasks. As we invoke 2 calls on Couchbase + * for each word, you should see roughly 2 messages per second in the Kafka console consumer + * window. + * + * A retry policy with 1 second fixed backoff time and max 3 retries is attached to the remote table. + * + * Concepts covered: remote table, rate limiter, retry, arbitrary operation on remote store. + * + * To run the below example: + * + * + * + * Create a Couchbase instance using docker; Log into the admin UI at http://localhost:8091 (Administrator/password) + * create a bucket called "my-bucket" + * Under Security tab, create a user with the same name, set 123456 as the password, and give it "Data Reader" + * and "Data Writer" privilege for this bucket. + * More information can be found at https://docs.couchbase.com/server/current/getting-started/do-a-quick-install.html + * + * + * Create Kafka topics "word-input" and "count-output" + * ./deploy/kafka/bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic word-input --partitions 2 --replication-factor 1 + * ./deploy/kafka/bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic count-output --partitions 2 --replication-factor 1 + * + * + * Run the application using the run-app.sh script + * ./deploy/samza/bin/run-app.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file://$PWD/deploy/samza/config/couchbase-table-example.properties + * + * + * Consume messages from the output topic + * ./deploy/kafka/bin/kafka-console-consumer.sh
[GitHub] [samza-hello-samza] weisong44 opened a new pull request #64: SAMZA-2218: Add a Couchbase example to samza-hello-samza
weisong44 opened a new pull request #64: SAMZA-2218: Add a Couchbase example to samza-hello-samza URL: https://github.com/apache/samza-hello-samza/pull/64 As per subject
[GitHub] [samza-hello-samza] weisong44 merged pull request #63: Fixed build failure due to changes in Samza project
weisong44 merged pull request #63: Fixed build failure due to changes in Samza project URL: https://github.com/apache/samza-hello-samza/pull/63
[GitHub] [samza-hello-samza] weisong44 opened a new pull request #63: Fixed build failure due to changes in Samza project
weisong44 opened a new pull request #63: Fixed build failure due to changes in Samza project URL: https://github.com/apache/samza-hello-samza/pull/63