[GitHub] [samza-hello-samza] james-deee opened a new pull request, #87: Upgrade hello-samza to latest deps and prepare for Java 11

2022-08-18 Thread GitBox


james-deee opened a new pull request, #87:
URL: https://github.com/apache/samza-hello-samza/pull/87

   This prepares the app to work with Java 11. Once this Samza PR is merged 
and released, we can update the pom.xml file to use that version.
   
   I am also deleting the Gradle artifacts because this whole project 
actually uses Maven (not Gradle, which was at version 2 or something).
   
   Verified that these version bumps (including the use of Scala 2.12 from 
Samza) all work. Note that this also moves Hadoop to a Java 11 compatible 
version.
   
   I have also tested with my local build from the Samza PR above, using 
Java 11, and all of this works as well.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza-hello-samza] dependabot[bot] opened a new pull request, #85: Bump hadoop-common from 2.6.1 to 3.2.3

2022-04-12 Thread GitBox


dependabot[bot] opened a new pull request, #85:
URL: https://github.com/apache/samza-hello-samza/pull/85

   Bumps hadoop-common from 2.6.1 to 3.2.3.
   
   
   
   





[GitHub] [samza-hello-samza] naleon opened a new pull request #84: [FIX] version jetty transitive dependency.

2021-07-11 Thread GitBox


naleon opened a new pull request #84:
URL: https://github.com/apache/samza-hello-samza/pull/84


   I got "java.lang.NoClassDefFoundError: 
org/eclipse/jetty/http/HttpCookie$SetCookieHttpField" when I tried to 
deploy hello-samza-1.6.0-dist.tar.gz.
   This dependency exclusion fixes it.
   You can see it in the dependency tree.
   
   ```
   [INFO] 
   [INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ hello-samza ---
   [INFO] org.apache.samza:hello-samza:jar:1.6.0
   [INFO] +- junit:junit:jar:4.12:compile
   [INFO] |  \- org.hamcrest:hamcrest-core:jar:1.3:compile
   [INFO] +- org.apache.samza:samza-api:jar:1.6.0:compile
   [INFO] |  +- org.apache.commons:commons-lang3:jar:3.4:compile
   [INFO] |  +- org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13:compile
   [INFO] |  +- com.google.code.gson:gson:jar:2.8.5:compile
   [INFO] |  \- io.dropwizard.metrics:metrics-core:jar:3.1.2:compile
   [INFO] +- org.apache.samza:samza-azure_2.11:jar:1.6.0:compile
   [INFO] |  +- com.azure:azure-storage-blob:jar:12.0.1:compile
   [INFO] |  |  +- com.azure:azure-core:jar:1.0.0:compile
   [INFO] |  |  |  +- com.fasterxml.jackson.datatype:jackson-datatype-jsr310:jar:2.10.0:compile
   [INFO] |  |  |  +- com.fasterxml.jackson.dataformat:jackson-dataformat-xml:jar:2.10.0:compile
   [INFO] |  |  |  |  +- com.fasterxml.jackson.module:jackson-module-jaxb-annotations:jar:2.10.0:compile
   [INFO] |  |  |  |  |  +- jakarta.xml.bind:jakarta.xml.bind-api:jar:2.3.2:compile
   [INFO] |  |  |  |  |  \- jakarta.activation:jakarta.activation-api:jar:1.2.1:compile
   [INFO] |  |  |  |  +- org.codehaus.woodstox:stax2-api:jar:4.2:compile
   [INFO] |  |  |  |  \- com.fasterxml.woodstox:woodstox-core:jar:6.0.1:compile
   [INFO] |  |  |  \- io.projectreactor:reactor-core:jar:3.3.0.RELEASE:compile
   [INFO] |  |  |     \- org.reactivestreams:reactive-streams:jar:1.0.3:compile
   [INFO] |  |  \- com.azure:azure-storage-common:jar:12.0.1:compile
   [INFO] |  |     \- com.azure:azure-core-http-netty:jar:1.0.0:compile
   [INFO] |  |        +- io.netty:netty-handler:jar:4.1.42.Final:compile
   [INFO] |  |        |  +- io.netty:netty-common:jar:4.1.42.Final:compile
   [INFO] |  |        |  +- io.netty:netty-transport:jar:4.1.42.Final:compile
   [INFO] |  |        |  |  \- io.netty:netty-resolver:jar:4.1.42.Final:compile
   [INFO] |  |        |  \- io.netty:netty-codec:jar:4.1.42.Final:compile
   [INFO] |  |        +- io.netty:netty-handler-proxy:jar:4.1.42.Final:compile
   [INFO] |  |        |  \- io.netty:netty-codec-socks:jar:4.1.42.Final:compile
   [INFO] |  |        +- io.netty:netty-buffer:jar:4.1.42.Final:compile
   [INFO] |  |        +- io.netty:netty-codec-http:jar:4.1.42.Final:compile
   [INFO] |  |        +- io.projectreactor.netty:reactor-netty:jar:0.9.0.RELEASE:compile
   [INFO] |  |        |  +- io.netty:netty-codec-http2:jar:4.1.39.Final:compile
   [INFO] |  |        |  +- io.netty:netty-transport-native-epoll:jar:linux-x86_64:4.1.39.Final:compile
   [INFO] |  |        |  |  \- io.netty:netty-transport-native-unix-common:jar:4.1.39.Final:compile
   [INFO] |  |        |  \- io.projectreactor.addons:reactor-pool:jar:0.1.0.RELEASE:compile
   [INFO] |  |        \- com.azure:azure-core-test:jar:1.0.0:compile
   [INFO] |  |           \- io.projectreactor:reactor-test:jar:3.3.0.RELEASE:compile
   [INFO] |  +- com.microsoft.azure:azure-storage:jar:5.3.1:compile
   [INFO] |  |  \- com.microsoft.azure:azure-keyvault-core:jar:0.8.0:compile
   [INFO] |  +- com.microsoft.azure:azure-eventhubs:jar:1.0.1:compile
   [INFO] |  |  \- org.apache.qpid:proton-j:jar:0.25.0:compile
   [INFO] |  +- com.fasterxml.jackson.core:jackson-core:jar:2.10.0:compile
   [INFO] |  \- org.apache.avro:avro:jar:1.7.7:compile
   [INFO] |     +- com.thoughtworks.paranamer:paranamer:jar:2.3:compile
   [INFO] |     \- org.xerial.snappy:snappy-java:jar:1.0.5:compile
   [INFO] +- org.apache.samza:samza-core_2.11:jar:1.6.0:compile
   [INFO] |  +- com.101tec:zkclient:jar:0.8:compile
   [INFO] |  +- net.sf.jopt-simple:jopt-simple:jar:5.0.4:compile
   [INFO] |  +- org.apache.commons:commons-collections4:jar:4.0:compile
   [INFO] |  +- commons-io:commons-io:jar:2.6:compile
   [INFO] |  +- org.eclipse.jetty:jetty-webapp:jar:9.4.20.v20190813:compile
   [INFO] |  |  +- org.eclipse.jetty:jetty-xml:jar:9.4.20.v20190813:compile
   [INFO] |  |  |  \- org.eclipse.jetty:jetty-util:jar:9.4.20.v20190813:compile
   [INFO] |  |  \- org.eclipse.jetty:jetty-servlet:jar:9.4.20.v20190813:compile
   [INFO] |  |     \- org.eclipse.jetty:jetty-security:jar:9.4.20.v20190813:compile
   [INFO] |  |        \- org.eclipse.jetty:jetty-server:jar:9.4.20.v20190813:compile
   [INFO] |  |           \- org.eclipse.jetty:jetty-io:jar:9.4.20.v20190813:compile
   [INFO] |  +- org.scala-lang:scala-library:jar:2.11.8:compile
   [INFO] |  +- net.jodah:failsafe:jar:1.1.0:compile
   [INFO] |  \- com.linkedin.cytodynamics:cytodynamics-nucleus:jar:0.2.0:compile
   [INFO] +- 

[GitHub] [samza-hello-samza] mynameborat merged pull request #83: remove comited merge conflict.

2021-04-01 Thread GitBox


mynameborat merged pull request #83:
URL: https://github.com/apache/samza-hello-samza/pull/83


   






[GitHub] [samza-hello-samza] li-afaris opened a new pull request #83: remove comited merge conflict.

2021-04-01 Thread GitBox


li-afaris opened a new pull request #83:
URL: https://github.com/apache/samza-hello-samza/pull/83


   A merge conflict was checked in by mistake.  This change removes the 
conflict & changes the hello-samza version to 1.6






[GitHub] [samza-beam-examples] xinyuiscool opened a new pull request #3: Update Beam to 2.27 and samza to 1.3

2021-02-08 Thread GitBox


xinyuiscool opened a new pull request #3:
URL: https://github.com/apache/samza-beam-examples/pull/3


   Update the Beam examples on the Samza runner to use the latest Beam version. 
   
   Note that we found problems when dealing with splittable ParDo, so we use 
the flag --experiments=use_deprecated_read to disable it for now.







[GitHub] [samza-hello-samza] mynameborat merged pull request #82: SAMZA-2586: Update samza-hello-samza#latest to use 1.6.0-SNAPSHOT

2020-08-27 Thread GitBox


mynameborat merged pull request #82:
URL: https://github.com/apache/samza-hello-samza/pull/82


   







[GitHub] [samza-hello-samza] kw2542 opened a new pull request #82: SAMZA-2586: Update samza-hello-samza#latest to use 1.6.0-SNAPSHOT

2020-08-27 Thread GitBox


kw2542 opened a new pull request #82:
URL: https://github.com/apache/samza-hello-samza/pull/82


   Issues: 1.5.0-SNAPSHOT is used in the latest branch, which does not work 
with the instructions at http://samza.apache.org/startup/hello-samza/latest/
   Changes: Update to use 1.6.0-SNAPSHOT
   Tests: Deployed a job following the instructions at 
http://samza.apache.org/startup/hello-samza/latest/







[GitHub] [samza-hello-samza] mpfeiffer00 commented on pull request #81: Local topic from wikipedia

2020-08-18 Thread GitBox


mpfeiffer00 commented on pull request #81:
URL: https://github.com/apache/samza-hello-samza/pull/81#issuecomment-675805177


   Oh boy, looks pretty cool. I'm so glad it works this time.







[GitHub] [samza-hello-samza] mpfeiffer00 closed pull request #81: Local topic from wikipedia

2020-08-18 Thread GitBox


mpfeiffer00 closed pull request #81:
URL: https://github.com/apache/samza-hello-samza/pull/81


   







[GitHub] [samza-hello-samza] mpfeiffer00 opened a new pull request #81: Local topic from wikipedia

2020-08-18 Thread GitBox


mpfeiffer00 opened a new pull request #81:
URL: https://github.com/apache/samza-hello-samza/pull/81


   Updated scripts and documentation to run a local Kafka topic streamed from 
Wikipedia.







[GitHub] [samza-hello-samza] mynameborat merged pull request #80: Merge latest branch into master and set version to 1.5.0

2020-07-30 Thread GitBox


mynameborat merged pull request #80:
URL: https://github.com/apache/samza-hello-samza/pull/80


   







[GitHub] [samza-hello-samza] mynameborat opened a new pull request #80: Merge latest branch into master and set version to 1.5.0

2020-07-29 Thread GitBox


mynameborat opened a new pull request #80:
URL: https://github.com/apache/samza-hello-samza/pull/80


   - Changes related to job runner configurations
   - Version bump to 1.5.0







[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader

2020-04-07 Thread GitBox
cameronlee314 commented on a change in pull request #79: Update doc and javadoc 
from config factory to config loader
URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r404985923
 
 

 ##
 File path: gradle/wrapper/gradle-wrapper.properties
 ##
 @@ -1,6 +1,6 @@
-#Mon Mar 23 14:55:28 PDT 2015
+#Fri Mar 27 16:28:33 PDT 2020
+distributionUrl=https\://services.gradle.org/distributions/gradle-2.3-all.zip
 
 Review comment:
   It's pretty minor. It's ok to leave it in this PR.




With regards,
Apache Git Services


[GitHub] [samza-hello-samza] kw2542 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader

2020-04-06 Thread GitBox
kw2542 commented on a change in pull request #79: Update doc and javadoc from 
config factory to config loader
URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r404502243
 
 

 ##
 File path: README.md
 ##
 @@ -61,7 +61,7 @@ Package 
[samza.examples.cookbook](https://github.com/apache/samza-hello-samza/tr
 Package 
[samza.examples.wikipedia.application](https://github.com/apache/samza-hello-samza/tree/master/src/main/java/samza/examples/wikipedia/application)
 contains a small Samza application which consumes the real-time feeds from 
Wikipedia, extracts the metadata of the events, and calculates statistics of 
all edits in a 10-second window. You can start the app on the grid using the 
run-app.sh script:
 
 ```
-./deploy/samza/bin/run-app.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file://$PWD/deploy/samza/config/wikipedia-application.properties
+./deploy/samza/bin/run-app.sh --config-path=$PWD/deploy/samza/config/wikipedia-application.properties
 
 Review comment:
   With `LocalApplicationRunner`, `--config-path` still points to the 
submission config; no full config is needed.
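
   To make the distinction concrete, here is a minimal Java sketch of 
separating a small set of submission configs from a full job config. The key 
list and the names `SubmissionConfigSketch` and `submissionSubset` are 
assumptions for illustration only. The keys are taken from examples in this 
thread, and this is not how Samza itself decides what counts as a submission 
config.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not part of Samza or hello-samza.
final class SubmissionConfigSketch {
    // Assumed key set, taken from examples in this thread. Real requirements may differ.
    static final List<String> SUBMISSION_KEYS =
        List.of("app.class", "job.name", "job.factory.class", "yarn.package.path");
    // Assumption: config loader settings are kept so the full config can be loaded later.
    static final String LOADER_PREFIX = "job.config.loader.";

    // Returns only the entries that this sketch treats as submission configs.
    static Map<String, String> submissionSubset(Map<String, String> fullConfig) {
        Map<String, String> subset = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : fullConfig.entrySet()) {
            String key = entry.getKey();
            if (SUBMISSION_KEYS.contains(key) || key.startsWith(LOADER_PREFIX)) {
                subset.put(key, entry.getValue());
            }
        }
        return subset;
    }

    public static void main(String[] args) {
        Map<String, String> full = new LinkedHashMap<>();
        full.put("app.class", "samza.examples.wikipedia.application.WikipediaApplication");
        full.put("job.name", "wikipedia-application");
        full.put("job.config.loader.factory",
            "org.apache.samza.config.loaders.PropertiesConfigLoaderFactory");
        // A non-submission entry that the sketch filters out.
        full.put("systems.azure-blob-container.samza.factory",
            "org.apache.samza.system.azureblob.AzureBlobSystemFactory");
        System.out.println(submissionSubset(full).keySet());
    }
}
```

   Keeping the subset small is the point raised in this review: the 
submission configs are passed along at submit time, while everything else can 
stay in the properties file.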




[GitHub] [samza-hello-samza] kw2542 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader

2020-04-06 Thread GitBox
kw2542 commented on a change in pull request #79: Update doc and javadoc from 
config factory to config loader
URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r404501952
 
 

 ##
 File path: gradle/wrapper/gradle-wrapper.properties
 ##
 @@ -1,6 +1,6 @@
-#Mon Mar 23 14:55:28 PDT 2015
+#Fri Mar 27 16:28:33 PDT 2020
+distributionUrl=https\://services.gradle.org/distributions/gradle-2.3-all.zip
 
 Review comment:
   It is auto-updated by the build tool. I think I can revert this to keep it 
separate.




[GitHub] [samza-hello-samza] kw2542 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader

2020-04-06 Thread GitBox
kw2542 commented on a change in pull request #79: Update doc and javadoc from 
config factory to config loader
URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r404501360
 
 

 ##
 File path: src/main/config/azure-blob-application.properties
 ##
 @@ -30,10 +30,14 @@ yarn.package.path=file://${basedir}/target/${project.artifactId}-${pom.version}-
  # StreamApplication class
  app.class=samza.examples.azure.AzureBlobApplication
  
 -#Azure blob essential configs
 +# Azure blob essential configs
  systems.azure-blob-container.samza.factory=org.apache.samza.system.azureblob.AzureBlobSystemFactory
  sensitive.systems.azure-blob-container.azureblob.account.name=your-azure-storage-account-name
  sensitive.systems.azure-blob-container.azureblob.account.key=your-azure-storage-account-key
  
 +# Config Loader
 +job.config.loader.factory=org.apache.samza.config.loaders.PropertiesConfigLoaderFactory
 +job.config.loader.properties.path=./__package/config/azure-blob-application.properties
 
 Review comment:
   Yes, `__package` is an internal Samza implementation detail: it is the 
directory where the localized job package is unzipped.
   
   It is challenging to do this programmatically, as clients may or may not 
include `__package` in the path, and different programs may put their configs 
in different locations.
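
   For readers unfamiliar with the layout, the following Java sketch shows how 
a relative path such as `./__package/config/azure-blob-application.properties` 
could resolve inside a container. It assumes the path is resolved against the 
container's working directory, which this thread does not confirm. The names 
`ConfigPathSketch` and `resolveLoaderPath` and the `/tmp/container_01` 
directory are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical illustration; not part of Samza or hello-samza.
final class ConfigPathSketch {
    static final String LOADER_PATH_KEY = "job.config.loader.properties.path";

    // Resolves the configured relative path against an assumed container working directory.
    static Path resolveLoaderPath(Properties submissionConfig, Path containerWorkDir) {
        String relative = submissionConfig.getProperty(LOADER_PATH_KEY);
        if (relative == null) {
            throw new IllegalArgumentException("Missing " + LOADER_PATH_KEY);
        }
        // normalize() drops the leading "./" segment from the configured path.
        return containerWorkDir.resolve(relative).normalize();
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(
            "job.config.loader.factory="
                + "org.apache.samza.config.loaders.PropertiesConfigLoaderFactory\n"
                + "job.config.loader.properties.path="
                + "./__package/config/azure-blob-application.properties\n"));
        // "/tmp/container_01" is a made-up working directory for this example.
        System.out.println(resolveLoaderPath(props, Paths.get("/tmp/container_01")));
    }
}
```

   If a job's package puts its configs elsewhere, only the relative path 
changes, which is part of why a single programmatic default is hard to pick.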




[GitHub] [samza-hello-samza] kw2542 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader

2020-04-06 Thread GitBox
kw2542 commented on a change in pull request #79: Update doc and javadoc from 
config factory to config loader
URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r404500446
 
 

 ##
 File path: src/main/config/azure-blob-application.properties
 ##
 @@ -30,10 +30,14 @@ yarn.package.path=file://${basedir}/target/${project.artifactId}-${pom.version}-
  # StreamApplication class
  app.class=samza.examples.azure.AzureBlobApplication
  
 -#Azure blob essential configs
 +# Azure blob essential configs
  systems.azure-blob-container.samza.factory=org.apache.samza.system.azureblob.AzureBlobSystemFactory
  sensitive.systems.azure-blob-container.azureblob.account.name=your-azure-storage-account-name
  sensitive.systems.azure-blob-container.azureblob.account.key=your-azure-storage-account-key
  
 +# Config Loader
 +job.config.loader.factory=org.apache.samza.config.loaders.PropertiesConfigLoaderFactory
 +job.config.loader.properties.path=./__package/config/azure-blob-application.properties
 
 Review comment:
   Yes, the `samza-hello-samza` build tarball already includes the config 
directory.
   
   Agreed, the migration guide for 1.5 needs detailed migration instructions 
on this.




[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader

2020-04-06 Thread GitBox
cameronlee314 commented on a change in pull request #79: Update doc and javadoc 
from config factory to config loader
URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r404293260
 
 

 ##
 File path: README.md
 ##
 @@ -61,7 +61,7 @@ Package 
[samza.examples.cookbook](https://github.com/apache/samza-hello-samza/tr
 Package 
[samza.examples.wikipedia.application](https://github.com/apache/samza-hello-samza/tree/master/src/main/java/samza/examples/wikipedia/application)
 contains a small Samza application which consumes the real-time feeds from 
Wikipedia, extracts the metadata of the events, and calculates statistics of 
all edits in a 10-second window. You can start the app on the grid using the 
run-app.sh script:
 
 ```
-./deploy/samza/bin/run-app.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file://$PWD/deploy/samza/config/wikipedia-application.properties
+./deploy/samza/bin/run-app.sh --config-path=$PWD/deploy/samza/config/wikipedia-application.properties
 
 Review comment:
   This `--config-path` here is for submission configs only, right? Can you 
think of a way to help clarify that? It looks like the example is suggesting to 
pass full job configs as submission configs. However, that could end up being a 
problem if the full job configs are too large (IIRC, there is a limit for the 
size of the submission configs env variable when passing to YARN).




[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader

2020-04-06 Thread GitBox
cameronlee314 commented on a change in pull request #79: Update doc and javadoc 
from config factory to config loader
URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r404303771
 
 

 ##
 File path: src/main/config/azure-blob-application.properties
 ##
 @@ -30,10 +30,14 @@ yarn.package.path=file://${basedir}/target/${project.artifactId}-${pom.version}-
  # StreamApplication class
  app.class=samza.examples.azure.AzureBlobApplication
  
 -#Azure blob essential configs
 +# Azure blob essential configs
  systems.azure-blob-container.samza.factory=org.apache.samza.system.azureblob.AzureBlobSystemFactory
  sensitive.systems.azure-blob-container.azureblob.account.name=your-azure-storage-account-name
  sensitive.systems.azure-blob-container.azureblob.account.key=your-azure-storage-account-key
  
 +# Config Loader
 +job.config.loader.factory=org.apache.samza.config.loaders.PropertiesConfigLoaderFactory
 +job.config.loader.properties.path=./__package/config/azure-blob-application.properties
 
 Review comment:
   If I understand correctly, in the past, it was not necessary to have this 
config file in the application package on the YARN containers. Was the 
`samza-hello-samza` build already set up to include the properties file into 
the application package?
   This may also be something you would need to call out in the migration 
documentation.




[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader

2020-04-06 Thread GitBox
cameronlee314 commented on a change in pull request #79: Update doc and javadoc 
from config factory to config loader
URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r404294708
 
 

 ##
 File path: gradle/wrapper/gradle-wrapper.properties
 ##
 @@ -1,6 +1,6 @@
-#Mon Mar 23 14:55:28 PDT 2015
+#Fri Mar 27 16:28:33 PDT 2020
+distributionUrl=https\://services.gradle.org/distributions/gradle-2.3-all.zip
 
 Review comment:
   This change is intentional for this PR, right? If so, it's ok to keep it; 
just double checking since it isn't quite related.




[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader

2020-04-06 Thread GitBox
cameronlee314 commented on a change in pull request #79: Update doc and javadoc 
from config factory to config loader
URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r404307632
 
 

 ##
 File path: README.md
 ##
 @@ -61,7 +61,7 @@ Package 
[samza.examples.cookbook](https://github.com/apache/samza-hello-samza/tr
 Package 
[samza.examples.wikipedia.application](https://github.com/apache/samza-hello-samza/tree/master/src/main/java/samza/examples/wikipedia/application)
 contains a small Samza application which consumes the real-time feeds from 
Wikipedia, extracts the metadata of the events, and calculates statistics of 
all edits in a 10-second window. You can start the app on the grid using the 
run-app.sh script:
 
 ```
-./deploy/samza/bin/run-app.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file://$PWD/deploy/samza/config/wikipedia-application.properties
+./deploy/samza/bin/run-app.sh --config-path=$PWD/deploy/samza/config/wikipedia-application.properties
 
 Review comment:
   On the other hand, if someone switched this to use `LocalApplicationRunner`, 
then would the `--config-path` need to be the full configs? Would this 
overloading of the `--config-path` argument be confusing?




[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader

2020-04-06 Thread GitBox
cameronlee314 commented on a change in pull request #79: Update doc and javadoc 
from config factory to config loader
URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r404301906
 
 

 ##
 File path: src/main/config/azure-blob-application.properties
 ##
 @@ -30,10 +30,14 @@ yarn.package.path=file://${basedir}/target/${project.artifactId}-${pom.version}-
  # StreamApplication class
  app.class=samza.examples.azure.AzureBlobApplication
  
 -#Azure blob essential configs
 +# Azure blob essential configs
  systems.azure-blob-container.samza.factory=org.apache.samza.system.azureblob.AzureBlobSystemFactory
  sensitive.systems.azure-blob-container.azureblob.account.name=your-azure-storage-account-name
  sensitive.systems.azure-blob-container.azureblob.account.key=your-azure-storage-account-key
  
 +# Config Loader
 +job.config.loader.factory=org.apache.samza.config.loaders.PropertiesConfigLoaderFactory
 +job.config.loader.properties.path=./__package/config/azure-blob-application.properties
 
 Review comment:
   Do you think it might be confusing to users what `__package` refers to? That 
is kind of a YARN implementation detail. Maybe documentation can help clarify, 
but if you can find a programmatic way to hide that implementation detail, that 
would be nice too.




[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader

2020-03-30 Thread GitBox
cameronlee314 commented on a change in pull request #79: Update doc and javadoc 
from config factory to config loader
URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r400365471
 
 

 ##
 File path: README.md
 ##
 @@ -61,13 +61,19 @@ Package 
[samza.examples.cookbook](https://github.com/apache/samza-hello-samza/tr
 Package 
[samza.examples.wikipedia.application](https://github.com/apache/samza-hello-samza/tree/master/src/main/java/samza/examples/wikipedia/application)
 contains a small Samza application which consumes the real-time feeds from 
Wikipedia, extracts the metadata of the events, and calculates statistics of 
all edits in a 10-second window. You can start the app on the grid using the 
run-app.sh script:
 
 ```
-./deploy/samza/bin/run-app.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file://$PWD/deploy/samza/config/wikipedia-application.properties
+./deploy/samza/bin/run-app.sh \
+  --config app.class=samza.examples.wikipedia.application.WikipediaApplication \
+  --config yarn.package.path=file:///Users/kwu/workspace/hello-samza/target/hello-samza-1.5.0-SNAPSHOT-dist.tar.gz \
+  --config job.name=wikipedia-application \
+  --config job.factory.class=org.apache.samza.job.yarn.YarnJobFactory \
 
 Review comment:
   Do you have a clean way to describe how to specify submission configs in 
general? Based on your comments, it seems like standalone 
(`LocalApplicationRunner`) specifies a different set of submission configs than 
YARN (`RemoteApplicationRunner`), even though some of those configs are general 
to Samza (e.g. `app.class`).
   It would be good to have as few runner-specific steps as possible. 
`ApplicationRunner` is the interface, so it would be nice to not have to worry 
about the specific `ApplicationRunner` being used when trying to start the app. 
I admit that Samza does already do some environment-specific configs (e.g. 
YARN-specific configs are needed when using `YarnJobFactory`), but we should 
generally minimize that.
   I'm not sure if this works, but could we recommend standalone to pass the 
larger set of submission configs (similar to YARN) also? Then there would be 
more consistency. It would be easier to describe what submission configs are 
and how to specify them in general. 


[GitHub] [samza-hello-samza] kw2542 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader

2020-03-27 Thread GitBox
kw2542 commented on a change in pull request #79: Update doc and javadoc from 
config factory to config loader
URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r399599247
 
 

 ##
 File path: README.md
 ##
 @@ -61,13 +61,19 @@ Package 
[samza.examples.cookbook](https://github.com/apache/samza-hello-samza/tr
 Package 
[samza.examples.wikipedia.application](https://github.com/apache/samza-hello-samza/tree/master/src/main/java/samza/examples/wikipedia/application)
 contains a small Samza application which consumes the real-time feeds from 
Wikipedia, extracts the metadata of the events, and calculates statistics of 
all edits in a 10-second window. You can start the app on the grid using the 
run-app.sh script:
 
 ```
-./deploy/samza/bin/run-app.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file://$PWD/deploy/samza/config/wikipedia-application.properties
+./deploy/samza/bin/run-app.sh \
+  --config app.class=samza.examples.wikipedia.application.WikipediaApplication \
+  --config yarn.package.path=file:///Users/kwu/workspace/hello-samza/target/hello-samza-1.5.0-SNAPSHOT-dist.tar.gz \
+  --config job.name=wikipedia-application \
+  --config job.factory.class=org.apache.samza.job.yarn.YarnJobFactory \
+  --config job.config.loader.factory=org.apache.samza.config.loaders.PropertiesConfigLoaderFactory \
+  --config job.config.loader.properties.path=$PWD/deploy/samza/config/wikipedia-application.properties
 
 Review comment:
   For standalone applications we do not need to, but for YARN ones, yes.


[GitHub] [samza-hello-samza] kw2542 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader

2020-03-27 Thread GitBox
kw2542 commented on a change in pull request #79: Update doc and javadoc from 
config factory to config loader
URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r399599247
 
 

 ##
 File path: README.md
 ##
 @@ -61,13 +61,19 @@ Package 
[samza.examples.cookbook](https://github.com/apache/samza-hello-samza/tr
 Package 
[samza.examples.wikipedia.application](https://github.com/apache/samza-hello-samza/tree/master/src/main/java/samza/examples/wikipedia/application)
 contains a small Samza application which consumes the real-time feeds from 
Wikipedia, extracts the metadata of the events, and calculates statistics of 
all edits in a 10-second window. You can start the app on the grid using the 
run-app.sh script:
 
 ```
-./deploy/samza/bin/run-app.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file://$PWD/deploy/samza/config/wikipedia-application.properties
+./deploy/samza/bin/run-app.sh \
+  --config app.class=samza.examples.wikipedia.application.WikipediaApplication \
+  --config yarn.package.path=file:///Users/kwu/workspace/hello-samza/target/hello-samza-1.5.0-SNAPSHOT-dist.tar.gz \
+  --config job.name=wikipedia-application \
+  --config job.factory.class=org.apache.samza.job.yarn.YarnJobFactory \
+  --config job.config.loader.factory=org.apache.samza.config.loaders.PropertiesConfigLoaderFactory \
+  --config job.config.loader.properties.path=$PWD/deploy/samza/config/wikipedia-application.properties
 
 Review comment:
   Yes.


[GitHub] [samza-hello-samza] kw2542 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader

2020-03-27 Thread GitBox
kw2542 commented on a change in pull request #79: Update doc and javadoc from 
config factory to config loader
URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r399599309
 
 

 ##
 File path: README.md
 ##
 @@ -61,13 +61,19 @@ Package 
[samza.examples.cookbook](https://github.com/apache/samza-hello-samza/tr
 Package 
[samza.examples.wikipedia.application](https://github.com/apache/samza-hello-samza/tree/master/src/main/java/samza/examples/wikipedia/application)
 contains a small Samza application which consumes the real-time feeds from 
Wikipedia, extracts the metadata of the events, and calculates statistics of 
all edits in a 10-second window. You can start the app on the grid using the 
run-app.sh script:
 
 ```
-./deploy/samza/bin/run-app.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file://$PWD/deploy/samza/config/wikipedia-application.properties
+./deploy/samza/bin/run-app.sh \
+  --config app.class=samza.examples.wikipedia.application.WikipediaApplication \
+  --config yarn.package.path=file:///Users/kwu/workspace/hello-samza/target/hello-samza-1.5.0-SNAPSHOT-dist.tar.gz \
+  --config job.name=wikipedia-application \
+  --config job.factory.class=org.apache.samza.job.yarn.YarnJobFactory \
 
 Review comment:
   Properties files cannot be simplified, because they may also be used for 
standalone deployment, where we do not need to supply the submission configs.


[GitHub] [samza-hello-samza] kw2542 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader

2020-03-27 Thread GitBox
kw2542 commented on a change in pull request #79: Update doc and javadoc from 
config factory to config loader
URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r399576572
 
 

 ##
 File path: README.md
 ##
 @@ -61,13 +61,19 @@ Package 
[samza.examples.cookbook](https://github.com/apache/samza-hello-samza/tr
 Package 
[samza.examples.wikipedia.application](https://github.com/apache/samza-hello-samza/tree/master/src/main/java/samza/examples/wikipedia/application)
 contains a small Samza application which consumes the real-time feeds from 
Wikipedia, extracts the metadata of the events, and calculates statistics of 
all edits in a 10-second window. You can start the app on the grid using the 
run-app.sh script:
 
 ```
-./deploy/samza/bin/run-app.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file://$PWD/deploy/samza/config/wikipedia-application.properties
+./deploy/samza/bin/run-app.sh \
+  --config app.class=samza.examples.wikipedia.application.WikipediaApplication \
+  --config yarn.package.path=file:///Users/kwu/workspace/hello-samza/target/hello-samza-1.5.0-SNAPSHOT-dist.tar.gz \
+  --config job.name=wikipedia-application \
+  --config job.factory.class=org.apache.samza.job.yarn.YarnJobFactory \
 
 Review comment:
   Submission-related configs, such as job.name, app.class, job.factory.class 
and yarn.package.path, need to be passed in explicitly during submission, since 
we no longer read config files during submission.
   
   I am also cleaning up the properties file at the moment.
   
   I believe the previous PR missed the instruction that all submission-related 
configs need to be provided explicitly.
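
A minimal sketch of what passing the submission configs explicitly could look like, assuming the `--config key=value` form shown in the README diff above. `submission_args` is a hypothetical helper (not part of Samza or hello-samza), and the `yarn.package.path` value is a placeholder for wherever the dist tarball lives locally. The script only prints the command (a dry run) rather than submitting a job.

```shell
# Sketch only: submission_args is a hypothetical helper, not part of Samza.
# It turns each key=value argument into a " --config key=value" flag.
submission_args() {
  for kv in "$@"; do
    printf ' --config %s' "$kv"
  done
}

# Submission configs named in this thread; the package path is a placeholder.
args=$(submission_args \
  "app.class=samza.examples.wikipedia.application.WikipediaApplication" \
  "job.name=wikipedia-application" \
  "job.factory.class=org.apache.samza.job.yarn.YarnJobFactory" \
  "yarn.package.path=file://$PWD/target/hello-samza-1.5.0-SNAPSHOT-dist.tar.gz")

# Dry run: print the command instead of executing it.
echo "./deploy/samza/bin/run-app.sh$args"
```

Values containing spaces would need extra quoting, which this sketch does not attempt.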


[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader

2020-03-27 Thread GitBox
cameronlee314 commented on a change in pull request #79: Update doc and javadoc 
from config factory to config loader
URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r399573818
 
 

 ##
 File path: README.md
 ##
 @@ -61,13 +61,19 @@ Package 
[samza.examples.cookbook](https://github.com/apache/samza-hello-samza/tr
 Package 
[samza.examples.wikipedia.application](https://github.com/apache/samza-hello-samza/tree/master/src/main/java/samza/examples/wikipedia/application)
 contains a small Samza application which consumes the real-time feeds from 
Wikipedia, extracts the metadata of the events, and calculates statistics of 
all edits in a 10-second window. You can start the app on the grid using the 
run-app.sh script:
 
 ```
-./deploy/samza/bin/run-app.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file://$PWD/deploy/samza/config/wikipedia-application.properties
+./deploy/samza/bin/run-app.sh \
+  --config app.class=samza.examples.wikipedia.application.WikipediaApplication \
+  --config yarn.package.path=file:///Users/kwu/workspace/hello-samza/target/hello-samza-1.5.0-SNAPSHOT-dist.tar.gz \
+  --config job.name=wikipedia-application \
+  --config job.factory.class=org.apache.samza.job.yarn.YarnJobFactory \
+  --config job.config.loader.factory=org.apache.samza.config.loaders.PropertiesConfigLoaderFactory \
+  --config job.config.loader.properties.path=$PWD/deploy/samza/config/wikipedia-application.properties
 
 Review comment:
   There are several other places which did not have these additional configs 
(e.g. `run-event-hubs-zk-application.sh`, `CouchbaseTableExample.java`). Do 
those places need to be updated also?


[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #79: Update doc and javadoc from config factory to config loader

2020-03-27 Thread GitBox
cameronlee314 commented on a change in pull request #79: Update doc and javadoc 
from config factory to config loader
URL: https://github.com/apache/samza-hello-samza/pull/79#discussion_r399574730
 
 

 ##
 File path: README.md
 ##
 @@ -61,13 +61,19 @@ Package 
[samza.examples.cookbook](https://github.com/apache/samza-hello-samza/tr
 Package 
[samza.examples.wikipedia.application](https://github.com/apache/samza-hello-samza/tree/master/src/main/java/samza/examples/wikipedia/application)
 contains a small Samza application which consumes the real-time feeds from 
Wikipedia, extracts the metadata of the events, and calculates statistics of 
all edits in a 10-second window. You can start the app on the grid using the 
run-app.sh script:
 
 ```
-./deploy/samza/bin/run-app.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file://$PWD/deploy/samza/config/wikipedia-application.properties
+./deploy/samza/bin/run-app.sh \
+  --config app.class=samza.examples.wikipedia.application.WikipediaApplication \
+  --config yarn.package.path=file:///Users/kwu/workspace/hello-samza/target/hello-samza-1.5.0-SNAPSHOT-dist.tar.gz \
+  --config job.name=wikipedia-application \
+  --config job.factory.class=org.apache.samza.job.yarn.YarnJobFactory \
 
 Review comment:
   These all seem to be copied from the properties file. It seems like it would 
be non-trivial to keep the properties file and this list of configs consistent, 
since they are in different places.
   https://github.com/apache/samza/pull/1256 indicates that you only need the 
config loader and properties path, but here you are adding several other 
configs.


[GitHub] [samza-hello-samza] kw2542 opened a new pull request #79: Update doc and javadoc from config factory to config loader

2020-03-26 Thread GitBox
kw2542 opened a new pull request #79: Update doc and javadoc from config 
factory to config loader
URL: https://github.com/apache/samza-hello-samza/pull/79
 
 
   Update doc and javadoc from config factory to config loader


[GitHub] [samza-hello-samza] cameronlee314 merged pull request #78: [Minor] updating latest branch to use 1.5.0-SNAPSHOT for samza version

2020-03-19 Thread GitBox
cameronlee314 merged pull request #78: [Minor] updating latest branch to use 
1.5.0-SNAPSHOT for samza version
URL: https://github.com/apache/samza-hello-samza/pull/78
 
 
   


[GitHub] [samza-hello-samza] cameronlee314 merged pull request #77: Merge latest branch into master and set version to 1.4.0

2020-03-19 Thread GitBox
cameronlee314 merged pull request #77: Merge latest branch into master and set 
version to 1.4.0
URL: https://github.com/apache/samza-hello-samza/pull/77
 
 
   


[GitHub] [samza-hello-samza] mynameborat commented on a change in pull request #77: Merge latest branch into master and set version to 1.4.0

2020-03-19 Thread GitBox
mynameborat commented on a change in pull request #77: Merge latest branch into 
master and set version to 1.4.0
URL: https://github.com/apache/samza-hello-samza/pull/77#discussion_r395190224
 
 

 ##
 File path: bin/deploy.sh
 ##
 @@ -23,4 +23,4 @@ base_dir=`pwd`
 
 mvn clean package
 mkdir -p $base_dir/deploy/samza
-tar -xvf $base_dir/target/hello-samza-1.2.0-dist.tar.gz -C $base_dir/deploy/samza
+tar -xvf $base_dir/target/hello-samza-1.1.0-dist.tar.gz -C $base_dir/deploy/samza
 
 Review comment:
   should this be 1.4.0?
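
One way to avoid this kind of version drift, as a sketch only: locate the dist tarball by pattern instead of hardcoding the version. `find_dist_tarball` is a hypothetical helper, not something in `bin/deploy.sh` today, and it assumes exactly one `hello-samza-*-dist.tar.gz` exists under `target/` after `mvn clean package`.

```shell
# Sketch only: find_dist_tarball is a hypothetical helper.
# It prints the path of the single dist tarball in the given directory,
# and fails if there is none or more than one.
find_dist_tarball() {
  target_dir=$1
  # Expand the glob into this function's positional parameters.
  set -- "$target_dir"/hello-samza-*-dist.tar.gz
  if [ "$#" -ne 1 ] || [ ! -f "$1" ]; then
    echo "expected exactly one dist tarball in $target_dir" >&2
    return 1
  fi
  printf '%s\n' "$1"
}

# Possible use in bin/deploy.sh (an assumption, not the current script):
#   tar -xvf "$(find_dist_tarball "$base_dir/target")" -C "$base_dir/deploy/samza"
```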


[GitHub] [samza-hello-samza] cameronlee314 opened a new pull request #78: [Minor] updating latest branch to use 1.5.0-SNAPSHOT for samza version

2020-03-18 Thread GitBox
cameronlee314 opened a new pull request #78: [Minor] updating latest branch to 
use 1.5.0-SNAPSHOT for samza version
URL: https://github.com/apache/samza-hello-samza/pull/78
 
 
   


[GitHub] [samza-hello-samza] cameronlee314 opened a new pull request #77: Merge latest branch into master and set version to 1.4.0

2020-03-18 Thread GitBox
cameronlee314 opened a new pull request #77: Merge latest branch into master 
and set version to 1.4.0
URL: https://github.com/apache/samza-hello-samza/pull/77
 
 
   The Samza 1.4.0 release was just completed, so updating master to be 
up-to-date with latest. Most of this diff is by doing a `git merge latest`, but 
there were a few minor inconsistencies that I also cleaned up. I also removed 
the `-SNAPSHOT` part from the Samza version.


[GitHub] [samza-hello-samza] cameronlee314 commented on issue #76: Sync latest branch with master

2020-03-18 Thread GitBox
cameronlee314 commented on issue #76: Sync latest branch with master
URL: https://github.com/apache/samza-hello-samza/pull/76#issuecomment-600907533
 
 
   > Thanks for the changes. Was this a one off miss on our end or is there a 
lack of explicit guideline on sync between latest & master and when to cherry 
pick commits?
   
   I think there is a lack of explicit guidelines on how we want to manage 
`master` and `latest`.


[GitHub] [samza-hello-samza] cameronlee314 merged pull request #76: Sync latest branch with master

2020-03-18 Thread GitBox
cameronlee314 merged pull request #76: Sync latest branch with master
URL: https://github.com/apache/samza-hello-samza/pull/76
 
 
   


[GitHub] [samza-hello-samza] cameronlee314 merged pull request #74: [minor] remove unused files that seem to be left over from an improper merge/cleanup

2020-03-18 Thread GitBox
cameronlee314 merged pull request #74: [minor] remove unused files that seem to 
be left over from an improper merge/cleanup
URL: https://github.com/apache/samza-hello-samza/pull/74
 
 
   


[GitHub] [samza-hello-samza] cameronlee314 opened a new pull request #76: Sync latest branch with master

2020-03-18 Thread GitBox
cameronlee314 opened a new pull request #76: Sync latest branch with master
URL: https://github.com/apache/samza-hello-samza/pull/76
 
 
   When updating samza-hello-samza to use Samza 1.4, I noticed that the 
`latest` branch and the `master` branch were out-of-sync. I cherry-picked some 
commits that were only checked in to `master`. I also made some other minor 
changes which were on `master` but not on `latest`.
   
   Cherry-picks:
   f4cd658b751bac46cbf42b2c612c68ea23de2d47 (Adding hello-samza example for 
kinesis)
   674d842e4b75d9003eaff9722596bb4c61db9fe2 (Adding Samza SQL Examples)
   1c16cf03897b02178b695bb5c01ff90a9ca16406 (fix sql's join and aggregate notes 
typo)
   67c989219353c7dfa60584ca0bc5a31ba302ab28 (Fixed rat issues on master)
   
   Other minor changes:
   kafka-console-consumer command in `README.md`
   add config to `conf/yarn-site.xml`
   


[GitHub] [samza-hello-samza] cameronlee314 merged pull request #75: add license for PageViewAvroRecord

2020-03-18 Thread GitBox
cameronlee314 merged pull request #75: add license for PageViewAvroRecord
URL: https://github.com/apache/samza-hello-samza/pull/75
 
 
   


[GitHub] [samza-hello-samza] lakshmi-manasa-g opened a new pull request #75: add license for PageViewAvroRecord

2020-03-17 Thread GitBox
lakshmi-manasa-g opened a new pull request #75: add license for 
PageViewAvroRecord
URL: https://github.com/apache/samza-hello-samza/pull/75
 
 
   


[GitHub] [samza-hello-samza] cameronlee314 opened a new pull request #74: [minor] remove unused files that seem to be left over from an improper merge/cleanup

2020-03-17 Thread GitBox
cameronlee314 opened a new pull request #74: [minor] remove unused files that 
seem to be left over from an improper merge/cleanup
URL: https://github.com/apache/samza-hello-samza/pull/74
 
 
   I was trying to sync `master` with `latest`, and these files were on 
`master` but not on `latest`. It looks like `AdClick` and `PageView` had gotten 
moved to a different package, but the old versions never got deleted. 
`AvroSerDeFactory` was committed with 
https://github.com/apache/samza-hello-samza/pull/41, but the other parts of 
that PR no longer exist (seems like it is replaced by 
https://github.com/apache/samza-hello-samza/pull/46).


[GitHub] [samza-hello-samza] cameronlee314 merged pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage

2020-03-16 Thread GitBox
cameronlee314 merged pull request #71: SAMZA-2437: Sample for producing to 
Azure Blob Storage
URL: https://github.com/apache/samza-hello-samza/pull/71
 
 
   


[GitHub] [samza-hello-samza] lakshmi-manasa-g commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage

2020-02-14 Thread GitBox
lakshmi-manasa-g commented on a change in pull request #71: SAMZA-2437: Sample 
for producing to Azure Blob Storage
URL: https://github.com/apache/samza-hello-samza/pull/71#discussion_r379603189
 
 

 ##
 File path: pom.xml
 ##
 @@ -206,6 +206,26 @@ under the License.
   guava
   23.0
 
+
 
 Review comment:
   Updating jackson-core to 2.10.0 in samza-azure 
(https://github.com/apache/samza/pull/1277).
   After pulling in recent commits from the `latest` branch and updating 
jackson-core in samza-azure, these dependencies are no longer needed here.


[GitHub] [samza-hello-samza] JLLeitschuh opened a new pull request #73: [SECURITY] Use HTTPS to resolve dependencies in Maven Build

2020-02-10 Thread GitBox
JLLeitschuh opened a new pull request #73: [SECURITY] Use HTTPS to resolve 
dependencies in Maven Build
URL: https://github.com/apache/samza-hello-samza/pull/73
 
 
   
[![mitm_build](https://user-images.githubusercontent.com/1323708/59226671-90645200-8ba1-11e9-8ab3-39292bef99e9.jpeg)](https://medium.com/@jonathan.leitschuh/want-to-take-over-the-java-ecosystem-all-you-need-is-a-mitm-1fc329d898fb?source=friends_link=3c99970c55a899ad9ef41f126efcde0e)
   
   - [Want to take over the Java ecosystem? All you need is a 
MITM!](https://medium.com/@jonathan.leitschuh/want-to-take-over-the-java-ecosystem-all-you-need-is-a-mitm-1fc329d898fb?source=friends_link=3c99970c55a899ad9ef41f126efcde0e)
   - [Update: Want to take over the Java ecosystem? All you need is a 
MITM!](https://medium.com/bugbountywriteup/update-want-to-take-over-the-java-ecosystem-all-you-need-is-a-mitm-d069d253fe23?source=friends_link=8c8e52a7d57b98d0b7e541665688b454)
   
   ---
   
   This is a security fix for a vulnerability in your [Apache 
Maven](https://maven.apache.org/) `pom.xml` file(s).
   
   The build files indicate that this project is resolving dependencies over 
HTTP instead of HTTPS.
   This leaves your build vulnerable to [Man in the 
Middle](https://en.wikipedia.org/wiki/Man-in-the-middle_attack) (MITM) 
attackers executing arbitrary code on your computer or CI/CD system.
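
A rough way to check a POM for this by hand, as a sketch: search for `<url>` elements that use plain `http://`. `insecure_repo_urls` is a hypothetical helper, not part of Maven or of this PR's tooling. It flags any matching `<url>` element, which may include non-repository URLs such as a project homepage, so the output still needs a manual look.

```shell
# Sketch only: insecure_repo_urls is a hypothetical helper.
# It prints (with line numbers) each line of the given POM whose <url>
# element starts with plain http://, and returns non-zero if none match.
insecure_repo_urls() {
  grep -n '<url>[[:space:]]*http://' "$1"
}

# Possible use (assumes a pom.xml in the current directory):
#   insecure_repo_urls pom.xml || echo "no plain-HTTP <url> entries found"
```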
   
   This vulnerability has a CVSS v3.0 Base Score of 
[8.1/10](https://nvd.nist.gov/vuln-metrics/cvss/v3-calculator?vector=AV:N/AC:H/PR:N/UI:N/S:U/C:H/I:H/A:H).
   
   [POC 
code](https://max.computer/blog/how-to-take-over-the-computer-of-any-java-or-clojure-or-scala-developer/)
 has existed since 2014 to maliciously compromise a JAR file in-flight.
   MITM attacks against HTTP are [increasingly 
common](https://security.stackexchange.com/a/12050), for example [Comcast is 
known to have done it to their own 
users](https://thenextweb.com/insights/2017/12/11/comcast-continues-to-inject-its-own-code-into-websites-you-visit/#).
   
   This contribution is a part of a submission to the [GitHub Security 
Lab](https://securitylab.github.com/) Bug Bounty program.
   
   ## Detecting this and Future Vulnerabilities
   
   This vulnerability was automatically detected by 
[LGTM.com](https://lgtm.com) using this [CodeQL 
Query](https://lgtm.com/rules/155648721/).
   
   As of September 2019 LGTM.com and Semmle are [officially a part of 
GitHub](https://github.blog/2019-09-18-github-welcomes-semmle/).
   
   You can automatically detect future vulnerabilities like this by enabling 
the free (for open-source) [LGTM App](https://github.com/marketplace/lgtm).
   
   I'm not an employee of GitHub or of Semmle; I'm simply a user of 
[LGTM.com](https://lgtm.com) and an open-source security researcher.
   
   ## Source
   
   Yes, this contribution was automatically generated; however, the code to 
generate this PR was lovingly handcrafted to bring this security fix to your 
repository.
   
   The source code that generated and submitted this PR can be found here:
   
[JLLeitschuh/bulk-security-pr-generator](https://github.com/JLLeitschuh/bulk-security-pr-generator)
   
   ## Opting-Out
   
   If you'd like to opt-out of future automated security vulnerability fixes 
like this, please consider adding a file called
   `.github/GH-ROBOTS.txt` to your repository with the line:
   
   ```
   User-agent: JLLeitschuh/bulk-security-pr-generator
   Disallow: *
   ```
   
   This bot will respect the [ROBOTS.txt](https://moz.com/learn/seo/robotstxt) 
format for future contributions.
   
   Alternatively, if this project is no longer actively maintained, consider 
[archiving](https://help.github.com/en/github/creating-cloning-and-archiving-repositories/about-archiving-repositories)
 the repository.
   
   ## CLA Requirements
   
   _This section is only relevant if your project requires contributors to sign 
a Contributor License Agreement (CLA) for external contributions._
   
   It is unlikely that I'll be able to directly sign CLAs. However, all 
contributed commits are already automatically signed-off.
   
   > The meaning of a signoff depends on the project, but it typically 
certifies that committer has the rights to submit this work under the same 
license and agrees to a Developer Certificate of Origin 
   > (see 
[https://developercertificate.org/](https://developercertificate.org/) for more 
information).
   >
   > \- [Git Commit Signoff documentation](https://developercertificate.org/)
   
   If signing your organization's CLA is a strict requirement for merging this 
contribution, please feel free to close this PR.
   
   ## Tracking
   
   All PRs generated as part of this fix are tracked here: 
   https://github.com/JLLeitschuh/bulk-security-pr-generator/issues/2



[GitHub] [samza-hello-samza] cameronlee314 opened a new pull request #72: SAMZA-2449: Create an example job in samza-hello-samza for job coordinator split deployment

2020-01-30 Thread GitBox
cameronlee314 opened a new pull request #72: SAMZA-2449: Create an example job 
in samza-hello-samza for job coordinator split deployment
URL: https://github.com/apache/samza-hello-samza/pull/72
 
 
   Feature: Adding an example for how to set up job coordinator dependency 
isolation in samza-hello-samza
   
   Changes:
   1. Updated build.gradle to include tasks to build framework artifacts.
   2. Added a new config for a job to be deployed in a job coordinator 
dependency isolation mode.
   
   Usage instructions:
   1. Normally, the samza-hello-samza artifact would be built by running 
`./gradlew distTar`. For dependency isolation, run `./gradlew distTar 
frameworkApiDistTar frameworkInfrastructureDistTar`, which will build the 
framework API and infrastructure packages.
   2. The `wikipedia-application-with-framework.properties` config is set up to 
run in dependency isolation mode. Use that as the argument to the 
`--config-path` option when running with `run-app.sh`. 


[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage

2020-01-28 Thread GitBox
cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for 
producing to Azure Blob Storage
URL: https://github.com/apache/samza-hello-samza/pull/71#discussion_r371968280
 
 

 ##
 File path: pom.xml
 ##
 @@ -206,6 +206,26 @@ under the License.
   guava
   23.0
 
+
 
 Review comment:
   Does that mean that the azure library needs Jackson, but it does not pull it 
in transitively? That seems odd. Or maybe check if some other dependency is 
using a version of Jackson that is incompatible with the azure client.
   In any case, can you please add a comment about why this is necessary? It 
might be something you need to note in a user guide, if there is one.




[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage

2020-01-28 Thread GitBox
cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for 
producing to Azure Blob Storage
URL: https://github.com/apache/samza-hello-samza/pull/71#discussion_r371968359
 
 

 ##
 File path: pom.xml
 ##
 @@ -206,6 +206,26 @@ under the License.
   guava
   23.0
 
+
+
+  com.fasterxml.jackson.core
+  jackson-core
+  2.10.0
+
+
 
 Review comment:
   minor: whitespace




[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage

2020-01-28 Thread GitBox
cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for 
producing to Azure Blob Storage
URL: https://github.com/apache/samza-hello-samza/pull/71#discussion_r371968434
 
 

 ##
 File path: pom.xml
 ##
 @@ -206,6 +206,26 @@ under the License.
   guava
   23.0
 
+
+
+  com.fasterxml.jackson.core
+  jackson-core
+  2.10.0
+
+
+
+  com.fasterxml.jackson.core
+  jackson-databind
+  2.10.0
+
+
+
+
 
 Review comment:
   minor: whitespace




[GitHub] [samza-hello-samza] lakshmi-manasa-g commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage

2020-01-28 Thread GitBox
lakshmi-manasa-g commented on a change in pull request #71: SAMZA-2437: Sample 
for producing to Azure Blob Storage
URL: https://github.com/apache/samza-hello-samza/pull/71#discussion_r371926907
 
 

 ##
 File path: src/main/java/samza/examples/azure/AzureBlobApplication.java
 ##
 @@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package samza.examples.azure;
+
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import java.util.List;
+import java.util.Map;
+import org.apache.samza.application.StreamApplication;
+import org.apache.samza.application.descriptors.StreamApplicationDescriptor;
+import org.apache.samza.operators.MessageStream;
+import org.apache.samza.operators.OutputStream;
+import org.apache.samza.serializers.JsonSerdeV2;
+import org.apache.samza.serializers.NoOpSerde;
+import org.apache.samza.system.descriptors.GenericOutputDescriptor;
+import org.apache.samza.system.descriptors.GenericSystemDescriptor;
+import org.apache.samza.system.kafka.descriptors.KafkaInputDescriptor;
+import org.apache.samza.system.kafka.descriptors.KafkaSystemDescriptor;
+import samza.examples.azure.data.PageViewAvroRecord;
+import samza.examples.cookbook.data.PageView;
+
+/**
+ * In this example, we demonstrate sending blobs to Azure Blob Storage.
+ * This Samza job reads from Kafka topic "page-view-azure-blob-input" and 
produces blobs to Azure-Container "oss-testcontainer" in your Azure Storage 
account.
+ *
+ * Currently, Samza supports sending Avro files as blobs.
+ * Hence the incoming messages into the Samza job have to be converted to an Avro record.
+ * For this job, we use the input message as {@link samza.examples.cookbook.data.PageView} and
+ * convert it to an Avro record defined as {@link samza.examples.azure.data.PageViewAvroRecord}.
+ *
+ * To run the below example:
+ *
+ * 
+ *   
+ * Replace your-azure-storage-account-name and 
your-azure-storage-account-key with details of your Azure Storage Account.
+ *   
+ *   
+ * Ensure that the topic "page-view-azure-blob-input" is created  
+ * ./deploy/kafka/bin/kafka-topics.sh  --zookeeper localhost:2181 --create 
--topic page-view-azure-blob-input --partitions 1 --replication-factor 1
+ *   
+ *   
+ * Run the application using the run-app.sh script 
+ * ./deploy/samza/bin/run-app.sh 
--config-factory=org.apache.samza.config.factories.PropertiesConfigFactory 
--config-path=file://$PWD/deploy/samza/config/azure-blob-application.properties
+ *   
+ *   
+ * Produce some messages to the "page-view-azure-blob-input" topic 
+ * ./deploy/kafka/bin/kafka-console-producer.sh --topic 
page-view-azure-blob-input --broker-list localhost:9092 
+ * {"userId": "user1", "country": "india", "pageId":"google.com"} 
+ * {"userId": "user2", "country": "france", "pageId":"facebook.com"} 
+ * {"userId": "user3", "country": "china", "pageId":"yahoo.com"} 
+ * {"userId": "user4", "country": "italy", "pageId":"linkedin.com"} 
+ * {"userId": "user5", "country": "germany", "pageId":"amazon.com"} 
+ * {"userId": "user6", "country": "denmark", "pageId":"apple.com"} 
+ *   
+ *   
+ *Seeing Output:
+ *
+ *  
+ *   See blobs in your Azure portal at 
https://.blob.core.windows.net/oss-testcontainer/PageViewEventStream/.avro
+ *  
+ *  
+ *   system-name "oss-testcontainer" in configs and code below maps to 
Azure-Container in Azure Storage account.
+ *  
+ *  
+ *is of the format /MM/dd/HH/mm-ss-randomString.avro. 
Hence navigate through the virtual folders on the portal to see your blobs.
+ *  
+ *  
+ *   Due to network calls, allow a few minutes for blobs to appear on the 
portal.
+ *  
+ *  
+ *   Config "maxMessagesPerBlob=2" ensures that a blob is created per 2 
input messages. Adjust input or config accordingly.
+ *  
+ *
+ *   
+ * 
+ */
+public class AzureBlobApplication implements StreamApplication {
+  private static final List<String> KAFKA_CONSUMER_ZK_CONNECT = ImmutableList.of("localhost:2181");
+  private static final List<String> KAFKA_PRODUCER_BOOTSTRAP_SERVERS = ImmutableList.of("localhost:9092");
+  private 

[GitHub] [samza-hello-samza] lakshmi-manasa-g commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage

2020-01-28 Thread GitBox
lakshmi-manasa-g commented on a change in pull request #71: SAMZA-2437: Sample 
for producing to Azure Blob Storage
URL: https://github.com/apache/samza-hello-samza/pull/71#discussion_r371915951
 
 

 ##
 File path: pom.xml
 ##
 @@ -206,6 +206,26 @@ under the License.
   guava
   23.0
 
+
 
 Review comment:
   No, the app does not need them directly.
   However, azure-storage-blob depends on azure-core, which in turn needs Jackson.
   
   Removing these dependencies throws:
   `java.lang.NoClassDefFoundError: com/fasterxml/jackson/core/TSFBuilder
   at com.fasterxml.jackson.dataformat.xml.XmlMapper.<init>(XmlMapper.java:122) ~[jackson-dataformat-xml-2.10.0.jar:2.10.0]
   at com.azure.core.implementation.serializer.jackson.JacksonAdapter.<init>(JacksonAdapter.java:75) ~[azure-core-1.0.0.jar:?]
   at com.azure.core.implementation.serializer.jackson.JacksonAdapter.createDefaultSerializerAdapter(JacksonAdapter.java:108) ~[azure-core-1.0.0.jar:?]
   at com.azure.core.implementation.RestProxy.createDefaultSerializer(RestProxy.java:629) ~[azure-core-1.0.0.jar:?]
   at com.azure.core.implementation.RestProxy.create(RestProxy.java:691) ~[azure-core-1.0.0.jar:?]
   at com.azure.storage.blob.implementation.ServicesImpl.<init>(ServicesImpl.java:58) ~[azure-storage-blob-12.0.1.jar:?]
   at com.azure.storage.blob.implementation.AzureBlobStorageImpl.<init>(AzureBlobStorageImpl.java:213) ~[azure-storage-blob-12.0.1.jar:?]
   at com.azure.storage.blob.implementation.AzureBlobStorageBuilder.build(AzureBlobStorageBuilder.java:90) ~[azure-storage-blob-12.0.1.jar:?]
   at com.azure.storage.blob.BlobServiceAsyncClient.<init>(BlobServiceAsyncClient.java:90) ~[azure-storage-blob-12.0.1.jar:?]
   at com.azure.storage.blob.BlobServiceClientBuilder.buildAsyncClient(BlobServiceClientBuilder.java:103) ~[azure-storage-blob-12.0.1.jar:?]
   at org.apache.samza.system.azureblob.producer.AzureBlobSystemProducer.setupAzureContainer(AzureBlobSystemProducer.java:370) ~[samza-azure_2.11-1.4.903-SNAPSHOT.jar:?]`
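
   One way to confirm which Jackson classes are actually visible at runtime is a small classpath probe. The sketch below is an illustration only: `ClasspathCheck` and `isPresent` are hypothetical names, not part of samza-hello-samza or the Azure SDK, and the two class names probed in `main` are taken from the stack trace above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical diagnostic helper; not part of samza-hello-samza.
class ClasspathCheck {

  // Returns true if the named class can be located by this class's loader.
  // Passing initialize=false avoids running static initializers during the probe.
  static boolean isPresent(String className) {
    try {
      Class.forName(className, false, ClasspathCheck.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // Class names copied from the NoClassDefFoundError and its first frame.
    List<String> required = Arrays.asList(
        "com.fasterxml.jackson.core.TSFBuilder",
        "com.fasterxml.jackson.dataformat.xml.XmlMapper");
    for (String name : required) {
      System.out.println(name + " -> " + (isPresent(name) ? "found" : "MISSING"));
    }
  }
}
```

   Run with the job's classpath, a "MISSING" line for `TSFBuilder` next to a "found" line for `XmlMapper` would suggest that an older jackson-core is winning dependency resolution, which is the situation the explicit pom.xml entries work around.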




[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage

2020-01-27 Thread GitBox
cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for 
producing to Azure Blob Storage
URL: https://github.com/apache/samza-hello-samza/pull/71#discussion_r371586350
 
 

 ##
 File path: src/main/java/samza/examples/azure/AzureBlobApplication.java
 ##
 @@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package samza.examples.azure;
+
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import java.util.List;
+import java.util.Map;
+import org.apache.samza.application.StreamApplication;
+import org.apache.samza.application.descriptors.StreamApplicationDescriptor;
+import org.apache.samza.operators.MessageStream;
+import org.apache.samza.operators.OutputStream;
+import org.apache.samza.serializers.JsonSerdeV2;
+import org.apache.samza.serializers.NoOpSerde;
+import org.apache.samza.system.descriptors.GenericOutputDescriptor;
+import org.apache.samza.system.descriptors.GenericSystemDescriptor;
+import org.apache.samza.system.kafka.descriptors.KafkaInputDescriptor;
+import org.apache.samza.system.kafka.descriptors.KafkaSystemDescriptor;
+import samza.examples.azure.data.PageViewAvroRecord;
+import samza.examples.cookbook.data.PageView;
+
+/**
+ * In this example, we demonstrate sending blobs to Azure Blob Storage.
+ * This Samza job reads from Kafka topic "page-view-azure-blob-input" and 
produces blobs to Azure-Container "oss-testcontainer" in your Azure Storage 
account.
+ *
+ * Currently, Samza supports sending Avro files as blobs.
+ * Hence the incoming messages into the Samza job have to be converted to an Avro record.
+ * For this job, we use the input message as {@link samza.examples.cookbook.data.PageView} and
+ * convert it to an Avro record defined as {@link samza.examples.azure.data.PageViewAvroRecord}.
+ *
+ * To run the below example:
+ *
+ * 
+ *   
+ * Replace your-azure-storage-account-name and 
your-azure-storage-account-key with details of your Azure Storage Account.
+ *   
+ *   
+ * Ensure that the topic "page-view-azure-blob-input" is created  
+ * ./deploy/kafka/bin/kafka-topics.sh  --zookeeper localhost:2181 --create 
--topic page-view-azure-blob-input --partitions 1 --replication-factor 1
+ *   
+ *   
+ * Run the application using the run-app.sh script 
+ * ./deploy/samza/bin/run-app.sh 
--config-factory=org.apache.samza.config.factories.PropertiesConfigFactory 
--config-path=file://$PWD/deploy/samza/config/azure-blob-application.properties
+ *   
+ *   
+ * Produce some messages to the "page-view-azure-blob-input" topic 
+ * ./deploy/kafka/bin/kafka-console-producer.sh --topic 
page-view-azure-blob-input --broker-list localhost:9092 
+ * {"userId": "user1", "country": "india", "pageId":"google.com"} 
+ * {"userId": "user2", "country": "france", "pageId":"facebook.com"} 
+ * {"userId": "user3", "country": "china", "pageId":"yahoo.com"} 
+ * {"userId": "user4", "country": "italy", "pageId":"linkedin.com"} 
+ * {"userId": "user5", "country": "germany", "pageId":"amazon.com"} 
+ * {"userId": "user6", "country": "denmark", "pageId":"apple.com"} 
+ *   
+ *   
+ *Seeing Output:
+ *
+ *  
+ *   See blobs in your Azure portal at 
https://.blob.core.windows.net/oss-testcontainer/PageViewEventStream/.avro
+ *  
+ *  
+ *   system-name "oss-testcontainer" in configs and code below maps to 
Azure-Container in Azure Storage account.
+ *  
+ *  
+ *is of the format /MM/dd/HH/mm-ss-randomString.avro. 
Hence navigate through the virtual folders on the portal to see your blobs.
+ *  
+ *  
+ *   Due to network calls, allow a few minutes for blobs to appear on the 
portal.
+ *  
+ *  
+ *   Config "maxMessagesPerBlob=2" ensures that a blob is created per 2 
input messages. Adjust input or config accordingly.
+ *  
+ *
+ *   
+ * 
+ */
+public class AzureBlobApplication implements StreamApplication {
+  private static final List<String> KAFKA_CONSUMER_ZK_CONNECT = ImmutableList.of("localhost:2181");
+  private static final List<String> KAFKA_PRODUCER_BOOTSTRAP_SERVERS = ImmutableList.of("localhost:9092");
+  private 

[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage

2020-01-27 Thread GitBox
cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for 
producing to Azure Blob Storage
URL: https://github.com/apache/samza-hello-samza/pull/71#discussion_r371586933
 
 

 ##
 File path: src/main/java/samza/examples/azure/AzureBlobApplication.java
 ##
 @@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package samza.examples.azure;
+
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import java.util.List;
+import java.util.Map;
+import org.apache.samza.application.StreamApplication;
+import org.apache.samza.application.descriptors.StreamApplicationDescriptor;
+import org.apache.samza.operators.MessageStream;
+import org.apache.samza.operators.OutputStream;
+import org.apache.samza.serializers.JsonSerdeV2;
+import org.apache.samza.serializers.NoOpSerde;
+import org.apache.samza.system.descriptors.GenericOutputDescriptor;
+import org.apache.samza.system.descriptors.GenericSystemDescriptor;
+import org.apache.samza.system.kafka.descriptors.KafkaInputDescriptor;
+import org.apache.samza.system.kafka.descriptors.KafkaSystemDescriptor;
+import samza.examples.azure.data.PageViewAvroRecord;
+import samza.examples.cookbook.data.PageView;
+
+/**
+ * In this example, we demonstrate sending blobs to Azure Blob Storage.
+ * This Samza job reads from Kafka topic "page-view-azure-blob-input" and 
produces blobs to Azure-Container "oss-testcontainer" in your Azure Storage 
account.
+ *
+ * Currently, Samza supports sending Avro files as blobs.
+ * Hence the incoming messages into the Samza job have to be converted to an Avro record.
+ * For this job, we use the input message as {@link samza.examples.cookbook.data.PageView} and
+ * convert it to an Avro record defined as {@link samza.examples.azure.data.PageViewAvroRecord}.
+ *
+ * To run the below example:
+ *
+ * 
+ *   
+ * Replace your-azure-storage-account-name and 
your-azure-storage-account-key with details of your Azure Storage Account.
+ *   
+ *   
+ * Ensure that the topic "page-view-azure-blob-input" is created  
+ * ./deploy/kafka/bin/kafka-topics.sh  --zookeeper localhost:2181 --create 
--topic page-view-azure-blob-input --partitions 1 --replication-factor 1
+ *   
+ *   
+ * Run the application using the run-app.sh script 
+ * ./deploy/samza/bin/run-app.sh 
--config-factory=org.apache.samza.config.factories.PropertiesConfigFactory 
--config-path=file://$PWD/deploy/samza/config/azure-blob-application.properties
+ *   
+ *   
+ * Produce some messages to the "page-view-azure-blob-input" topic 
+ * ./deploy/kafka/bin/kafka-console-producer.sh --topic 
page-view-azure-blob-input --broker-list localhost:9092 
+ * {"userId": "user1", "country": "india", "pageId":"google.com"} 
+ * {"userId": "user2", "country": "france", "pageId":"facebook.com"} 
+ * {"userId": "user3", "country": "china", "pageId":"yahoo.com"} 
+ * {"userId": "user4", "country": "italy", "pageId":"linkedin.com"} 
+ * {"userId": "user5", "country": "germany", "pageId":"amazon.com"} 
+ * {"userId": "user6", "country": "denmark", "pageId":"apple.com"} 
+ *   
+ *   
+ *Seeing Output:
+ *
+ *  
+ *   See blobs in your Azure portal at 
https://.blob.core.windows.net/oss-testcontainer/PageViewEventStream/.avro
+ *  
+ *  
+ *   system-name "oss-testcontainer" in configs and code below maps to 
Azure-Container in Azure Storage account.
+ *  
+ *  
+ *is of the format /MM/dd/HH/mm-ss-randomString.avro. 
Hence navigate through the virtual folders on the portal to see your blobs.
+ *  
+ *  
+ *   Due to network calls, allow a few minutes for blobs to appear on the 
portal.
+ *  
+ *  
+ *   Config "maxMessagesPerBlob=2" ensures that a blob is created per 2 
input messages. Adjust input or config accordingly.
+ *  
+ *
+ *   
+ * 
+ */
+public class AzureBlobApplication implements StreamApplication {
+  private static final List<String> KAFKA_CONSUMER_ZK_CONNECT = ImmutableList.of("localhost:2181");
+  private static final List<String> KAFKA_PRODUCER_BOOTSTRAP_SERVERS = ImmutableList.of("localhost:9092");
+  private 

[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage

2020-01-27 Thread GitBox
cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for 
producing to Azure Blob Storage
URL: https://github.com/apache/samza-hello-samza/pull/71#discussion_r371585315
 
 

 ##
 File path: src/main/config/azure-blob-application.properties
 ##
 @@ -0,0 +1,37 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Job
+job.factory.class=org.apache.samza.job.yarn.YarnJobFactory
+job.name=azure-blob
+
+# YARN package path
+yarn.package.path=file://${basedir}/target/${project.artifactId}-${pom.version}-dist.tar.gz
+
+# StreamApplication class
+app.class=samza.examples.azure.AzureBlobApplication
+
+#Azure blob essential configs
+systems.oss-testcontainer.samza.factory=org.apache.samza.system.azureblob.AzureBlobSystemFactory
+sensitive.systems.oss-testcontainer.azureblob.account.name=your-azure-storage-account-name
+sensitive.systems.oss-testcontainer.azureblob.account.key=your-azure-storage-account-key
+
+#Azure blob config - to create a blob per 2 input kafka messages
+systems.oss-testcontainer.azureblob.maxMessagesPerBlob=2
+
+# Add configuration to disable checkpointing for this job once it is available in the Coordinator Stream model
+# See https://issues.apache.org/jira/browse/SAMZA-465?focusedCommentId=14533346=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14533346 for more details
 
 Review comment:
   Is this necessary?




[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage

2020-01-27 Thread GitBox
cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for 
producing to Azure Blob Storage
URL: https://github.com/apache/samza-hello-samza/pull/71#discussion_r371586679
 
 

 ##
 File path: src/main/java/samza/examples/azure/data/PageViewAvroRecord.java
 ##
 @@ -0,0 +1,48 @@
+package samza.examples.azure.data;
+
+import java.io.Serializable;
+import org.apache.avro.AvroRuntimeException;
+import samza.examples.cookbook.data.PageView;
+
+public class PageViewAvroRecord extends org.apache.avro.specific.SpecificRecordBase
+implements org.apache.avro.specific.SpecificRecord, Serializable {
+  public final org.apache.avro.Schema SCHEMA = org.apache.avro.Schema.parse(
+  "{\"type\":\"record\",\"name\":\"PageViewAvroRecord\",\"namespace\":\"org.apache.samza.examples.events\", \"fields\":[{\"name\": \"userId\", \"type\": \"string\"}, {\"name\": \"country\", \"type\": \"string\"}, {\"name\": \"pageId\", \"type\": \"string\"}]}");
+
+  private String userId;
+  private String country;
+  private String pageId;
+
+  public static PageViewAvroRecord buildPageViewRecord(PageView pageView) {
+PageViewAvroRecord record = new PageViewAvroRecord();
+record.put(0, pageView.userId);
 
 Review comment:
   Would it be cleaner to do `record.userId = pageView.userId`? Same for the 
other fields below.
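
   To make the suggestion concrete, here is a minimal sketch comparing the two styles. `PageViewRecordSketch` is a hypothetical stand-in for the PR's `PageViewAvroRecord`: it deliberately does not extend Avro's `SpecificRecordBase`, so that it compiles without Avro on the classpath, and its `put` method only imitates the index-based setter seen in the diff.

```java
// Simplified stand-in for the PR's PageViewAvroRecord (assumption: same three
// String fields, in the same order as the schema quoted in the diff).
class PageViewRecordSketch {
  String userId;
  String country;
  String pageId;

  // Index-based setter in the style of put(int, Object); the caller has to
  // remember which position corresponds to which field.
  void put(int field, Object value) {
    switch (field) {
      case 0: userId = (String) value; break;
      case 1: country = (String) value; break;
      case 2: pageId = (String) value; break;
      default: throw new IllegalArgumentException("Bad index: " + field);
    }
  }

  // Style used in the PR: positional put calls.
  static PageViewRecordSketch viaPut(String userId, String country, String pageId) {
    PageViewRecordSketch record = new PageViewRecordSketch();
    record.put(0, userId);
    record.put(1, country);
    record.put(2, pageId);
    return record;
  }

  // Style suggested in the review: direct field assignment, no magic indexes.
  static PageViewRecordSketch viaFields(String userId, String country, String pageId) {
    PageViewRecordSketch record = new PageViewRecordSketch();
    record.userId = userId;
    record.country = country;
    record.pageId = pageId;
    return record;
  }

  public static void main(String[] args) {
    PageViewRecordSketch a = viaPut("user1", "india", "google.com");
    PageViewRecordSketch b = viaFields("user1", "india", "google.com");
    System.out.println(a.userId.equals(b.userId)
        && a.country.equals(b.country)
        && a.pageId.equals(b.pageId));
  }
}
```

   Both builders produce the same record; the field-assignment version simply avoids the cast and the positional indexes, so a reordered schema cannot silently swap values.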




[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage

2020-01-27 Thread GitBox
cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for 
producing to Azure Blob Storage
URL: https://github.com/apache/samza-hello-samza/pull/71#discussion_r371585166
 
 

 ##
 File path: src/main/config/azure-blob-application.properties
 ##
 @@ -0,0 +1,37 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Job
+job.factory.class=org.apache.samza.job.yarn.YarnJobFactory
+job.name=azure-blob
+
+# YARN package path
+yarn.package.path=file://${basedir}/target/${project.artifactId}-${pom.version}-dist.tar.gz
+
+# StreamApplication class
+app.class=samza.examples.azure.AzureBlobApplication
+
+#Azure blob essential configs
+systems.oss-testcontainer.samza.factory=org.apache.samza.system.azureblob.AzureBlobSystemFactory
+sensitive.systems.oss-testcontainer.azureblob.account.name=your-azure-storage-account-name
+sensitive.systems.oss-testcontainer.azureblob.account.key=your-azure-storage-account-key
+
+#Azure blob config - to create a blob per 2 input kafka messages
+systems.oss-testcontainer.azureblob.maxMessagesPerBlob=2
 
 Review comment:
   "oss-testcontainer" is not really a descriptive term here. Can you just use 
"azure-blob" (or something like that)? If it needs to be a specific format, 
then please explain that in the comments for this file.
   It looks like you put instructions in `AzureBlobApplication`. Maybe put a 
note in this file that there are usage instructions in that class. 
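
   For reference, the system name is just the second segment of each config key, so a rename touches every `systems.<name>.*` entry. The sketch below illustrates that with plain `java.util.Properties`; `SystemNameConfigSketch` and its methods are hypothetical helpers, and "azure-blob" is only the name proposed in this comment. Note that the javadoc in `AzureBlobApplication` says the system name maps to the Azure container, so a rename would presumably change the target container as well.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper showing how the system name is embedded in config keys.
class SystemNameConfigSketch {

  // Builds a key following the pattern used in azure-blob-application.properties.
  static String key(String systemName, String suffix) {
    return "systems." + systemName + "." + suffix;
  }

  // Parses properties text held in memory (stands in for reading the file).
  static Properties parse(String text) {
    Properties config = new Properties();
    try {
      config.load(new StringReader(text));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return config;
  }

  // Looks up maxMessagesPerBlob for the given system name.
  static int maxMessagesPerBlob(Properties config, String systemName) {
    return Integer.parseInt(
        config.getProperty(key(systemName, "azureblob.maxMessagesPerBlob")));
  }

  public static void main(String[] args) {
    Properties config = parse("systems.azure-blob.azureblob.maxMessagesPerBlob=2\n");
    System.out.println(key("azure-blob", "azureblob.maxMessagesPerBlob")
        + " = " + maxMessagesPerBlob(config, "azure-blob"));
  }
}
```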




[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for producing to Azure Blob Storage

2020-01-27 Thread GitBox
cameronlee314 commented on a change in pull request #71: SAMZA-2437: Sample for 
producing to Azure Blob Storage
URL: https://github.com/apache/samza-hello-samza/pull/71#discussion_r371584567
 
 

 ##
 File path: pom.xml
 ##
 @@ -206,6 +206,26 @@ under the License.
   guava
   23.0
 
+
 
 Review comment:
   Does the app depend directly on Jackson? If not, then you shouldn't need 
these dependencies.




[GitHub] [samza-hello-samza] cameronlee314 merged pull request #70: SAMZA-2433: Use log4j2 in samza-hello-samza

2020-01-23 Thread GitBox
cameronlee314 merged pull request #70: SAMZA-2433: Use log4j2 in 
samza-hello-samza
URL: https://github.com/apache/samza-hello-samza/pull/70
 
 
   




[GitHub] [samza-hello-samza] cameronlee314 commented on issue #70: SAMZA-2433: Use log4j2 in samza-hello-samza

2020-01-23 Thread GitBox
cameronlee314 commented on issue #70: SAMZA-2433: Use log4j2 in 
samza-hello-samza
URL: https://github.com/apache/samza-hello-samza/pull/70#issuecomment-577936234
 
 
   To clarify a bit further in the split deployment case: If a job is set up to 
use log4j2 in the non-split-deployment case, then it will still work in the 
split deployment case if the split deployment framework is set up for log4j2.
   The impact of split deployment is that a particular framework package can 
only support one of log4j v1 or v2 as the slf4j binding.




[GitHub] [samza-hello-samza] cameronlee314 commented on issue #70: SAMZA-2433: Use log4j2 in samza-hello-samza

2020-01-23 Thread GitBox
cameronlee314 commented on issue #70: SAMZA-2433: Use log4j2 in 
samza-hello-samza
URL: https://github.com/apache/samza-hello-samza/pull/70#issuecomment-577935029
 
 
   > @cameronlee314 does this mean that we'd require jobs to upgrade to log4j2 
as well?
   
   No, it is not required for jobs to upgrade to log4j2. If not using split 
deployment, then both log4j v1 and v2 would continue to work as is. If using 
split deployment, then the split deployment framework package determines 
whether log4j v1 or v2 needs to be used by a job (although I have currently 
only prototyped supporting log4j2).




[GitHub] [samza-hello-samza] prateekm commented on issue #70: SAMZA-2433: Use log4j2 in samza-hello-samza

2020-01-23 Thread GitBox
prateekm commented on issue #70: SAMZA-2433: Use log4j2 in samza-hello-samza
URL: https://github.com/apache/samza-hello-samza/pull/70#issuecomment-577891882
 
 
   @cameronlee314 does this mean that we'd require jobs to upgrade to log4j2 as 
well?




[GitHub] [samza-hello-samza] prateekm commented on issue #70: SAMZA-2433: Use log4j2 in samza-hello-samza

2020-01-23 Thread GitBox
prateekm commented on issue #70: SAMZA-2433: Use log4j2 in samza-hello-samza
URL: https://github.com/apache/samza-hello-samza/pull/70#issuecomment-577829358
 
 
   @PawasChhokra Can you take a look as well?




[GitHub] [samza-hello-samza] lakshmi-manasa-g opened a new pull request #71: Sample for producing to Azure Blob Storage

2020-01-23 Thread GitBox
lakshmi-manasa-g opened a new pull request #71: Sample for producing to Azure 
Blob Storage
URL: https://github.com/apache/samza-hello-samza/pull/71
 
 
   Feature: Sample Samza job related to SEP-26: Azure Blob System Producer
   
   Changes: New high-level Yarn job "AzureBlobApplication" added to the samples
   
   Tests: Sample job successfully produces blobs for configured Azure Storage 
Account. Confirmed on Azure portal.
   
   Upgrade instructions: Backwards-compatible change, as it's a completely new 
job, and hence no upgrade is needed.
   
   Usage instructions: Add Azure Storage Account details to the configs of the 
Samza job. Step-by-step instructions to run the job are provided in the 
javadocs of the job.




[GitHub] [samza-hello-samza] cameronlee314 opened a new pull request #70: SAMZA-2433: Use log4j2 in samza-hello-samza

2020-01-14 Thread GitBox
cameronlee314 opened a new pull request #70: SAMZA-2433: Use log4j2 in 
samza-hello-samza
URL: https://github.com/apache/samza-hello-samza/pull/70
 
 
   Issues: Log4j v1 is EOL and log4j2 is generally more performant and has a 
better module structure. Also, it is easier to handle split deployment 
functionality when using log4j2.
   
   Changes:
   Removed log4j v1 dependencies and added log4j2 dependencies (for both Gradle 
and Maven)
   Added log4j2.xml and removed log4j.xml
   Cleaned up some of the dependency specifications in order to make it easier 
to properly exclude log4j1 and include log4j2
   
   Tests: Ran WikipediaApplication for a Gradle build and for a Maven build; 
verified that the logs showed up properly and that the job had output data.
   
   API Changes: None
   
   Upgrade Instructions: None
   
   Usage Instructions: No changes to existing build/deployment flows




[GitHub] [samza-hello-samza] shanthoosh commented on issue #69: Update POM to 1.4.0-SNAPSHOT as samza 1.3 has been published

2020-01-14 Thread GitBox
shanthoosh commented on issue #69: Update POM to 1.4.0-SNAPSHOT as samza 1.3 
has been published
URL: https://github.com/apache/samza-hello-samza/pull/69#issuecomment-574340389
 
 
   @kw2542 Thanks for the changes. Merged the patch to trunk.




[GitHub] [samza-hello-samza] shanthoosh merged pull request #69: Update POM to 1.4.0-SNAPSHOT as samza 1.3 has been published

2020-01-14 Thread GitBox
shanthoosh merged pull request #69: Update POM to 1.4.0-SNAPSHOT as samza 1.3 
has been published
URL: https://github.com/apache/samza-hello-samza/pull/69
 
 
   




[GitHub] [samza-hello-samza] kw2542 commented on issue #69: Update POM to 1.4.0-SNAPSHOT as samza 1.3 has been published

2020-01-14 Thread GitBox
kw2542 commented on issue #69: Update POM to 1.4.0-SNAPSHOT as samza 1.3 has 
been published
URL: https://github.com/apache/samza-hello-samza/pull/69#issuecomment-574309641
 
 
   Updated gradle.properties.




[GitHub] [samza-hello-samza] shanthoosh commented on issue #69: Update POM to 1.4.0-SNAPSHOT as samza 1.3 has been published

2020-01-13 Thread GitBox
shanthoosh commented on issue #69: Update POM to 1.4.0-SNAPSHOT as samza 1.3 
has been published
URL: https://github.com/apache/samza-hello-samza/pull/69#issuecomment-573990991
 
 
   The Hadoop version used in OSS Samza is 2.7.1, but here it's set to 2.6.1 in 
gradle.properties. It would be better to bump up the kafka-version to match the 
recent Samza release.
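
A quick way to spot this kind of mismatch is to read the pinned version out of the properties file and compare it with the version the Samza release uses. The following is a minimal sketch only: `prop_value` is a hypothetical helper, and `HADOOP_VERSION` is an assumed key name, since the actual keys in gradle.properties are not shown in this thread.

```shell
#!/bin/bash
# Sketch only: prop_value is a hypothetical helper, not part of hello-samza.
# Prints the value of KEY from a Java-style properties file (KEY=value lines).
# Prints nothing if the key is absent.
prop_value() {
  local file="$1"
  local key="$2"
  # Compare the text before the first '=' exactly, then strip "KEY=" and print the rest.
  awk -F= -v k="$key" '$1 == k { sub(/^[^=]*=/, ""); print; exit }' "$file"
}
```

It could be used as `prop_value gradle.properties HADOOP_VERSION` (key name assumed) and the result compared against the expected version with a plain string test.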




[GitHub] [samza-hello-samza] kw2542 opened a new pull request #69: Update POM to 1.4.0-SNAPSHOT as samza 1.3 has been published

2020-01-09 Thread GitBox
kw2542 opened a new pull request #69: Update POM to 1.4.0-SNAPSHOT as samza 1.3 
has been published
URL: https://github.com/apache/samza-hello-samza/pull/69
 
 
   In order to be compatible with Hello Samza documentation, POM needs to be 
updated to 1.4.0-SNAPSHOT for latest branch.




[GitHub] [samza-hello-samza] xinyuiscool merged pull request #68: Update dependencies to match OSS documentation

2019-11-08 Thread GitBox
xinyuiscool merged pull request #68: Update dependencies to match OSS 
documentation
URL: https://github.com/apache/samza-hello-samza/pull/68
 
 
   




[GitHub] [samza-hello-samza] kw2542 opened a new pull request #68: Update dependencies to match OSS documentation

2019-11-08 Thread GitBox
kw2542 opened a new pull request #68: Update dependencies to match OSS 
documentation
URL: https://github.com/apache/samza-hello-samza/pull/68
 
 
   1. Per https://samza.apache.org/startup/hello-samza/latest/, we are supposed 
to pull 1.3.0-SNAPSHOT of Samza to create hello-samza 1.3.0-SNAPSHOT; this 
updates pom.xml to match.
   2. Use Hadoop YARN 2.9.2, which supports HttpFileSystem.




[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #67: SAMZA-2285 - Sample async application for hello-samza

2019-08-12 Thread GitBox
cameronlee314 commented on a change in pull request #67: SAMZA-2285 - Sample 
async application for hello-samza
URL: https://github.com/apache/samza-hello-samza/pull/67#discussion_r313183986
 
 

 ##
 File path: bin/run-wikipedia-async-application.sh
 ##
 @@ -0,0 +1,30 @@
+#!/bin/bash
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+home_dir=`pwd`
 
 Review comment:
   Do you need this script? It looks like most of the other examples don't have 
this, since they can be deployed using the `run-app.sh` script (see README).




[GitHub] [samza-hello-samza] cameronlee314 commented on a change in pull request #67: SAMZA-2285 - Sample async application for hello-samza

2019-08-12 Thread GitBox
cameronlee314 commented on a change in pull request #67: SAMZA-2285 - Sample 
async application for hello-samza
URL: https://github.com/apache/samza-hello-samza/pull/67#discussion_r313184179
 
 

 ##
 File path: src/main/config/wikipedia-async-application.properties
 ##
 @@ -0,0 +1,58 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Job
+job.name=wikipedia-async-application
+job.coordinator.factory=org.apache.samza.zk.ZkJobCoordinatorFactory
+job.default.system=kafka
+job.coordinator.zk.connect=localhost:2181
+
+# Task/Application
+task.name.grouper.factory=org.apache.samza.container.grouper.task.GroupByContainerIdsFactory
+
+# Serializers
+serializers.registry.string.class=org.apache.samza.serializers.StringSerdeFactory
+serializers.registry.integer.class=org.apache.samza.serializers.IntegerSerdeFactory
+
+# Wikipedia System
+systems.wikipedia.samza.factory=samza.examples.wikipedia.system.WikipediaSystemFactory
+systems.wikipedia.host=irc.wikimedia.org
+systems.wikipedia.port=6667
+
+# Kafka System
+systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory
+systems.kafka.consumer.zookeeper.connect=localhost:2181/
+systems.kafka.producer.bootstrap.servers=localhost:9092
+systems.kafka.default.stream.replication.factor=1
+
+# Streams
+streams.en-wikipedia.samza.system=wikipedia
+streams.en-wikipedia.samza.physical.name=#en.wikipedia
+
+streams.en-wiktionary.samza.system=wikipedia
+streams.en-wiktionary.samza.physical.name=#en.wiktionary
+
+streams.en-wikinews.samza.system=wikipedia
+streams.en-wikinews.samza.physical.name=#en.wikinews
+
+task.max.concurrency=20
+
+app.class=samza.examples.wikipedia.application.WikipediaAsyncApplication
+job.factory.class=org.apache.samza.job.yarn.YarnJobFactory
+job.container.count=1
+
+yarn.package.path=file:///Users/bkumaras/workspace-common/hello-samza/target/hello-samza-1.0.1-SNAPSHOT-dist.tar.gz
 
 Review comment:
   Please don't include your specific user workspace.
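
One way to avoid a hard-coded workspace is to derive the package path from wherever the project is checked out. The following is a minimal sketch under stated assumptions: `yarn_package_path` is a hypothetical helper, and the `hello-samza-<version>-dist.tar.gz` naming is taken from the path quoted in the diff above. How the repository's other property files solve this is not shown in this thread.

```shell
#!/bin/bash
# Sketch only: yarn_package_path is a hypothetical helper, not part of hello-samza.
# Builds a yarn.package.path value from a project directory and a version,
# so that no user-specific absolute path is committed to the properties file.
yarn_package_path() {
  local project_dir="$1"
  local version="$2"
  echo "file://${project_dir}/target/hello-samza-${version}-dist.tar.gz"
}
```

For example, `yarn_package_path "$(pwd)" 1.0.1-SNAPSHOT` yields a path rooted at the current checkout rather than at one developer's home directory.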




[GitHub] [samza-hello-samza] mynameborat opened a new pull request #67: SAMZA-2285 - Sample async application for hello-samza

2019-07-30 Thread GitBox
mynameborat opened a new pull request #67: SAMZA-2285 - Sample async 
application for hello-samza
URL: https://github.com/apache/samza-hello-samza/pull/67
 
 
   




[GitHub] [samza-hello-samza] rmatharu closed pull request #60: Updating after Bharath's scala 2.11 change, upgrading YARN version to match samza's yarn version

2019-06-21 Thread GitBox
rmatharu closed pull request #60: Updating after Bharath's scala 2.11 change, 
upgrading YARN version to match samza's yarn version
URL: https://github.com/apache/samza-hello-samza/pull/60
 
 
   




[GitHub] [samza-hello-samza] rmatharu commented on issue #62: Adding support for lxc on yarn for Samza

2019-06-19 Thread GitBox
rmatharu commented on issue #62: Adding support for lxc on yarn for Samza
URL: https://github.com/apache/samza-hello-samza/pull/62#issuecomment-503661672
 
 
   Addressed most comments. 
   I have put the contents of this in a gist here: 
https://gist.github.com/rmatharu/5d09e942aa7c38c14c5ff79283afc06e
   I can re-open this if it needs to be checked in.




[GitHub] [samza-hello-samza] rmatharu commented on a change in pull request #62: Adding support for lxc on yarn for Samza

2019-06-18 Thread GitBox
rmatharu commented on a change in pull request #62: Adding support for lxc on 
yarn for Samza
URL: https://github.com/apache/samza-hello-samza/pull/62#discussion_r295119154
 
 

 ##
 File path: bin/setup-lxc
 ##
 @@ -0,0 +1,345 @@
+#!/bin/bash -e
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# This script will download, setup, start, and stop servers for Kafka, YARN, 
and ZooKeeper,
+# as well as downloading, building and locally publishing Samza
+
+
+COMMAND=$1
+ARG0=$2
+ARG1=$3
+
+SHARED_LXC_DIR=/lxc-shared
+POSSIBLE_LXC_INTERFACES=( virbr0 lxcbr0)
+YARN_SITE_XML=conf/yarn-site.xml
+NM_LIVENESS_MS=1 #value of the yarn.nm.liveness-monitor.expiry-interval-ms 
variable
+LXC_INSTANCE_TYPE="fedora"
+LXC_ROOTFS_DIR=/var/lib/lxc
+LXC_INSTANCE_START_NM_SCRIPT=startNodeManager
+
+RESOLV_CONF_FILE=/etc/resolv.conf
+
+# Helper function to test an IP address for validity:
+# Usage:
+#  valid_ip IP_ADDRESS
+#  if [[ $? -eq 0 ]]; then echo good; else echo bad; fi
+#   OR
+#  if valid_ip IP_ADDRESS; then echo good; else echo bad; fi
+#
+function valid_ip()
+{
+local  ip=$1
+local  stat=1
+
+if [[ $ip =~ ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ ]]; then
+OIFS=$IFS
+IFS='.'
+ip=($ip)
+IFS=$OIFS
+[[ ${ip[0]} -le 255 && ${ip[1]} -le 255 \
+&& ${ip[2]} -le 255 && ${ip[3]} -le 255 ]]
+stat=$?
+fi
+return $stat
+}
+
+function check_OS() 
+{
+   #Check if OS is linux
+   if [[ "$OSTYPE" == "linux-gnu" ]]; then
+   echo "OS check passed."
+   else
+   echo "Only RHEL-Linux is currently supported for this setup. 
Exiting ..."
+   exit 0
+   fi
+}
+
+
+function lxc_setup() 
+{
+
+   #Install LXC (and its dependencies)
+   echo "Beginning installation. Installing lxc on your machine"
+   sudo yum -y install epel-release
+   sudo  yum -y install lxc lxc-templates libcap-devel libcgroup wget 
bridge-utils lxc-extra --skip-broken
+   echo "LXC installation complete."
+
+
+   lxcInterface=""
+   gatewayIP=""
+
+for interface in ${POSSIBLE_LXC_INTERFACES[@]}
+do
+echo "Checking if $interface is valid"
+ip_address=`ip addr show $interface | grep "inet\b" | awk 
'{print $2}' | cut -d/ -f1`
+
+if valid_ip $ip_address; then
+echo "Interface $interface is valid for using with LXC 
instances."
+lxcInterface=$interface
+gatewayIP=$ip_address
+break;
+else
+echo "Interface $interface does not appear to be 
valid."
+continue;
+fi
+done
+
+   if [[ -z "$lxcInterface" ]]; then 
+   echo "Did not find a valid network interface for use with LXC. 
Install LXC manually (https://linuxcontainers.org/lxc/getting-started/) and 
re-run."
+   exit 0
+   fi
+
+   #Print the valid interface found
+   echo "Using interface "$lxcInterface "($gatewayIP) for use with LXC"
+
+   # Create shared directory for sharing between base machine and 
LXC-instances
+   echo "Creating dir $SHARED_LXC_DIR to be shared between base machine 
and LXC-instances"
+   sudo mkdir -p $SHARED_LXC_DIR && sudo chmod 777 $SHARED_LXC_DIR
+
+   # Setting gateway IP address in conf/yarn-site.xml 
+   echo "Setting yarn.resourcemanager.hostname="$gatewayIP in 
$YARN_SITE_XML 
+   sed -i 
"/yarn.resourcemanager.hostname<\/name>/!b;n;c$gatewayIP" 
$YARN_SITE_XML
+
+   # Adding RM bind host in conf/yarn-site.xml
+   echo "Setting yarn.resourcemanager.bind-host=0.0.0.0" in $YARN_SITE_XML
+   if [[ ! -z $(grep "yarn.resourcemanager.bind-host" conf/yarn-site.xml) 
]]; then 
+
+   # Setting yarn.resourcemanager.bind-host to 0.0.0.0
+   sed -i 
"/yarn.resourcemanager.bind-host<\/name>/!b;n;c0.0.0.0" 
$YARN_SITE_XML
+   else 
+
+   # Appending RM bind host in conf/yarn-site.xml
+   sed -i 

[GitHub] [samza-hello-samza] rmatharu commented on a change in pull request #62: Adding support for lxc on yarn for Samza

2019-06-18 Thread GitBox
rmatharu commented on a change in pull request #62: Adding support for lxc on 
yarn for Samza
URL: https://github.com/apache/samza-hello-samza/pull/62#discussion_r295118327
 
 

 ##
 File path: bin/setup-lxc
 ##
 @@ -0,0 +1,345 @@
+#!/bin/bash -e
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# This script will download, setup, start, and stop servers for Kafka, YARN, and ZooKeeper,
+# as well as downloading, building and locally publishing Samza
+
+
+COMMAND=$1
+ARG0=$2
+ARG1=$3
+
+SHARED_LXC_DIR=/lxc-shared
+POSSIBLE_LXC_INTERFACES=( virbr0 lxcbr0)
+YARN_SITE_XML=conf/yarn-site.xml
+NM_LIVENESS_MS=1 #value of the yarn.nm.liveness-monitor.expiry-interval-ms variable
+LXC_INSTANCE_TYPE="fedora"
+LXC_ROOTFS_DIR=/var/lib/lxc
+LXC_INSTANCE_START_NM_SCRIPT=startNodeManager
+
+RESOLV_CONF_FILE=/etc/resolv.conf
+
+# Helper function to test an IP address for validity:
+# Usage:
+#  valid_ip IP_ADDRESS
+#  if [[ $? -eq 0 ]]; then echo good; else echo bad; fi
+#   OR
+#  if valid_ip IP_ADDRESS; then echo good; else echo bad; fi
+#
+function valid_ip()
 
 Review comment:
   bash doesn't allow that




[GitHub] [samza-hello-samza] vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza

2019-06-12 Thread GitBox
vjagadish1989 commented on a change in pull request #62: Adding support for lxc 
on yarn for Samza
URL: https://github.com/apache/samza-hello-samza/pull/62#discussion_r293135338
 
 

 ##
 File path: bin/setup-lxc
 ##
 @@ -0,0 +1,345 @@
+#!/bin/bash -e
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# This script will download, setup, start, and stop servers for Kafka, YARN, 
and ZooKeeper,
+# as well as downloading, building and locally publishing Samza
+
+
+COMMAND=$1
+ARG0=$2
+ARG1=$3
+
+SHARED_LXC_DIR=/lxc-shared
+POSSIBLE_LXC_INTERFACES=( virbr0 lxcbr0)
+YARN_SITE_XML=conf/yarn-site.xml
+NM_LIVENESS_MS=1 #value of the yarn.nm.liveness-monitor.expiry-interval-ms 
variable
+LXC_INSTANCE_TYPE="fedora"
+LXC_ROOTFS_DIR=/var/lib/lxc
+LXC_INSTANCE_START_NM_SCRIPT=startNodeManager
+
+RESOLV_CONF_FILE=/etc/resolv.conf
+
+# Helper function to test an IP address for validity:
+# Usage:
+#  valid_ip IP_ADDRESS
+#  if [[ $? -eq 0 ]]; then echo good; else echo bad; fi
+#   OR
+#  if valid_ip IP_ADDRESS; then echo good; else echo bad; fi
+#
+function valid_ip()
+{
+local  ip=$1
+local  stat=1
+
+if [[ $ip =~ ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ ]]; then
+OIFS=$IFS
+IFS='.'
+ip=($ip)
+IFS=$OIFS
+[[ ${ip[0]} -le 255 && ${ip[1]} -le 255 \
+&& ${ip[2]} -le 255 && ${ip[3]} -le 255 ]]
+stat=$?
+fi
+return $stat
+}
+
+function check_OS() 
+{
+   #Check if OS is linux
+   if [[ "$OSTYPE" == "linux-gnu" ]]; then
+   echo "OS check passed."
+   else
+   echo "Only RHEL-Linux is currently supported for this setup. 
Exiting ..."
+   exit 0
+   fi
+}
+
+
+function lxc_setup() 
+{
+
+   #Install LXC (and its dependencies)
+   echo "Beginning installation. Installing lxc on your machine"
+   sudo yum -y install epel-release
+   sudo  yum -y install lxc lxc-templates libcap-devel libcgroup wget 
bridge-utils lxc-extra --skip-broken
+   echo "LXC installation complete."
+
+
+   lxcInterface=""
+   gatewayIP=""
+
+for interface in ${POSSIBLE_LXC_INTERFACES[@]}
+do
+echo "Checking if $interface is valid"
+ip_address=`ip addr show $interface | grep "inet\b" | awk 
'{print $2}' | cut -d/ -f1`
+
+if valid_ip $ip_address; then
+echo "Interface $interface is valid for using with LXC 
instances."
+lxcInterface=$interface
+gatewayIP=$ip_address
+break;
+else
+echo "Interface $interface does not appear to be 
valid."
+continue;
+fi
+done
+
+   if [[ -z "$lxcInterface" ]]; then 
+   echo "Did not find a valid network interface for use with LXC. 
Install LXC manually (https://linuxcontainers.org/lxc/getting-started/) and 
re-run."
+   exit 0
+   fi
+
+   #Print the valid interface found
+   echo "Using interface "$lxcInterface "($gatewayIP) for use with LXC"
+
+   # Create shared directory for sharing between base machine and 
LXC-instances
+   echo "Creating dir $SHARED_LXC_DIR to be shared between base machine 
and LXC-instances"
+   sudo mkdir -p $SHARED_LXC_DIR && sudo chmod 777 $SHARED_LXC_DIR
+
+   # Setting gateway IP address in conf/yarn-site.xml 
+   echo "Setting yarn.resourcemanager.hostname="$gatewayIP in 
$YARN_SITE_XML 
+   sed -i 
"/yarn.resourcemanager.hostname<\/name>/!b;n;c$gatewayIP" 
$YARN_SITE_XML
+
+   # Adding RM bind host in conf/yarn-site.xml
+   echo "Setting yarn.resourcemanager.bind-host=0.0.0.0" in $YARN_SITE_XML
+   if [[ ! -z $(grep "yarn.resourcemanager.bind-host" conf/yarn-site.xml) 
]]; then 
+
+   # Setting yarn.resourcemanager.bind-host to 0.0.0.0
+   sed -i 
"/yarn.resourcemanager.bind-host<\/name>/!b;n;c0.0.0.0" 
$YARN_SITE_XML
+   else 
+
+   # Appending RM bind host in conf/yarn-site.xml
+   sed -i 

[GitHub] [samza-hello-samza] vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza

2019-06-12 Thread GitBox
vjagadish1989 commented on a change in pull request #62: Adding support for lxc 
on yarn for Samza
URL: https://github.com/apache/samza-hello-samza/pull/62#discussion_r293126356
 
 

 ##
 File path: bin/setup-lxc
 ##
 @@ -0,0 +1,345 @@
+#!/bin/bash -e
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# This script will download, setup, start, and stop servers for Kafka, YARN, and ZooKeeper,
+# as well as downloading, building and locally publishing Samza
+
+
+COMMAND=$1
+ARG0=$2
+ARG1=$3
+
+SHARED_LXC_DIR=/lxc-shared
+POSSIBLE_LXC_INTERFACES=( virbr0 lxcbr0)
+YARN_SITE_XML=conf/yarn-site.xml
+NM_LIVENESS_MS=1 #value of the yarn.nm.liveness-monitor.expiry-interval-ms variable
+LXC_INSTANCE_TYPE="fedora"
+LXC_ROOTFS_DIR=/var/lib/lxc
+LXC_INSTANCE_START_NM_SCRIPT=startNodeManager
+
+RESOLV_CONF_FILE=/etc/resolv.conf
+
+# Helper function to test an IP address for validity:
+# Usage:
+#  valid_ip IP_ADDRESS
+#  if [[ $? -eq 0 ]]; then echo good; else echo bad; fi
+#   OR
+#  if valid_ip IP_ADDRESS; then echo good; else echo bad; fi
+#
+function valid_ip()
+{
 
 Review comment:
   Format functions consistently with the rest of the files, 
e.g. open parenthesis on the same line as the declaration and 2-space indents 
everywhere.
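
As an illustration of the requested style, the `valid_ip` function from the diff above might look like the sketch below. The logic is copied unchanged from the quoted script; only the brace placement and indentation differ. Whether this matches the other scripts in the repository exactly is an assumption based on the comment.

```shell
#!/bin/bash
# Same logic as the valid_ip function quoted in the diff; formatting changed only:
# opening brace on the declaration line, 2-space indents throughout.
function valid_ip() {
  local ip=$1
  local stat=1

  # Accept only four dot-separated groups of 1 to 3 digits.
  if [[ $ip =~ ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ ]]; then
    OIFS=$IFS
    IFS='.'
    ip=($ip)
    IFS=$OIFS
    # Each group must be at most 255.
    [[ ${ip[0]} -le 255 && ${ip[1]} -le 255 \
      && ${ip[2]} -le 255 && ${ip[3]} -le 255 ]]
    stat=$?
  fi
  return $stat
}
```

Because the script runs under `bash -e`, the function is best called inside a condition, as in `if valid_ip "$ip_address"; then ...`, which is how the quoted script already uses it.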




[GitHub] [samza-hello-samza] vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza

2019-06-12 Thread GitBox
vjagadish1989 commented on a change in pull request #62: Adding support for lxc 
on yarn for Samza
URL: https://github.com/apache/samza-hello-samza/pull/62#discussion_r293128829
 
 

 ##
 File path: bin/setup-lxc
 ##
 @@ -0,0 +1,345 @@
+#!/bin/bash -e
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# This script will download, setup, start, and stop servers for Kafka, YARN, and ZooKeeper,
+# as well as downloading, building and locally publishing Samza
+
+
+COMMAND=$1
+ARG0=$2
+ARG1=$3
+
+SHARED_LXC_DIR=/lxc-shared
+POSSIBLE_LXC_INTERFACES=( virbr0 lxcbr0)
+YARN_SITE_XML=conf/yarn-site.xml
+NM_LIVENESS_MS=1 #value of the yarn.nm.liveness-monitor.expiry-interval-ms variable
+LXC_INSTANCE_TYPE="fedora"
+LXC_ROOTFS_DIR=/var/lib/lxc
+LXC_INSTANCE_START_NM_SCRIPT=startNodeManager
+
+RESOLV_CONF_FILE=/etc/resolv.conf
+
+# Helper function to test an IP address for validity:
+# Usage:
+#  valid_ip IP_ADDRESS
+#  if [[ $? -eq 0 ]]; then echo good; else echo bad; fi
+#   OR
+#  if valid_ip IP_ADDRESS; then echo good; else echo bad; fi
+#
+function valid_ip()
+{
+local  ip=$1
+local  stat=1
+
+if [[ $ip =~ ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ ]]; then
+OIFS=$IFS
+IFS='.'
+ip=($ip)
+IFS=$OIFS
+[[ ${ip[0]} -le 255 && ${ip[1]} -le 255 \
+&& ${ip[2]} -le 255 && ${ip[3]} -le 255 ]]
+stat=$?
+fi
+return $stat
+}
+
+function check_OS() 
+{
+   #Check if OS is linux
+   if [[ "$OSTYPE" == "linux-gnu" ]]; then
+   echo "OS check passed."
+   else
+   echo "Only RHEL-Linux is currently supported for this setup. Exiting ..."
+   exit 0
+   fi
+}
+
+
+function lxc_setup() 
+{
+
+   #Install LXC (and its dependencies)
+   echo "Beginning installation. Installing lxc on your machine"
+   sudo yum -y install epel-release
+   sudo  yum -y install lxc lxc-templates libcap-devel libcgroup wget bridge-utils lxc-extra --skip-broken
+   echo "LXC installation complete."
+
+
+   lxcInterface=""
+   gatewayIP=""
+
+for interface in ${POSSIBLE_LXC_INTERFACES[@]}
+do
+echo "Checking if $interface is valid"
+ip_address=`ip addr show $interface | grep "inet\b" | awk '{print $2}' | cut -d/ -f1`
+
+if valid_ip $ip_address; then
+echo "Interface $interface is valid for using with LXC instances."
+lxcInterface=$interface
+gatewayIP=$ip_address
+break;
+else
+echo "Interface $interface does not appear to be valid."
+continue;
+fi
+done
+
+   if [[ -z "$lxcInterface" ]]; then 
+   echo "Did not find a valid network interface for use with LXC. Install LXC manually (https://linuxcontainers.org/lxc/getting-started/) and re-run."
+   exit 0
 
 Review comment:
   Exit with a non-zero code, since this is an unexpected result.
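
A minimal sketch of what this could look like is given below. `require_interface` is a hypothetical helper, not something in the script under review: it reports the missing interface on stderr and returns a non-zero status, so that the caller (or `bash -e`) can stop with a failure code instead of `exit 0`.

```shell
#!/bin/bash
# Sketch only: require_interface is a hypothetical helper, not part of bin/setup-lxc.
# Returns 1 (failure) when no interface name was found, 0 otherwise.
require_interface() {
  local lxc_interface="$1"
  if [[ -z "$lxc_interface" ]]; then
    # Errors go to stderr so they are not mistaken for normal output.
    echo "Did not find a valid network interface for use with LXC." >&2
    return 1
  fi
  echo "Using interface $lxc_interface for use with LXC"
}
```

In the setup function this might be called as `require_interface "$lxcInterface" || exit 1`, so that wrappers and CI jobs can detect that the setup did not complete.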




[GitHub] [samza-hello-samza] vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza

2019-06-12 Thread GitBox
vjagadish1989 commented on a change in pull request #62: Adding support for lxc 
on yarn for Samza
URL: https://github.com/apache/samza-hello-samza/pull/62#discussion_r293128724
 
 

 ##
 File path: bin/setup-lxc
 ##
 @@ -0,0 +1,345 @@
+#!/bin/bash -e
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# This script will download, setup, start, and stop servers for Kafka, YARN, and ZooKeeper,
+# as well as downloading, building and locally publishing Samza
+
+
+COMMAND=$1
+ARG0=$2
+ARG1=$3
+
+SHARED_LXC_DIR=/lxc-shared
+POSSIBLE_LXC_INTERFACES=( virbr0 lxcbr0)
+YARN_SITE_XML=conf/yarn-site.xml
+NM_LIVENESS_MS=1 #value of the yarn.nm.liveness-monitor.expiry-interval-ms variable
+LXC_INSTANCE_TYPE="fedora"
+LXC_ROOTFS_DIR=/var/lib/lxc
+LXC_INSTANCE_START_NM_SCRIPT=startNodeManager
+
+RESOLV_CONF_FILE=/etc/resolv.conf
+
+# Helper function to test an IP address for validity:
+# Usage:
+#  valid_ip IP_ADDRESS
+#  if [[ $? -eq 0 ]]; then echo good; else echo bad; fi
+#   OR
+#  if valid_ip IP_ADDRESS; then echo good; else echo bad; fi
+#
+function valid_ip()
+{
+local  ip=$1
+local  stat=1
+
+if [[ $ip =~ ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ ]]; then
+OIFS=$IFS
+IFS='.'
+ip=($ip)
+IFS=$OIFS
+[[ ${ip[0]} -le 255 && ${ip[1]} -le 255 \
+&& ${ip[2]} -le 255 && ${ip[3]} -le 255 ]]
+stat=$?
+fi
+return $stat
+}
+
+function check_OS() 
+{
+   #Check if OS is linux
+   if [[ "$OSTYPE" == "linux-gnu" ]]; then
+   echo "OS check passed."
+   else
+   echo "Only RHEL-Linux is currently supported for this setup. Exiting ..."
+   exit 0
+   fi
+}
+
+
+function lxc_setup() 
+{
+
+   #Install LXC (and its dependencies)
+   echo "Beginning installation. Installing lxc on your machine"
+   sudo yum -y install epel-release
+   sudo  yum -y install lxc lxc-templates libcap-devel libcgroup wget bridge-utils lxc-extra --skip-broken
+   echo "LXC installation complete."
+
+
+   lxcInterface=""
+   gatewayIP=""
+
+for interface in ${POSSIBLE_LXC_INTERFACES[@]}
+do
+echo "Checking if $interface is valid"
+ip_address=`ip addr show $interface | grep "inet\b" | awk '{print $2}' | cut -d/ -f1`
+
+if valid_ip $ip_address; then
+echo "Interface $interface is valid for using with LXC instances."
+lxcInterface=$interface
+gatewayIP=$ip_address
+break;
+else
+echo "Interface $interface does not appear to be valid."
+continue;
 
 Review comment:
   This loop moves on to the next interface anyway, so I don't think you need a continue here.
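
The point can be checked with a small sketch. Here is_usable is a hypothetical stand-in for valid_ip, and the name=address argument format is an assumption made only so the example runs without real network interfaces.

```shell
# Hypothetical stand-in for the PR's valid_ip check: accept any non-empty value.
is_usable() {
    [ -n "$1" ]
}

# Print the name of the first candidate whose address passes the check.
# Candidates are passed as name=address pairs purely for illustration.
pick_interface() {
    local candidate name address
    for candidate in "$@"; do
        name=${candidate%%=*}
        address=${candidate#*=}
        if is_usable "$address"; then
            echo "$name"
            return 0
        fi
        # No 'continue' needed: reaching the end of the loop body
        # already starts the next iteration.
        echo "Interface $name does not appear to be valid." >&2
    done
    return 1
}
```

Called as pick_interface virbr0= lxcbr0=10.0.3.1, the sketch reports virbr0 as invalid on stderr and then still evaluates lxcbr0, even though the else branch contains no continue.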




[GitHub] [samza-hello-samza] vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza

2019-06-12 Thread GitBox
vjagadish1989 commented on a change in pull request #62: Adding support for lxc 
on yarn for Samza
URL: https://github.com/apache/samza-hello-samza/pull/62#discussion_r293127785
 
 

 ##
 File path: bin/setup-lxc
 ##
 @@ -0,0 +1,345 @@
+#!/bin/bash -e
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# This script will download, setup, start, and stop servers for Kafka, YARN, and ZooKeeper,
+# as well as downloading, building and locally publishing Samza
+
+
+COMMAND=$1
+ARG0=$2
+ARG1=$3
+
+SHARED_LXC_DIR=/lxc-shared
+POSSIBLE_LXC_INTERFACES=( virbr0 lxcbr0)
+YARN_SITE_XML=conf/yarn-site.xml
+NM_LIVENESS_MS=1 #value of the yarn.nm.liveness-monitor.expiry-interval-ms variable
+LXC_INSTANCE_TYPE="fedora"
+LXC_ROOTFS_DIR=/var/lib/lxc
+LXC_INSTANCE_START_NM_SCRIPT=startNodeManager
+
+RESOLV_CONF_FILE=/etc/resolv.conf
+
+# Helper function to test an IP address for validity:
+# Usage:
+#  valid_ip IP_ADDRESS
+#  if [[ $? -eq 0 ]]; then echo good; else echo bad; fi
+#   OR
+#  if valid_ip IP_ADDRESS; then echo good; else echo bad; fi
+#
+function valid_ip()
+{
+local  ip=$1
+local  stat=1
+
+if [[ $ip =~ ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ ]]; then
+OIFS=$IFS
+IFS='.'
+ip=($ip)
+IFS=$OIFS
+[[ ${ip[0]} -le 255 && ${ip[1]} -le 255 \
+&& ${ip[2]} -le 255 && ${ip[3]} -le 255 ]]
+stat=$?
+fi
+return $stat
+}
+
+function check_OS() 
+{
+   #Check if OS is linux
+   if [[ "$OSTYPE" == "linux-gnu" ]]; then
+   echo "OS check passed."
+   else
+   echo "Only RHEL-Linux is currently supported for this setup. Exiting ..."
+   exit 0
+   fi
+}
+
+
+function lxc_setup() 
+{
+
+   #Install LXC (and its dependencies)
+   echo "Beginning installation. Installing lxc on your machine"
+   sudo yum -y install epel-release
+   sudo  yum -y install lxc lxc-templates libcap-devel libcgroup wget bridge-utils lxc-extra --skip-broken
+   echo "LXC installation complete."
 
 Review comment:
   Capitalize consistently: lxc vs. LXC.




[GitHub] [samza-hello-samza] vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza

2019-06-12 Thread GitBox
vjagadish1989 commented on a change in pull request #62: Adding support for lxc 
on yarn for Samza
URL: https://github.com/apache/samza-hello-samza/pull/62#discussion_r293127513
 
 

 ##
 File path: bin/setup-lxc
 ##
 @@ -0,0 +1,345 @@
+#!/bin/bash -e
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# This script will download, setup, start, and stop servers for Kafka, YARN, and ZooKeeper,
+# as well as downloading, building and locally publishing Samza
+
+
+COMMAND=$1
+ARG0=$2
+ARG1=$3
+
+SHARED_LXC_DIR=/lxc-shared
+POSSIBLE_LXC_INTERFACES=( virbr0 lxcbr0)
+YARN_SITE_XML=conf/yarn-site.xml
+NM_LIVENESS_MS=1 #value of the yarn.nm.liveness-monitor.expiry-interval-ms variable
+LXC_INSTANCE_TYPE="fedora"
+LXC_ROOTFS_DIR=/var/lib/lxc
+LXC_INSTANCE_START_NM_SCRIPT=startNodeManager
+
+RESOLV_CONF_FILE=/etc/resolv.conf
+
+# Helper function to test an IP address for validity:
+# Usage:
+#  valid_ip IP_ADDRESS
+#  if [[ $? -eq 0 ]]; then echo good; else echo bad; fi
+#   OR
+#  if valid_ip IP_ADDRESS; then echo good; else echo bad; fi
+#
+function valid_ip()
+{
+local  ip=$1
+local  stat=1
+
+if [[ $ip =~ ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ ]]; then
+OIFS=$IFS
+IFS='.'
+ip=($ip)
+IFS=$OIFS
+[[ ${ip[0]} -le 255 && ${ip[1]} -le 255 \
+&& ${ip[2]} -le 255 && ${ip[3]} -le 255 ]]
+stat=$?
+fi
+return $stat
+}
+
+function check_OS() 
+{
+   #Check if OS is linux
+   if [[ "$OSTYPE" == "linux-gnu" ]]; then
+   echo "OS check passed."
+   else
+   echo "Only RHEL-Linux is currently supported for this setup. Exiting ..."
+   exit 0
+   fi
+}
+
+
+function lxc_setup() 
 
 Review comment:
   Prefer function names that start with a verb, e.g. setup_lxc.




[GitHub] [samza-hello-samza] vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza

2019-06-12 Thread GitBox
vjagadish1989 commented on a change in pull request #62: Adding support for lxc 
on yarn for Samza
URL: https://github.com/apache/samza-hello-samza/pull/62#discussion_r293128114
 
 

 ##
 File path: bin/setup-lxc
 ##
 @@ -0,0 +1,345 @@
+#!/bin/bash -e
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# This script will download, setup, start, and stop servers for Kafka, YARN, and ZooKeeper,
+# as well as downloading, building and locally publishing Samza
+
+
+COMMAND=$1
+ARG0=$2
+ARG1=$3
+
+SHARED_LXC_DIR=/lxc-shared
+POSSIBLE_LXC_INTERFACES=( virbr0 lxcbr0)
+YARN_SITE_XML=conf/yarn-site.xml
+NM_LIVENESS_MS=1 #value of the yarn.nm.liveness-monitor.expiry-interval-ms variable
+LXC_INSTANCE_TYPE="fedora"
+LXC_ROOTFS_DIR=/var/lib/lxc
+LXC_INSTANCE_START_NM_SCRIPT=startNodeManager
+
+RESOLV_CONF_FILE=/etc/resolv.conf
+
+# Helper function to test an IP address for validity:
+# Usage:
+#  valid_ip IP_ADDRESS
+#  if [[ $? -eq 0 ]]; then echo good; else echo bad; fi
+#   OR
+#  if valid_ip IP_ADDRESS; then echo good; else echo bad; fi
+#
+function valid_ip()
+{
+local  ip=$1
+local  stat=1
+
+if [[ $ip =~ ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ ]]; then
+OIFS=$IFS
+IFS='.'
+ip=($ip)
+IFS=$OIFS
+[[ ${ip[0]} -le 255 && ${ip[1]} -le 255 \
+&& ${ip[2]} -le 255 && ${ip[3]} -le 255 ]]
+stat=$?
+fi
+return $stat
+}
+
+function check_OS() 
+{
+   #Check if OS is linux
+   if [[ "$OSTYPE" == "linux-gnu" ]]; then
+   echo "OS check passed."
+   else
+   echo "Only RHEL-Linux is currently supported for this setup. Exiting ..."
+   exit 0
+   fi
+}
+
+
+function lxc_setup() 
+{
+
+   #Install LXC (and its dependencies)
+   echo "Beginning installation. Installing lxc on your machine"
+   sudo yum -y install epel-release
+   sudo  yum -y install lxc lxc-templates libcap-devel libcgroup wget bridge-utils lxc-extra --skip-broken
+   echo "LXC installation complete."
+
+
+   lxcInterface=""
 
 Review comment:
   IIRC, the camel-case convention for variable names is Java-only. Use underscores here, e.g. lxc_interface.
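
A sketch of the suggested naming, assuming the common shell convention of lower-case snake_case for function-local variables. describe_choice is a hypothetical function used only to show the renamed variables; it is not part of bin/setup-lxc.

```shell
# Hypothetical function: lxc_interface and gateway_ip take the place of
# the camel-case lxcInterface and gatewayIP used in the PR.
describe_choice() {
    local lxc_interface="$1"
    local gateway_ip="$2"
    echo "Using interface $lxc_interface ($gateway_ip) for use with LXC"
}
```

Keeping both variables inside one quoted string also avoids the stray space that unquoted concatenation can introduce in the message.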




[GitHub] [samza-hello-samza] vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza

2019-06-12 Thread GitBox
vjagadish1989 commented on a change in pull request #62: Adding support for lxc 
on yarn for Samza
URL: https://github.com/apache/samza-hello-samza/pull/62#discussion_r293131208
 
 

 ##
 File path: bin/setup-lxc
 ##
 @@ -0,0 +1,345 @@
+#!/bin/bash -e
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# This script will download, setup, start, and stop servers for Kafka, YARN, and ZooKeeper,
+# as well as downloading, building and locally publishing Samza
+
+
+COMMAND=$1
+ARG0=$2
+ARG1=$3
+
+SHARED_LXC_DIR=/lxc-shared
+POSSIBLE_LXC_INTERFACES=( virbr0 lxcbr0)
+YARN_SITE_XML=conf/yarn-site.xml
+NM_LIVENESS_MS=1 #value of the yarn.nm.liveness-monitor.expiry-interval-ms variable
+LXC_INSTANCE_TYPE="fedora"
+LXC_ROOTFS_DIR=/var/lib/lxc
+LXC_INSTANCE_START_NM_SCRIPT=startNodeManager
+
+RESOLV_CONF_FILE=/etc/resolv.conf
+
+# Helper function to test an IP address for validity:
+# Usage:
+#  valid_ip IP_ADDRESS
+#  if [[ $? -eq 0 ]]; then echo good; else echo bad; fi
+#   OR
+#  if valid_ip IP_ADDRESS; then echo good; else echo bad; fi
+#
+function valid_ip()
+{
+local  ip=$1
+local  stat=1
+
+if [[ $ip =~ ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ ]]; then
+OIFS=$IFS
+IFS='.'
+ip=($ip)
+IFS=$OIFS
+[[ ${ip[0]} -le 255 && ${ip[1]} -le 255 \
+&& ${ip[2]} -le 255 && ${ip[3]} -le 255 ]]
+stat=$?
+fi
+return $stat
+}
+
+function check_OS() 
+{
+   #Check if OS is linux
+   if [[ "$OSTYPE" == "linux-gnu" ]]; then
+   echo "OS check passed."
+   else
+   echo "Only RHEL-Linux is currently supported for this setup. Exiting ..."
+   exit 0
+   fi
+}
+
+
+function lxc_setup() 
+{
+
+   #Install LXC (and its dependencies)
+   echo "Beginning installation. Installing lxc on your machine"
+   sudo yum -y install epel-release
+   sudo  yum -y install lxc lxc-templates libcap-devel libcgroup wget bridge-utils lxc-extra --skip-broken
+   echo "LXC installation complete."
+
+
+   lxcInterface=""
+   gatewayIP=""
+
+for interface in ${POSSIBLE_LXC_INTERFACES[@]}
+do
+echo "Checking if $interface is valid"
+ip_address=`ip addr show $interface | grep "inet\b" | awk '{print $2}' | cut -d/ -f1`
+
+if valid_ip $ip_address; then
+echo "Interface $interface is valid for using with LXC instances."
+lxcInterface=$interface
+gatewayIP=$ip_address
+break;
+else
+echo "Interface $interface does not appear to be valid."
+continue;
+fi
+done
+
+   if [[ -z "$lxcInterface" ]]; then 
+   echo "Did not find a valid network interface for use with LXC. Install LXC manually (https://linuxcontainers.org/lxc/getting-started/) and re-run."
+   exit 0
+   fi
+
+   #Print the valid interface found
+   echo "Using interface "$lxcInterface "($gatewayIP) for use with LXC"
+
+   # Create shared directory for sharing between base machine and LXC-instances
+   echo "Creating dir $SHARED_LXC_DIR to be shared between base machine and LXC-instances"
+   sudo mkdir -p $SHARED_LXC_DIR && sudo chmod 777 $SHARED_LXC_DIR
+
+   # Setting gateway IP address in conf/yarn-site.xml 
+   echo "Setting yarn.resourcemanager.hostname="$gatewayIP in $YARN_SITE_XML
+   sed -i "/yarn.resourcemanager.hostname<\/name>/!b;n;c$gatewayIP" $YARN_SITE_XML
+
+   # Adding RM bind host in conf/yarn-site.xml
+   echo "Setting yarn.resourcemanager.bind-host=0.0.0.0" in $YARN_SITE_XML
+   if [[ ! -z $(grep "yarn.resourcemanager.bind-host" conf/yarn-site.xml) ]]; then
+
+   # Setting yarn.resourcemanager.bind-host to 0.0.0.0
+   sed -i "/yarn.resourcemanager.bind-host<\/name>/!b;n;c0.0.0.0" $YARN_SITE_XML
+   else 
+
+   # Appending RM bind host in conf/yarn-site.xml
+   sed -i 

[GitHub] [samza-hello-samza] vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza

2019-06-12 Thread GitBox
vjagadish1989 commented on a change in pull request #62: Adding support for lxc 
on yarn for Samza
URL: https://github.com/apache/samza-hello-samza/pull/62#discussion_r293127099
 
 

 ##
 File path: bin/setup-lxc
 ##
 @@ -0,0 +1,345 @@
+#!/bin/bash -e
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# This script will download, setup, start, and stop servers for Kafka, YARN, and ZooKeeper,
+# as well as downloading, building and locally publishing Samza
+
+
+COMMAND=$1
+ARG0=$2
+ARG1=$3
+
+SHARED_LXC_DIR=/lxc-shared
+POSSIBLE_LXC_INTERFACES=( virbr0 lxcbr0)
+YARN_SITE_XML=conf/yarn-site.xml
+NM_LIVENESS_MS=1 #value of the yarn.nm.liveness-monitor.expiry-interval-ms variable
+LXC_INSTANCE_TYPE="fedora"
+LXC_ROOTFS_DIR=/var/lib/lxc
+LXC_INSTANCE_START_NM_SCRIPT=startNodeManager
+
+RESOLV_CONF_FILE=/etc/resolv.conf
+
+# Helper function to test an IP address for validity:
+# Usage:
+#  valid_ip IP_ADDRESS
+#  if [[ $? -eq 0 ]]; then echo good; else echo bad; fi
+#   OR
+#  if valid_ip IP_ADDRESS; then echo good; else echo bad; fi
+#
+function valid_ip()
+{
+local  ip=$1
+local  stat=1
+
+if [[ $ip =~ ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ ]]; then
+OIFS=$IFS
+IFS='.'
+ip=($ip)
+IFS=$OIFS
+[[ ${ip[0]} -le 255 && ${ip[1]} -le 255 \
+&& ${ip[2]} -le 255 && ${ip[3]} -le 255 ]]
+stat=$?
+fi
+return $stat
+}
+
+function check_OS() 
+{
+   #Check if OS is linux
+   if [[ "$OSTYPE" == "linux-gnu" ]]; then
+   echo "OS check passed."
+   else
+   echo "Only RHEL-Linux is currently supported for this setup. Exiting ..."
+   exit 0
 
 Review comment:
   Should you exit with a non-zero code?




[GitHub] [samza-hello-samza] vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza

2019-06-12 Thread GitBox
vjagadish1989 commented on a change in pull request #62: Adding support for lxc 
on yarn for Samza
URL: https://github.com/apache/samza-hello-samza/pull/62#discussion_r293130994
 
 

 ##
 File path: bin/setup-lxc
 ##
 @@ -0,0 +1,345 @@
+#!/bin/bash -e
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# This script will download, setup, start, and stop servers for Kafka, YARN, and ZooKeeper,
+# as well as downloading, building and locally publishing Samza
+
+
+COMMAND=$1
+ARG0=$2
+ARG1=$3
+
+SHARED_LXC_DIR=/lxc-shared
+POSSIBLE_LXC_INTERFACES=( virbr0 lxcbr0)
+YARN_SITE_XML=conf/yarn-site.xml
+NM_LIVENESS_MS=1 #value of the yarn.nm.liveness-monitor.expiry-interval-ms variable
+LXC_INSTANCE_TYPE="fedora"
+LXC_ROOTFS_DIR=/var/lib/lxc
+LXC_INSTANCE_START_NM_SCRIPT=startNodeManager
+
+RESOLV_CONF_FILE=/etc/resolv.conf
+
+# Helper function to test an IP address for validity:
+# Usage:
+#  valid_ip IP_ADDRESS
+#  if [[ $? -eq 0 ]]; then echo good; else echo bad; fi
+#   OR
+#  if valid_ip IP_ADDRESS; then echo good; else echo bad; fi
+#
+function valid_ip()
+{
+local  ip=$1
+local  stat=1
+
+if [[ $ip =~ ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ ]]; then
+OIFS=$IFS
+IFS='.'
+ip=($ip)
+IFS=$OIFS
+[[ ${ip[0]} -le 255 && ${ip[1]} -le 255 \
+&& ${ip[2]} -le 255 && ${ip[3]} -le 255 ]]
+stat=$?
+fi
+return $stat
+}
+
+function check_OS() 
+{
+   #Check if OS is linux
+   if [[ "$OSTYPE" == "linux-gnu" ]]; then
+   echo "OS check passed."
+   else
+   echo "Only RHEL-Linux is currently supported for this setup. Exiting ..."
+   exit 0
+   fi
+}
+
+
+function lxc_setup() 
+{
+
+   #Install LXC (and its dependencies)
+   echo "Beginning installation. Installing lxc on your machine"
+   sudo yum -y install epel-release
+   sudo  yum -y install lxc lxc-templates libcap-devel libcgroup wget bridge-utils lxc-extra --skip-broken
+   echo "LXC installation complete."
+
+
+   lxcInterface=""
+   gatewayIP=""
+
+for interface in ${POSSIBLE_LXC_INTERFACES[@]}
+do
+echo "Checking if $interface is valid"
+ip_address=`ip addr show $interface | grep "inet\b" | awk '{print $2}' | cut -d/ -f1`
+
+if valid_ip $ip_address; then
+echo "Interface $interface is valid for using with LXC instances."
+lxcInterface=$interface
+gatewayIP=$ip_address
+break;
+else
+echo "Interface $interface does not appear to be valid."
+continue;
+fi
+done
+
+   if [[ -z "$lxcInterface" ]]; then 
+   echo "Did not find a valid network interface for use with LXC. Install LXC manually (https://linuxcontainers.org/lxc/getting-started/) and re-run."
+   exit 0
+   fi
+
+   #Print the valid interface found
+   echo "Using interface "$lxcInterface "($gatewayIP) for use with LXC"
+
+   # Create shared directory for sharing between base machine and LXC-instances
+   echo "Creating dir $SHARED_LXC_DIR to be shared between base machine and LXC-instances"
+   sudo mkdir -p $SHARED_LXC_DIR && sudo chmod 777 $SHARED_LXC_DIR
+
+   # Setting gateway IP address in conf/yarn-site.xml 
+   echo "Setting yarn.resourcemanager.hostname="$gatewayIP in $YARN_SITE_XML
+   sed -i "/yarn.resourcemanager.hostname<\/name>/!b;n;c$gatewayIP" $YARN_SITE_XML
+
+   # Adding RM bind host in conf/yarn-site.xml
+   echo "Setting yarn.resourcemanager.bind-host=0.0.0.0" in $YARN_SITE_XML
+   if [[ ! -z $(grep "yarn.resourcemanager.bind-host" conf/yarn-site.xml) ]]; then
+
+   # Setting yarn.resourcemanager.bind-host to 0.0.0.0
+   sed -i "/yarn.resourcemanager.bind-host<\/name>/!b;n;c0.0.0.0" $YARN_SITE_XML
+   else 
+
+   # Appending RM bind host in conf/yarn-site.xml
+   sed -i 

[GitHub] [samza-hello-samza] vjagadish1989 commented on a change in pull request #62: Adding support for lxc on yarn for Samza

2019-06-12 Thread GitBox
vjagadish1989 commented on a change in pull request #62: Adding support for lxc 
on yarn for Samza
URL: https://github.com/apache/samza-hello-samza/pull/62#discussion_r293127151
 
 

 ##
 File path: bin/setup-lxc
 ##
 @@ -0,0 +1,345 @@
+#!/bin/bash -e
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# This script will download, setup, start, and stop servers for Kafka, YARN, and ZooKeeper,
+# as well as downloading, building and locally publishing Samza
+
+
+COMMAND=$1
+ARG0=$2
+ARG1=$3
+
+SHARED_LXC_DIR=/lxc-shared
+POSSIBLE_LXC_INTERFACES=( virbr0 lxcbr0)
+YARN_SITE_XML=conf/yarn-site.xml
+NM_LIVENESS_MS=1 #value of the yarn.nm.liveness-monitor.expiry-interval-ms variable
+LXC_INSTANCE_TYPE="fedora"
+LXC_ROOTFS_DIR=/var/lib/lxc
+LXC_INSTANCE_START_NM_SCRIPT=startNodeManager
+
+RESOLV_CONF_FILE=/etc/resolv.conf
+
+# Helper function to test an IP address for validity:
+# Usage:
+#  valid_ip IP_ADDRESS
+#  if [[ $? -eq 0 ]]; then echo good; else echo bad; fi
+#   OR
+#  if valid_ip IP_ADDRESS; then echo good; else echo bad; fi
+#
+function valid_ip()
 
 Review comment:
   Move this helper to the end of the file.
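
Moving the helper below its callers is safe in a shell script as long as nothing calls it before the whole file has been read, because function names are resolved when a call runs, not when the calling function is defined. A minimal sketch, in which main and is_dotted_quad are hypothetical names and the helper checks only the dotted-quad shape, not the 0-255 range that valid_ip also enforces:

```shell
# Main flow first, so a reader sees it before the helpers.
main() {
    local address="${1:-}"
    if is_dotted_quad "$address"; then
        echo "valid"
    else
        echo "invalid"
    fi
}

# Helper defined after its caller.
is_dotted_quad() {
    printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}

# The entry-point call comes last, after every definition above.
main "$@"
```

If the call to main were placed above the definition of is_dotted_quad, the script would fail with a "command not found" error, so this layout depends on keeping the dispatch at the bottom of the file.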




[GitHub] [samza-hello-samza] sborya merged pull request #66: update to samza 1.2.0

2019-06-07 Thread GitBox
sborya merged pull request #66: update to samza 1.2.0
URL: https://github.com/apache/samza-hello-samza/pull/66
 
 
   




[GitHub] [samza-hello-samza] sborya opened a new pull request #66: update to samza 1.2.0

2019-06-07 Thread GitBox
sborya opened a new pull request #66: update to samza 1.2.0
URL: https://github.com/apache/samza-hello-samza/pull/66
 
 
   




[GitHub] [samza-hello-samza] weisong44 merged pull request #65: SAMZA-2223: Update Couchbase example to use NoOpTableReadFunction

2019-05-28 Thread GitBox
weisong44 merged pull request #65: SAMZA-2223: Update Couchbase example to use 
NoOpTableReadFunction
URL: https://github.com/apache/samza-hello-samza/pull/65
 
 
   




[GitHub] [samza-hello-samza] weisong44 opened a new pull request #65: SAMZA-2223: Update Couchbase example to use NoOpTableReadFunction

2019-05-28 Thread GitBox
weisong44 opened a new pull request #65: SAMZA-2223: Update Couchbase example 
to use NoOpTableReadFunction
URL: https://github.com/apache/samza-hello-samza/pull/65
 
 
   




[GitHub] [samza-hello-samza] weisong44 merged pull request #64: SAMZA-2218: Add a Couchbase example to samza-hello-samza

2019-05-28 Thread GitBox
weisong44 merged pull request #64: SAMZA-2218: Add a Couchbase example to 
samza-hello-samza
URL: https://github.com/apache/samza-hello-samza/pull/64
 
 
   




[GitHub] [samza-hello-samza] weisong44 commented on a change in pull request #64: SAMZA-2218: Add a Couchbase example to samza-hello-samza

2019-05-24 Thread GitBox
weisong44 commented on a change in pull request #64: SAMZA-2218: Add a 
Couchbase example to samza-hello-samza
URL: https://github.com/apache/samza-hello-samza/pull/64#discussion_r287457502
 
 

 ##
 File path: src/main/java/samza/examples/cookbook/CouchbaseTableExample.java
 ##
 @@ -0,0 +1,259 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package samza.examples.cookbook;
+
+import com.couchbase.client.java.document.json.JsonObject;
+import com.google.common.base.Preconditions;
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import java.text.SimpleDateFormat;
+import java.time.Duration;
+import java.util.Arrays;
+import java.util.Date;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.TimeUnit;
+import org.apache.samza.SamzaException;
+import org.apache.samza.application.StreamApplication;
+import org.apache.samza.application.descriptors.StreamApplicationDescriptor;
+import org.apache.samza.context.Context;
+import org.apache.samza.operators.MessageStream;
+import org.apache.samza.operators.OutputStream;
+import org.apache.samza.operators.functions.MapFunction;
+import org.apache.samza.serializers.StringSerde;
+import org.apache.samza.system.kafka.descriptors.KafkaInputDescriptor;
+import org.apache.samza.system.kafka.descriptors.KafkaOutputDescriptor;
+import org.apache.samza.system.kafka.descriptors.KafkaSystemDescriptor;
+import org.apache.samza.table.descriptors.RemoteTableDescriptor;
+import org.apache.samza.table.remote.BaseTableFunction;
+import org.apache.samza.table.remote.RemoteTable;
+import org.apache.samza.table.remote.TableReadFunction;
+import org.apache.samza.table.remote.couchbase.CouchbaseTableWriteFunction;
+import org.apache.samza.table.retry.TableRetryPolicy;
+
+
+/**
+ * This is a simple word count example using a remote store.
+ *
+ * In this example, we use Couchbase to demonstrate how to invoke API's on a remote store other than get, put or delete
+ * as defined in {@link org.apache.samza.table.remote.AsyncRemoteTable}. Input messages are collected from user through
+ * a Kafka console producer, and tokenized using space. For each word, we increment a counter for this word
+ * as well as a counter for all words on Couchbase. We also output the current value of both counters to Kafka console
+ * consumer.
+ *
+ * A rate limit of 4 requests/second to Couchbase is set of the entire job, internally Samza uses an embedded
+ * rate limiter, which evenly distributes the total rate limit among tasks. As we invoke 2 calls on Couchbase
+ * for each word, you should see roughly 2 messages per second in the Kafka console consumer
+ * window.
+ *
+ * A retry policy with 1 second fixed backoff time and max 3 retries is attached to the remote table.
+ *
+ *  Concepts covered: remote table, rate limiter, retry, arbitrary operation on remote store.
+ *
+ * To run the below example:
+ *
+ * 
+ *   
+ * Create a Couchbase instance using docker; Log into the admin UI at http://localhost:8091 (Administrator/password) 
+ * create a bucket called "my-bucket" 
+ * Under Security tab, create a user with the same name, set 123456 as the password, and give it "Data Reader"
+ * and "Data Writer" privilege for this bucket. 
+ * More information can be found at https://docs.couchbase.com/server/current/getting-started/do-a-quick-install.html
+ *   
+ *   
+ * Create Kafka topics "word-input" and "count-output" 
+ * ./deploy/kafka/bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic word-input --partitions 2 --replication-factor 1
+ * ./deploy/kafka/bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic count-output --partitions 2 --replication-factor 1
+ *   
+ *   
+ * Run the application using the run-app.sh script 
+ * ./deploy/samza/bin/run-app.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file://$PWD/deploy/samza/config/couchbase-table-example.properties
+ *   
+ *   
+ * Consume messages from the output topic 
+ * ./deploy/kafka/bin/kafka-console-consumer.sh 
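
The rate-limit and retry figures quoted in the Javadoc above can be checked with a little arithmetic. What follows is a minimal sketch and is not code from the pull request: the class and method names are hypothetical, and the task count of 2 is an assumption based on the two partitions of the "word-input" topic (one task per partition).

```java
import java.time.Duration;

/** Back-of-the-envelope numbers for the example's rate limit and retry policy (hypothetical helper). */
class CouchbaseExampleMath {

    /** Share of the job-wide rate limit each task receives when it is split evenly. */
    static double perTaskRate(double jobRatePerSecond, int taskCount) {
        return jobRatePerSecond / taskCount;
    }

    /** Messages per second across the job when every message costs a fixed number of store calls. */
    static double messagesPerSecond(double jobRatePerSecond, int callsPerMessage) {
        return jobRatePerSecond / callsPerMessage;
    }

    /** Upper bound on time spent waiting between attempts with a fixed backoff (call time excluded). */
    static Duration maxBackoffWait(Duration fixedBackoff, int maxRetries) {
        return fixedBackoff.multipliedBy(maxRetries);
    }

    public static void main(String[] args) {
        // 4 requests/second split over an assumed 2 tasks.
        System.out.println(perTaskRate(4.0, 2));                        // prints 2.0
        // 2 Couchbase calls per word at 4 requests/second for the whole job.
        System.out.println(messagesPerSecond(4.0, 2));                  // prints 2.0
        // 1 second fixed backoff, at most 3 retries.
        System.out.println(maxBackoffWait(Duration.ofSeconds(1), 3));   // prints PT3S
    }
}
```

Under these assumptions each task may issue 2 requests per second, which is one word per task per second, matching the "roughly 2 messages per second" stated in the Javadoc.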

[GitHub] [samza-hello-samza] dengpanyin commented on a change in pull request #64: SAMZA-2218: Add a Couchbase example to samza-hello-samza

2019-05-24 Thread GitBox
dengpanyin commented on a change in pull request #64: SAMZA-2218: Add a 
Couchbase example to samza-hello-samza
URL: https://github.com/apache/samza-hello-samza/pull/64#discussion_r287431784
 
 

 ##
 File path: src/main/java/samza/examples/cookbook/CouchbaseTableExample.java
 ##
 @@ -0,0 +1,259 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package samza.examples.cookbook;
+
+import com.couchbase.client.java.document.json.JsonObject;
+import com.google.common.base.Preconditions;
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import java.text.SimpleDateFormat;
+import java.time.Duration;
+import java.util.Arrays;
+import java.util.Date;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.TimeUnit;
+import org.apache.samza.SamzaException;
+import org.apache.samza.application.StreamApplication;
+import org.apache.samza.application.descriptors.StreamApplicationDescriptor;
+import org.apache.samza.context.Context;
+import org.apache.samza.operators.MessageStream;
+import org.apache.samza.operators.OutputStream;
+import org.apache.samza.operators.functions.MapFunction;
+import org.apache.samza.serializers.StringSerde;
+import org.apache.samza.system.kafka.descriptors.KafkaInputDescriptor;
+import org.apache.samza.system.kafka.descriptors.KafkaOutputDescriptor;
+import org.apache.samza.system.kafka.descriptors.KafkaSystemDescriptor;
+import org.apache.samza.table.descriptors.RemoteTableDescriptor;
+import org.apache.samza.table.remote.BaseTableFunction;
+import org.apache.samza.table.remote.RemoteTable;
+import org.apache.samza.table.remote.TableReadFunction;
+import org.apache.samza.table.remote.couchbase.CouchbaseTableWriteFunction;
+import org.apache.samza.table.retry.TableRetryPolicy;
+
+
+/**
+ * This is a simple word count example using a remote store.
+ *
+ * In this example, we use Couchbase to demonstrate how to invoke API's on a remote store other than get, put or delete
+ * as defined in {@link org.apache.samza.table.remote.AsyncRemoteTable}. Input messages are collected from user through
+ * a Kafka console producer, and tokenized using space. For each word, we increment a counter for this word
+ * as well as a counter for all words on Couchbase. We also output the current value of both counters to Kafka console
+ * consumer.
+ *
+ * A rate limit of 4 requests/second to Couchbase is set of the entire job, internally Samza uses an embedded
+ * rate limiter, which evenly distributes the total rate limit among tasks. As we invoke 2 calls on Couchbase
+ * for each word, you should see roughly 2 messages per second in the Kafka console consumer
+ * window.
+ *
+ * A retry policy with 1 second fixed backoff time and max 3 retries is attached to the remote table.
+ *
+ *  Concepts covered: remote table, rate limiter, retry, arbitrary operation on remote store.
+ *
+ * To run the below example:
+ *
+ * 
+ *   
+ * Create a Couchbase instance using docker; Log into the admin UI at http://localhost:8091 (Administrator/password) 
+ * create a bucket called "my-bucket" 
+ * Under Security tab, create a user with the same name, set 123456 as the password, and give it "Data Reader"
+ * and "Data Writer" privilege for this bucket. 
+ * More information can be found at https://docs.couchbase.com/server/current/getting-started/do-a-quick-install.html
+ *   
+ *   
+ * Create Kafka topics "word-input" and "count-output" 
+ * ./deploy/kafka/bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic word-input --partitions 2 --replication-factor 1
+ * ./deploy/kafka/bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic count-output --partitions 2 --replication-factor 1
+ *   
+ *   
+ * Run the application using the run-app.sh script 
+ * ./deploy/samza/bin/run-app.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file://$PWD/deploy/samza/config/couchbase-table-example.properties
+ *   
+ *   
+ * Consume messages from the output topic 
+ * ./deploy/kafka/bin/kafka-console-consumer.sh 

[GitHub] [samza-hello-samza] weisong44 opened a new pull request #64: SAMZA-2218: Add a Couchbase example to samza-hello-samza

2019-05-23 Thread GitBox
weisong44 opened a new pull request #64: SAMZA-2218: Add a Couchbase example to 
samza-hello-samza
URL: https://github.com/apache/samza-hello-samza/pull/64
 
 
   As per subject


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [samza-hello-samza] weisong44 merged pull request #63: Fixed build failure due to changes in Samza project

2019-05-22 Thread GitBox
weisong44 merged pull request #63: Fixed build failure due to changes in Samza 
project
URL: https://github.com/apache/samza-hello-samza/pull/63
 
 
   




[GitHub] [samza-hello-samza] weisong44 opened a new pull request #63: Fixed build failure due to changes in Samza project

2019-05-22 Thread GitBox
weisong44 opened a new pull request #63: Fixed build failure due to changes in 
Samza project
URL: https://github.com/apache/samza-hello-samza/pull/63
 
 
   



