On Mon, Oct 17, 2016 at 11:36 AM sblackmon <sblack...@apache.org> wrote:
> On October 11, 2016 at 11:01:18 AM, Matt Franklin
> (m.ben.frank...@gmail.com) wrote:
>
> On Mon, Oct 10, 2016 at 11:30 AM sblackmon <sblack...@apache.org> wrote:
>
> > Some other projects are currently looking at publishing docker
> > containers that people can easily extend. I am totally in favor of
> > this approach.
> >
> > Docker distribution would open up a lot of cool options for this
> > project.
> >
> > Which projects are farthest along this road?
>
> https://hub.docker.com/r/apache/
>
> I had been thinking more along the lines of publishing a distribution
> for each provider, processor, and persister module containing a minimal
> uber-jar. Going this route would probably warrant a dedicated
> organization for streams. OTOH, if we get to the point of having a
> binary distribution containing all of the classes in streams-project,
> that could be published to a top-level /apache repository and perform
> all of the same work (probably with a much larger docker image).

Tomcat (and I think a few others) have their own organization on Docker
Hub, so it is definitely a possibility.

> > I think even publishing this as a Docker file example on the website
> > would be a good start.
> >
> > These PRs use a maven docker plugin during verify phase.
> > https://github.com/apache/incubator-streams-examples/pull/14
> > https://github.com/apache/incubator-streams/pull/288
> >
> > The same plugin can build, tag, and deploy images with goals
> > docker:build and docker:push.
>
> Per policy, the only thing that should make it to repositories like
> Docker hub and Maven Central should be released convenience binaries.
>
> I think the next step is to figure out what would need to happen to
> build, certify, and publish a convenience binary and docker image for
> (initially) just one individual provider module in an upcoming release.
> The dependency tree for a single provider will be more tractable than
> for the whole project and there’s a clear user benefit - a greatly
> simplified project tutorial.

I would submit an Infra ticket.

> > Once these merge I’ll take another pass through the examples
> > documentation and for each describe a few alternative processes
> > (STREAMS-428)
> >
> > 1) Build from source, run stream from *nix shell with dist uber-jar.
> > 2) Run stream with sbt interactive shell using artifacts from maven
> >    central
> > 3) Run stream with docker using artifacts from docker hub
> >
> > On October 10, 2016 at 8:09:45 AM, Matt Franklin
> > (m.ben.frank...@gmail.com) wrote:
> >
> > On Thu, Oct 6, 2016 at 2:56 PM sblackmon <sblack...@apache.org> wrote:
> >
> > > > > TL;DR I’ve found a way to dramatically reduce barriers to using
> > > > > streams as a beginner.
> > > > >
> > > > > Using the streams 0.3 release, it’s quite a headache for a
> > > > > novice to use streams. We have a tutorial on the website, but
> > > > > it’s quite a journey. You have to check out all three repos and
> > > > > install them each in order before you get a jar file you could
> > > > > use to get data, then you can run a few pre-canned streams, and
> > > > > those are intermediate not beginner level.
> > > > >
> > > > > In an ideal world, anyone would be able to yum or apt-get (or
> > > > > docker pull) individual providers or processors and run them on
> > > > > their own without building from source or composing them into
> > > > > multi-step streams.
> > > > >
> > > > > We'd have to increase our build and compliance complexity
> > > > > significantly to publish official binaries. So what can we do to
> > > > > drop the learning curve precipitously without doing that?
> > > >
> > > > Some other projects are currently looking at publishing docker
> > > > containers that people can easily extend. I am totally in favor
> > > > of this approach.
> > > >
> > > > > Providers are really simple to run. The hard part is getting all
> > > > > of the right classes and configuration properties into a JVM.
> > > > > Inspired by how zeppelin’s %dep interpreter reduces the friction
> > > > > in composing and running a scala notebook, I wanted to find a
> > > > > way to get the same ability from a linux shell.
> > > > >
> > > > > The commands below go from just a java installation to flat
> > > > > files of twitter data in just a few minutes.
> > > > >
> > > > > I think until we have binary distributions, this is how our
> > > > > tutorials should tell the world to get started with streams.
> > > > >
> > > > > Thoughts?
> > > >
> > > > I think even publishing this as a Docker file example on the
> > > > website would be a good start.
> > > > >
> > > > > -----
> > > > >
> > > > > # install sbtx
> > > > >
> > > > > curl -s https://raw.githubusercontent.com/paulp/sbt-extras/master/sbt > /usr/bin/sbtx && chmod 0755 /usr/bin/sbtx
> > > > >
> > > > > # create a workspace
> > > > >
> > > > > mkdir twitter-test; cd twitter-test;
> > > > >
> > > > > # supply a config file with credentials
> > > > >
> > > > > cat > application.conf << EOF
> > > > > twitter {
> > > > >   oauth {
> > > > >     consumerKey = ""
> > > > >     consumerSecret = ""
> > > > >     accessToken = ""
> > > > >     accessTokenSecret = ""
> > > > >   }
> > > > >   retrySleepMs = 5000
> > > > >   retryMax = 250
> > > > >   info = [
> > > > >     18055613
> > > > >   ]
> > > > > }
> > > > > EOF
> > > > >
> > > > > sbtx -210 -sbt-create
> > > > >
> > > > > set resolvers += "Local Maven Repository" at "file://"+Path.userHome.absolutePath+"/.m2/repository"
> > > > >
> > > > > set libraryDependencies += "org.apache.streams" % "streams-provider-twitter" % "0.4-incubating-SNAPSHOT"
> > > > >
> > > > > set fork := true
> > > > >
> > > > > run-main org.apache.streams.twitter.provider.TwitterUserInformationProvider application.conf users.txt
> > > > >
> > > > > run-main org.apache.streams.twitter.provider.TwitterTimelineProvider application.conf statuses.txt
> > > > >
> > > > > set javaOptions += "-Dtwitter.endpoint=friends"
> > > > >
> > > > > run-main org.apache.streams.twitter.provider.TwitterFollowingProvider application.conf friends.txt
> > > > >
> > > > > set javaOptions += "-Dtwitter.endpoint=followers"
> > > > >
> > > > > exit
> > > > >
> > > > > ls -l
> > > > >
> > > > > Steves-MacBook-Pro-3:twitter sblackmon$ ls -l
> > > > > -rw-r--r--@ 1 sblackmon staff     356 Oct 6 11:54 application.conf
> > > > > -rw-r--r--  1 sblackmon staff  293780 Oct 6 13:42 followers.txt
> > > > > -rw-r--r--  1 sblackmon staff    6260 Oct 6 13:43 friends.txt
> > > > > drwxr-xr-x  3 sblackmon staff     102 Oct 6 10:17 project
> > > > > -rw-r--r--  1 sblackmon staff 3339460 Oct 6 13:43 statuses.txt
> > > > > drwxr-xr-x  6 sblackmon staff     204 Oct 6 10:19 target
> > > > > -rw-r--r--  1 sblackmon staff    3321 Oct 6 13:43 users.txt
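
For anyone repeating the session quoted above, the three "set" commands
could probably be kept in a build.sbt so they do not have to be retyped
at the sbt prompt each time. What follows is only a sketch under that
assumption, not something proposed in the thread: it reuses the
"twitter-test" workspace name and the exact setting values from the
quoted commands, writes the file, and stops there. It does not start
sbt, it does not capture the Scala version selected by "sbtx -210", and
the SNAPSHOT artifact still has to be present in the local Maven
repository for the resolver to find it.

```shell
#!/bin/sh
# Sketch: persist the interactive sbt settings from the thread in build.sbt.
# The workspace name matches the thread; build.sbt is sbt's standard
# build-definition file, but using it here is an assumption of this sketch.
workspace=twitter-test
mkdir -p "$workspace"

# The heredoc delimiter is quoted so the shell copies the settings verbatim.
# These are the same three settings entered with `set` in the quoted session.
cat > "$workspace/build.sbt" << 'EOF'
resolvers += "Local Maven Repository" at "file://" + Path.userHome.absolutePath + "/.m2/repository"

libraryDependencies += "org.apache.streams" % "streams-provider-twitter" % "0.4-incubating-SNAPSHOT"

fork := true
EOF

echo "wrote $workspace/build.sbt"
```

With that file in place, the per-run javaOptions changes and the
run-main commands would still be entered at the sbt prompt as in the
original message.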