TL;DR: I’ve found a way to dramatically lower the barrier for beginners to start using streams.
With the streams 0.3 release, getting started is quite a headache for a novice. We have a tutorial on the website, but it’s quite a journey: you have to check out all three repos and install each of them in order before you get a jar file you can use to get data. Only then can you run a few pre-canned streams, and those are intermediate level, not beginner level.

In an ideal world, anyone would be able to yum or apt-get (or docker pull) individual providers or processors and run them on their own, without building from source or composing them into multi-step streams. But we’d have to increase our build and compliance complexity significantly to publish official binaries. So what can we do to flatten the learning curve without doing that?

Providers are really simple to run. The hard part is getting all of the right classes and configuration properties into a JVM. Inspired by how zeppelin’s %dep interpreter reduces the friction in composing and running a scala notebook, I wanted to find a way to get the same ability from a linux shell.

The commands below go from just a java installation to flat files of twitter data in just a few minutes. I think that until we have binary distributions, this is how our tutorials should tell the world to get started with streams. Thoughts?
-----

# install sbtx
curl -s https://raw.githubusercontent.com/paulp/sbt-extras/master/sbt > /usr/bin/sbtx && chmod 0755 /usr/bin/sbtx

# create a workspace
mkdir twitter-test; cd twitter-test;

# supply a config file with credentials
cat > application.conf << EOF
twitter {
  oauth {
    consumerKey = ""
    consumerSecret = ""
    accessToken = ""
    accessTokenSecret = ""
  }
  retrySleepMs = 5000
  retryMax = 250
  info = [ 18055613 ]
}
EOF

sbtx -210 -sbt-create
set resolvers += "Local Maven Repository" at "file://"+Path.userHome.absolutePath+"/.m2/repository"
set libraryDependencies += "org.apache.streams" % "streams-provider-twitter" % "0.4-incubating-SNAPSHOT"
set fork := true
run-main org.apache.streams.twitter.provider.TwitterUserInformationProvider application.conf users.txt
run-main org.apache.streams.twitter.provider.TwitterTimelineProvider application.conf statuses.txt
set javaOptions += "-Dtwitter.endpoint=friends"
run-main org.apache.streams.twitter.provider.TwitterFollowingProvider application.conf friends.txt
set javaOptions += "-Dtwitter.endpoint=followers"
exit

ls -l

Steves-MacBook-Pro-3:twitter sblackmon$ ls -l
-rw-r--r--@ 1 sblackmon  staff      356 Oct  6 11:54 application.conf
-rw-r--r--  1 sblackmon  staff   293780 Oct  6 13:42 followers.txt
-rw-r--r--  1 sblackmon  staff     6260 Oct  6 13:43 friends.txt
drwxr-xr-x  3 sblackmon  staff      102 Oct  6 10:17 project
-rw-r--r--  1 sblackmon  staff  3339460 Oct  6 13:43 statuses.txt
drwxr-xr-x  6 sblackmon  staff      204 Oct  6 10:19 target
-rw-r--r--  1 sblackmon  staff     3321 Oct  6 13:43 users.txt
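
One possible addition for a tutorial: the application.conf above is written with empty credential strings, so a beginner could launch sbtx before filling them in. Below is a minimal sketch of a pre-flight check, assuming the four oauth keys each sit on their own line as `key = "value"`, as in the heredoc. The function name `unset_oauth_keys` is hypothetical and not part of streams, and I have not checked how the providers themselves behave with empty values.

```shell
#!/bin/sh
# Hypothetical helper (not part of streams): print each oauth key that does
# NOT have a non-empty quoted value in the given config file.
# Assumes one `key = "value"` pair per line, as in the application.conf above.
unset_oauth_keys() {
  conf="$1"
  for key in consumerKey consumerSecret accessToken accessTokenSecret; do
    # A key counts as set only if a line like   key = "something"   exists.
    if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=[[:space:]]*\"[^\"]+\"" "$conf"; then
      echo "$key"
    fi
  done
}

# Example use before starting sbtx (left commented out on purpose):
#   missing=$(unset_oauth_keys application.conf)
#   [ -z "$missing" ] || { echo "fill in: $missing" >&2; exit 1; }
```

It only looks at the four oauth keys and ignores nesting, so it is a sanity check on the file as written above, not a real config parser.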