Trevor,

Awesome, thanks for giving it a shot.  

With some recent changes, we’re quite close to making data collection with 
streams providers turnkey for new users.

I ran the following through my deployment of Zeppelin; it should work for you 
too.  Please confirm :)

Cheers,
Steve

——

%dep
z.reset()
z.addRepo("apache-snapshots").url("https://repository.apache.org/content/repositories/snapshots").snapshot()
z.load("org.apache.streams:streams-provider-twitter:0.4-incubating-SNAPSHOT")

import com.typesafe.config._
import org.apache.streams.config._
import org.apache.streams.core._
import java.util.Iterator
import org.apache.streams.twitter.pojo._
import org.apache.streams.twitter.provider._

val hocon = s"""
twitter {
  oauth {
    consumerKey = ""
    consumerSecret = ""
    accessToken = ""
    accessTokenSecret = ""
  }
  retrySleepMs = 5000
  retryMax = 250
  info = [
    18055613
  ]
}
"""

val typesafe = ConfigFactory.parseString(hocon)
val config = new ComponentConfigurator(classOf[TwitterUserInformationConfiguration]).detectConfiguration(typesafe, "twitter")
val provider = new TwitterTimelineProvider(config)
provider.prepare(null)
provider.startStream()
// block until the provider has finished collecting
while(provider.isRunning()) Thread.sleep(1000)

val resultSet = provider.readCurrent()
resultSet.size()
val iterator = resultSet.iterator()
while(iterator.hasNext()) {
  val datum = iterator.next()
  println(datum.getDocument)
}
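If you want to keep the collected documents rather than just print them, here’s a minimal sketch of one way to persist them as a flat file (my own addition, not part of the streams API; `writeDocs` and the file name are illustrative, and it assumes each document serializes usefully via toString, as the println above does):

```scala
import java.io.PrintWriter

// Hypothetical helper (not from streams): writes one document per line,
// i.e. newline-delimited JSON.
def writeDocs(docs: Iterator[String], path: String): Unit = {
  val out = new PrintWriter(path)
  try docs.foreach(out.println)
  finally out.close()
}

// Stand-in data; in the notebook you would feed it the documents pulled
// off resultSet above (converted to a Scala iterator of strings).
writeDocs(Iterator("""{"id":18055613}"""), "statuses.txt")
```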
On October 14, 2016 at 8:33:55 AM, Trevor Grant (trevor.d.gr...@gmail.com) 
wrote:

I agree a minimal TTHW would be good- especially for a user who is trying to  
create a hello world.  

I am a big fan of Apache Zeppelin notebooks for this sort of thing- easy to  
host and include Markdown.  

If I could get some community assistance getting myself started, I'd be  
happy to write it up.  

I need to know:  
Minimum dependencies-  
From the little work I have done so far I know this can be a murky  
subject as we migrate versions. I'd prefer to do the minimal example in  
whatever version can be run based on artifacts sitting in maven now. Happy  
to update when a new version is pushed.  

Scala-  
Zeppelin is for all intents and purposes like running in the Spark/Flink  
shell. I'll need some help getting things going in this sort of env.  

If someone reading this is like "oh that's easy, here's your dependencies,  
and then run this code", that would be very helpful, I can get to writing  
right away. Otherwise I can hack it out, but again will need some support.  

tg  


Trevor Grant  
Data Scientist  
https://github.com/rawkintrevo  
http://stackexchange.com/users/3002022/rawkintrevo  
http://trevorgrant.org  

*"Fortunate is he, who is able to know the causes of things." -Virgil*  


On Tue, Oct 11, 2016 at 11:00 AM, Matt Franklin <m.ben.frank...@gmail.com>  
wrote:  

> On Mon, Oct 10, 2016 at 11:30 AM sblackmon <sblack...@apache.org> wrote:  
>  
> > Some other projects are currently looking at publishing docker containers  
> > that people can easily extend. I am totally in favor of this approach.  
> >  
> >  
> > Docker distribution would open up a lot of cool options for this project.  
> >  
> > Which projects are farthest along this road?  
> >  
>  
> https://hub.docker.com/r/apache/  
>  
>  
> >  
> > I think even publishing this as a Docker file example on the website  
> > would be a good start.  
> >  
> > These PRs use a maven docker plugin during verify phase.  
> > https://github.com/apache/incubator-streams-examples/pull/14  
> > https://github.com/apache/incubator-streams/pull/288  
> >  
> > The same plugin can build, tag, and deploy images with the docker:build  
> > and docker:push goals.  
> >  
>  
> Per policy, the only things that should make it to repositories like Docker  
> Hub and Maven Central are released convenience binaries.  
>  
>  
> >  
> > Once these merge I’ll take another pass through the examples  
> > documentation and for each describe a few alternative processes  
> > (STREAMS-428)  
> >  
> > 1) Build from source, run stream from *nix shell with dist uber-jar.  
> > 2) Run stream with sbt interactive shell using artifacts from maven  
> > central  
> > 3) Run stream with docker using artifacts from docker hub  
> >  
> > On October 10, 2016 at 8:09:45 AM, Matt Franklin (m.ben.frank...@gmail.com)  
> > wrote:  
> >  
> > On Thu, Oct 6, 2016 at 2:56 PM sblackmon <sblack...@apache.org> wrote:  
> >  
> > > TL;DR I’ve found a way to dramatically reduce barriers to using streams  
> > > as a beginner.  
> > >  
> > > Using the streams 0.3 release, it’s quite a headache for a novice to  
> > > use streams. We have a tutorial on the website, but it’s quite a  
> > > journey. You have to check out all three repos and install them each in  
> > > order before you get a jar file you could use to get data, then you can  
> > > run a few pre-canned streams, and those are intermediate not beginner  
> > > level.  
> > >  
> > > In an ideal world, anyone would be able to yum or apt-get (or docker  
> > > pull) individual providers or processors and run them on their own  
> > > without building from source or composing them into multi-step streams.  
> > >  
> > > We'd have to increase our build and compliance complexity significantly  
> > > to publish official binaries. So what can we do to drop the learning  
> > > curve precipitously without doing that?  
> >  
> > Some other projects are currently looking at publishing docker containers  
> > that people can easily extend. I am totally in favor of this approach.  
> >  
> > > Providers are really simple to run. The hard part is getting all of the  
> > > right classes and configuration properties into a JVM. Inspired by how  
> > > zeppelin’s %dep interpreter reduces the friction in composing and  
> > > running a scala notebook, I wanted to find a way to get the same  
> > > ability from a linux shell.  
> > >  
> > > The commands below go from just a java installation to flat files of  
> > > twitter data in just a few minutes.  
> > >  
> > > I think until we have binary distributions, this is how our tutorials  
> > > should tell the world to get started with streams.  
> > >  
> > > Thoughts?  
> >  
> > I think even publishing this as a Docker file example on the website  
> > would be a good start.  
> >  
> > > -----  
> > >  
> > > # install sbtx  
> > > curl -s https://raw.githubusercontent.com/paulp/sbt-extras/master/sbt > /usr/bin/sbtx && chmod 0755 /usr/bin/sbtx  
> > >  
> > > # create a workspace  
> > > mkdir twitter-test; cd twitter-test;  
> > >  
> > > # supply a config file with credentials  
> > > cat > application.conf << EOF  
> > > twitter {  
> > >   oauth {  
> > >     consumerKey = ""  
> > >     consumerSecret = ""  
> > >     accessToken = ""  
> > >     accessTokenSecret = ""  
> > >   }  
> > >   retrySleepMs = 5000  
> > >   retryMax = 250  
> > >   info = [  
> > >     18055613  
> > >   ]  
> > > }  
> > > EOF  
> > >  
> > > sbtx -210 -sbt-create  
> > >  
> > > set resolvers += "Local Maven Repository" at "file://"+Path.userHome.absolutePath+"/.m2/repository"  
> > >  
> > > set libraryDependencies += "org.apache.streams" % "streams-provider-twitter" % "0.4-incubating-SNAPSHOT"  
> > >  
> > > set fork := true  
> > >  
> > > run-main org.apache.streams.twitter.provider.TwitterUserInformationProvider application.conf users.txt  
> > >  
> > > run-main org.apache.streams.twitter.provider.TwitterTimelineProvider application.conf statuses.txt  
> > >  
> > > set javaOptions += "-Dtwitter.endpoint=friends"  
> > >  
> > > run-main org.apache.streams.twitter.provider.TwitterFollowingProvider application.conf friends.txt  
> > >  
> > > set javaOptions += "-Dtwitter.endpoint=followers"  
> > >  
> > > exit  
> > >  
> > > ls -l  
> > >  
> > > Steves-MacBook-Pro-3:twitter sblackmon$ ls -l  
> > > -rw-r--r--@ 1 sblackmon staff     356 Oct 6 11:54 application.conf  
> > > -rw-r--r--  1 sblackmon staff  293780 Oct 6 13:42 followers.txt  
> > > -rw-r--r--  1 sblackmon staff    6260 Oct 6 13:43 friends.txt  
> > > drwxr-xr-x  3 sblackmon staff     102 Oct 6 10:17 project  
> > > -rw-r--r--  1 sblackmon staff 3339460 Oct 6 13:43 statuses.txt  
> > > drwxr-xr-x  6 sblackmon staff     204 Oct 6 10:19 target  
> > > -rw-r--r--  1 sblackmon staff    3321 Oct 6 13:43 users.txt  
> >  
>  
