Re: TwitterUtils on Windows

2015-05-19 Thread Steve Loughran

> On 19 May 2015, at 03:08, Justin Pihony  wrote:
> 
> 
> 15/05/18 22:03:14 INFO Executor: Fetching http://192.168.56.1:49752/jars/twitter4j-media-support-3.0.3.jar with timestamp 1432000973058
> 15/05/18 22:03:14 INFO Utils: Fetching http://192.168.56.1:49752/jars/twitter4j-media-support-3.0.3.jar to C:\Users\Justin\AppData\Local\Temp\spark-4a37d3e9-34a2-40d4-b09b-6399931f527d\userFiles-65ee748e-4721-4e16-9fe6-65933651fec1\fetchFileTemp8970201232303518432.tmp
> 15/05/18 22:03:14 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
> java.lang.NullPointerException
>     at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:482)
>     at org.apache.hadoop.util.Shell.run(Shell.java:455)
>     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
>     at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:873)
>     at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:853)
>     at org.apache.spark.util.Utils$.fetchFile(Utils.scala:443)
>     at

you're going to need to set up Hadoop on your system well enough for it to 
execute the chmod operation via winutils.exe
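
for context on the trace: ProcessBuilder.start() throws a NullPointerException 
when any element of its command list is null, which is consistent with Hadoop 
not having resolved a path to winutils.exe. A minimal sketch of that behaviour 
follows; it is not Hadoop's actual code, and the object name, method name and 
"some-file" argument are made up for illustration:

```scala
import scala.util.Try

object NullCommandDemo {
  // Hypothetical helper: builds a chmod-style command line whose first
  // element stands in for the winutils.exe path. Passing null mimics the
  // case where that path was never resolved.
  def startWith(exe: String): Try[Process] =
    Try(new ProcessBuilder(exe, "chmod", "a+x", "some-file").start())

  def main(args: Array[String]): Unit = {
    // start() rejects null command elements before launching anything
    println(startWith(null))
  }
}
```

so the exception is raised before any process is launched; the fix is making 
the executable findable, not changing the Spark code.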

one tactic: grab the hortonworks windows version, install it (including setting 
up HADOOP_HOME). You don't need to run any of the hadoop services, you just 
need the binaries in the right place.

another option:

1. grab the copy of the relevant binaries which I've stuck up online:
   https://github.com/steveloughran/clusterconfigs/tree/master/clusters/morzine/hadoop_home/bin
2. install them to some directory hadoop/bin
3. set the env variable HADOOP_HOME to the hadoop dir (not the bin one)
4. set PATH=%PATH%;%HADOOP_HOME%/bin
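
a quick way to sanity-check the result of the steps above is sketched below. 
It assumes Hadoop looks at the hadoop.home.dir system property first and then 
at the HADOOP_HOME environment variable; the object and method names are 
hypothetical helpers, not part of Hadoop or Spark:

```scala
import java.io.File

object WinutilsCheck {
  // Resolve the Hadoop home: system property first, then environment variable.
  // Empty values are treated as unset.
  def hadoopHome(props: Map[String, String], env: Map[String, String]): Option[String] =
    props.get("hadoop.home.dir").filter(_.nonEmpty)
      .orElse(env.get("HADOOP_HOME").filter(_.nonEmpty))

  // Expected location of winutils.exe under a given Hadoop home.
  def winutilsPath(home: String): File =
    new File(new File(home, "bin"), "winutils.exe")

  def main(args: Array[String]): Unit =
    hadoopHome(sys.props.toMap, sys.env) match {
      case Some(home) =>
        val exe = winutilsPath(home)
        println(s"Hadoop home: $home; winutils.exe present: ${exe.isFile}")
      case None =>
        println("Neither hadoop.home.dir nor HADOOP_HOME is set")
    }
}
```

if the environment variable is awkward to set, calling 
System.setProperty("hadoop.home.dir", ...) before the SparkContext is created 
may work as an alternative, under the same assumption about lookup order.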

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: TwitterUtils on Windows

2015-05-19 Thread Akhil Das
Hi Justin,

Can you try with sbt? Maybe that will help.

-> Install sbt for windows
http://www.scala-sbt.org/0.13/tutorial/Installing-sbt-on-Windows.html

-> Create a lib directory in your project directory
-> Place these jars in it:
- spark-streaming-twitter_2.10-1.3.1.jar
- twitter4j-async-3.0.3.jar
- twitter4j-core-3.0.3.jar
- twitter4j-media-support-3.0.3.jar
- twitter4j-stream-3.0.3.jar

-> Create a build.sbt file and add these contents:

name := "twitterStream"

version := "1.0"

scalaVersion := "2.10.4"

libraryDependencies += "org.apache.spark" % "spark-streaming_2.10" % "1.3.1"

-> Create a TwitterStream.scala and add these contents:


import org.apache.spark.streaming.twitter._
import org.apache.spark.streaming._
import org.apache.spark.{SparkContext, SparkConf}


object TwitterStream {
  def main(args: Array[String]): Unit = {

    // Twitter OAuth credentials, read by twitter4j from system properties
    System.setProperty("twitter4j.oauth.consumerKey", "*")
    System.setProperty("twitter4j.oauth.consumerSecret", "*")
    System.setProperty("twitter4j.oauth.accessToken", "*")
    System.setProperty("twitter4j.oauth.accessTokenSecret", "*")

    val sconf = new SparkConf()
      .setMaster("local[*]")
      .setAppName("TwitterStream")

    val sc = new SparkContext(sconf)

    val ssc = new StreamingContext(sc, Seconds(10))
    val stream = TwitterUtils.createStream(ssc, None)
    // At least one output operation must be registered before start()
    stream.print()
    ssc.start()
    ssc.awaitTermination()
  }
}


-> Now run: sbt run



Thanks
Best Regards

On Tue, May 19, 2015 at 9:56 AM, Justin Pihony 
wrote:

> I think I found the answer ->
> http://apache-spark-user-list.1001560.n3.nabble.com/Error-while-running-example-scala-application-using-spark-submit-td10056.html
>
> Do I have no way of running this in Windows locally?

Re: TwitterUtils on Windows

2015-05-18 Thread Justin Pihony
I think I found the answer ->
http://apache-spark-user-list.1001560.n3.nabble.com/Error-while-running-example-scala-application-using-spark-submit-td10056.html

Do I have no way of running this in Windows locally?


On Mon, May 18, 2015 at 10:44 PM, Justin Pihony 
wrote:

> I'm not 100% sure that is causing a problem, though. The stream still
> starts, but is giving blank output. I checked the environment variables in
> the ui and it is running local[*], so there should be no bottleneck there.


Re: TwitterUtils on Windows

2015-05-18 Thread Justin Pihony
I'm not 100% sure that is causing a problem, though. The stream still
starts, but is giving blank output. I checked the environment variables in
the ui and it is running local[*], so there should be no bottleneck there.

On Mon, May 18, 2015 at 10:08 PM, Justin Pihony 
wrote:

> I am trying to print a basic twitter stream and receiving the following
> error:
>
>
> 15/05/18 22:03:14 INFO Executor: Fetching http://192.168.56.1:49752/jars/twitter4j-media-support-3.0.3.jar with timestamp 1432000973058
> 15/05/18 22:03:14 INFO Utils: Fetching http://192.168.56.1:49752/jars/twitter4j-media-support-3.0.3.jar to C:\Users\Justin\AppData\Local\Temp\spark-4a37d3e9-34a2-40d4-b09b-6399931f527d\userFiles-65ee748e-4721-4e16-9fe6-65933651fec1\fetchFileTemp8970201232303518432.tmp
> 15/05/18 22:03:14 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
> java.lang.NullPointerException
>     at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:482)
>     at org.apache.hadoop.util.Shell.run(Shell.java:455)
>     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
>     at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:873)
>     at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:853)
>     at org.apache.spark.util.Utils$.fetchFile(Utils.scala:443)
>     at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:374)
>     at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:366)
>     at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
>     at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
>     at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
>     at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
>     at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
>     at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
>     at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
>     at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:366)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:184)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:744)
>
>
> Code is:
>
> spark-shell --jars \Spark\lib\spark-streaming-twitter_2.10-1.3.1.jar,\Spark\lib\twitter4j-async-3.0.3.jar,\Spark\lib\twitter4j-core-3.0.3.jar,\Spark\lib\twitter4j-media-support-3.0.3.jar,\Spark\lib\twitter4j-stream-3.0.3.jar
>
> import org.apache.spark.streaming.twitter._
> import org.apache.spark.streaming._
>
> System.setProperty("twitter4j.oauth.consumerKey","*")
> System.setProperty("twitter4j.oauth.consumerSecret","*")
> System.setProperty("twitter4j.oauth.accessToken","*")
> System.setProperty("twitter4j.oauth.accessTokenSecret","*")
>
> val ssc = new StreamingContext(sc, Seconds(10))
> val stream = TwitterUtils.createStream(ssc, None)
> stream.print
> ssc.start
>
>
> This seems to be happening at FileUtil.chmod(targetFile.getAbsolutePath,
> "a+x"), but I'm not sure why...
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/TwitterUtils-on-Windows-tp22939.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.