Re: How do you perform blocking IO in apache spark job?

2014-09-08 Thread Jörn Franke
Hi, What does the external service provide? Data? Calculations? Can the service push data to you via Kafka and Spark Streaming? Can you fetch the necessary data beforehand from the service? The solution to your question depends on your answers. I would not recommend connecting to a blocking

Re: How do you perform blocking IO in apache spark job?

2014-09-08 Thread Jörn Franke
Hi, So the external service itself creates threads and blocks until they have finished execution? In this case you should not do threading but include it via JNI directly in Spark - it will take care of threading for you. Best regards Hi, Jörn, first of all, thanks for your intent to help. This

Re: Printing the RDDs in SparkPageRank

2014-08-24 Thread Jörn Franke
Hi, What kind of error do you receive? Best regards, Jörn On 24 August 2014 08:29, Deep Pradhan pradhandeep1...@gmail.com wrote: Hi, I was going through the SparkPageRank code and want to see the intermediate steps, like the RDDs formed in the intermediate steps. Here is a part of the

Re: heterogeneous cluster hardware

2014-08-21 Thread Jörn Franke
Hi, Well, you could use Mesos or Yarn2 to define resources per job - you can give only as many resources (cores, memory etc.) per machine as your worst machine has. The rest is done by Mesos or Yarn. By doing this you avoid a per-machine resource assignment without any disadvantages. You can run

Re: heterogeneous cluster hardware

2014-08-21 Thread Jörn Franke
? this would seem to be the correct point of abstraction that would allow the construction of massive clusters using on-hand hardware? (the scheduler probably wouldn't have to change at all) On Thu, Aug 21, 2014 at 9:25 AM, Jörn Franke [via Apache Spark User List] [hidden email] http://user

Re: SparkStreaming 0.9.0 / Java / Twitter issue

2014-08-17 Thread Jörn Franke
10, 2014 at 11:25 PM, Jörn Franke jornfra...@gmail.com wrote: Hello, Out of curiosity, I am trying to implement the following example in Java according to the following site: http://ampcamp.berkeley.edu/3/exercises/realtime-processing-with-spark-streaming.html Unfortunately, I did not find

Re: spark streaming - lamda architecture

2014-08-16 Thread Jörn Franke
Hi, Maybe this helps you. For the speed layer I think something like complex event processing, as it is to some extent supported by Spark Streaming, can make sense. You process the events as they come in. You store them afterwards. The Spark Streaming web page gives a nice example: trend

SparkStreaming 0.9.0 / Java / Twitter issue

2014-08-10 Thread Jörn Franke
Hello, Out of curiosity, I am trying to implement the following example in Java according to the following site: http://ampcamp.berkeley.edu/3/exercises/realtime-processing-with-spark-streaming.html Unfortunately, I did not find a recent example for using a Twitter source in Spark Streaming with Java.

Re: Streaming on different store types

2014-07-30 Thread Jörn Franke
Hello, I fear you have to write your own transaction logic for it (coordination, e.g. via ZooKeeper, transaction log, depending on your requirements Raft/Paxos etc.). However, before you embark on this journey, ask yourself whether your application really needs it and what data load you expect.
