Re: Error reporting/collecting for users

2015-01-28 Thread Tathagata Das
You could use foreachRDD to do the operations and then inside the foreach create an accumulator to gather all the errors together:

    dstream.foreachRDD { rdd =>
      val accumulator = new Accumulator[]
      rdd.map { ... }.count  // whatever operation that is error prone
      // gather all errors
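A minimal sketch of the accumulator idea suggested in this message, written against the Spark 2.x+ API (`collectionAccumulator`), which is newer than this thread; on the Spark 1.x releases current at the time, `SparkContext.accumulableCollection` played a similar role. The names `ErrorCollecting`, `runSafely`, `collectErrors`, `userFn`, and `report` are hypothetical and do not come from the thread.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.dstream.DStream
import scala.collection.JavaConverters._
import scala.util.{Failure, Success, Try}

object ErrorCollecting {
  // Hypothetical helper: apply a user-supplied function to one record and turn
  // any non-fatal exception into a message instead of failing the task.
  def runSafely[A, B](record: A)(userFn: A => B): Either[String, B] =
    Try(userFn(record)) match {
      case Success(value) => Right(value)
      case Failure(e) =>
        Left(s"record=$record: ${e.getClass.getSimpleName}: ${e.getMessage}")
    }

  // Hypothetical wiring: for every batch, run userFn over the records, gather
  // the failures in an accumulator, and pass them to `report` on the driver.
  def collectErrors[A, B](stream: DStream[A])(userFn: A => B)(
      report: Seq[String] => Unit): Unit =
    stream.foreachRDD { (rdd: RDD[A]) =>
      // A fresh accumulator per batch, so each report covers one batch only.
      val errors = rdd.sparkContext.collectionAccumulator[String]("userErrors")

      // foreach is an action, so the accumulator is populated once it returns.
      rdd.foreach { record =>
        runSafely(record)(userFn).left.foreach(errors.add)
      }

      // The body of foreachRDD runs on the driver, where the value is readable.
      report(errors.value.asScala.toList)
    }
}
```

The sketch updates the accumulator inside an action (`foreach`) rather than inside `map`, because Spark only guarantees that accumulator updates are applied once per task for actions; updates made in transformations can be repeated if a task is re-executed. The `report` callback is where a log, a database write, or a feed for a user-facing error view could be plugged in.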

Error reporting/collecting for users

2015-01-27 Thread Tobias Pfeiffer
Hi,

in my Spark Streaming application, computations depend on users' input in terms of
* user-defined functions
* computation rules
* etc.
that can throw exceptions in various cases (think: exception in UDF, division by zero, invalid access by key etc.). Now I am wondering about what is a

Re: Error reporting/collecting for users

2015-01-27 Thread Soumitra Kumar
It is a Streaming application, so how/when do you plan to access the accumulator on driver?

On Tue, Jan 27, 2015 at 6:48 PM, Tobias Pfeiffer t...@preferred.jp wrote:
> Hi, thanks for your mail!
>
> On Wed, Jan 28, 2015 at 11:44 AM, Tathagata Das tathagata.das1...@gmail.com wrote:
>> That seems

Re: Error reporting/collecting for users

2015-01-27 Thread Tobias Pfeiffer
Hi,

On Wed, Jan 28, 2015 at 1:45 PM, Soumitra Kumar kumar.soumi...@gmail.com wrote:
> It is a Streaming application, so how/when do you plan to access the accumulator on driver?

Well... maybe there would be some user command or web interface showing the errors that have happened during

Re: Error reporting/collecting for users

2015-01-27 Thread Tobias Pfeiffer
Hi,

thanks for your mail!

On Wed, Jan 28, 2015 at 11:44 AM, Tathagata Das tathagata.das1...@gmail.com wrote:
> That seems reasonable to me. Are you having any problems doing it this way?

Well, actually I haven't done that yet. The idea of using accumulators to collect errors just came while