Logging inside a map function shouldn't "freeze things." The messages should show up in the worker logs, since the code runs on the executors. If the function throws an exception, however, the task is retried, and after it has failed 4 times (by default) a SparkException is propagated to the driver.
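Here's a rough sketch of the pattern (just illustrative, assuming the 0.8+ org.apache.spark package names and slf4j on the classpath; the object name and the little parsing job are made up): building the logger inside mapPartitions keeps anything non-serializable out of the closure, and the log lines end up in each executor's logs rather than on the driver.

import org.apache.spark.SparkContext
import org.slf4j.LoggerFactory

object MapLoggingSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext("local[2]", "map-logging-sketch")

    val parsed = sc.parallelize(Seq("1", "2", "oops", "4")).mapPartitions { iter =>
      // Build the logger inside the closure so nothing non-serializable is
      // captured from the driver; its output lands in the worker logs.
      val log = LoggerFactory.getLogger("MapLoggingSketch")
      iter.flatMap { s =>
        try {
          Some(s.toInt)
        } catch {
          case e: NumberFormatException =>
            log.error("could not parse '" + s + "'", e) // logged on the executor
            None // skip the bad record instead of failing the task
        }
      }
    }

    println(parsed.collect().toSeq) // the successfully parsed values: 1, 2, 4
    sc.stop()
  }
}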
On Fri, Apr 4, 2014 at 11:57 AM, John Salvatier <jsalvat...@gmail.com> wrote:

> Btw, thank you for your help.
>
>
> On Fri, Apr 4, 2014 at 11:49 AM, John Salvatier <jsalvat...@gmail.com> wrote:
>
>> Is there a way to log exceptions inside a mapping function? logError and
>> logInfo seem to freeze things.
>>
>>
>> On Fri, Apr 4, 2014 at 11:02 AM, Matei Zaharia <matei.zaha...@gmail.com> wrote:
>>
>>> Exceptions should be sent back to the driver program and logged there
>>> (with a SparkException thrown if a task fails more than 4 times), but there
>>> were some bugs before where this did not happen for non-Serializable
>>> exceptions. We changed it to pass back the stack traces only (as text),
>>> which should always work. I'd recommend trying a newer Spark version; 0.8
>>> should be easy to upgrade to from 0.7.
>>>
>>> Matei
>>>
>>> On Apr 4, 2014, at 10:40 AM, John Salvatier <jsalvat...@gmail.com> wrote:
>>>
>>> > I'm trying to get a clear idea about how exceptions are handled in
>>> > Spark. Is there somewhere where I can read about this? I'm on Spark 0.7.
>>> >
>>> > For some reason I was under the impression that such exceptions are
>>> > swallowed and the value that produced them ignored, but the exception is
>>> > logged. However, right now we're seeing the task just retried over and
>>> > over again in an infinite loop because there's a value that always
>>> > generates an exception.
>>> >
>>> > John
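And for the driver side, a rough sketch of what Matei describes above (again illustrative, not the exact behaviour of every version; it assumes the 0.8+ package names, and in a real cluster the failing task would be retried spark.task.maxFailures times, 4 by default, before the job is aborted):

import org.apache.spark.{SparkContext, SparkException}

object DriverSideFailureSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext("local[2]", "driver-side-failure-sketch")

    val rdd = sc.parallelize(Seq(1, 2, 3)).map { i =>
      if (i == 2) throw new IllegalStateException("bad value: " + i)
      i * 10
    }

    try {
      rdd.collect() // the action runs the tasks; the bad one fails every attempt
    } catch {
      case e: SparkException =>
        // Once the retries are exhausted the job is aborted and the failed
        // task's stack trace is reported here, on the driver, as text.
        println("job failed: " + e.getMessage)
    } finally {
      sc.stop()
    }
  }
}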