Re: Serialization of internal job vals

2016-09-22 Thread 'Oscar Boykin' via Scalding Development
It uses this code:

https://github.com/twitter/chill/blob/develop/chill-scala/src/main/scala/com/twitter/chill/Externalizer.scala

which uses Kryo and Java serialization to make a best effort to serialize
functions.
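
To illustrate why a fallback beyond Java serialization matters here, the
following is a minimal sketch using only the JDK, not chill or Scalding. The
names StringFilter, PlainFilter, SerializableFilter and ClosureCheck are
hypothetical stand-ins for MyFilterClass and are not part of any library. It
only shows that plain Java serialization rejects a closure that captures an
object whose class does not extend Serializable. Kryo does not require that
marker interface, which is presumably why the job in the question runs.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-ins for MyFilterClass from the question.
trait StringFilter {
  def apply(s: String): Boolean
}

// A regular Scala class: it does not extend java.io.Serializable.
class PlainFilter(allowed: Set[String]) extends StringFilter {
  def apply(s: String): Boolean = allowed.contains(s)
}

// The same logic, explicitly opted in to Java serialization.
class SerializableFilter(allowed: Set[String]) extends StringFilter with Serializable {
  def apply(s: String): Boolean = allowed.contains(s)
}

object ClosureCheck {
  // Builds a function value that captures the filter instance, similar in
  // spirit to passing myFilter.apply to .filter in the job.
  def asPredicate(filter: StringFilter): String => Boolean =
    s => filter(s)

  // Returns true if plain Java serialization can write the value, and false
  // if it hits a non-serializable object in the captured state.
  def javaSerializable(value: AnyRef): Boolean = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try {
      out.writeObject(value)
      true
    } catch {
      case _: NotSerializableException => false
    } finally {
      out.close()
    }
  }
}
```

With this sketch, ClosureCheck.javaSerializable(ClosureCheck.asPredicate(new
PlainFilter(Set("a")))) fails the Java path, while the same call with
SerializableFilter succeeds. A class that works only through the Kryo path may
still be worth marking Serializable, so the job does not depend on the
fallback.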

On Thu, Sep 22, 2016 at 9:04 AM Kostya Salomatin wrote:

> Hey folks,
>
> I've got a question about serialization of internal Job vals (not the
> values passed in the pipe). Consider this simple example:
>
> class MyJob(args: Args) extends Job(args) {
>   val myFilter = new MyFilterClass(...)
>
>   val pipe = ... // read statement
>     .filter(myFilter.apply)
> }
>
> class MyFilterClass(some data required for filtering) { ... }
>
> My understanding is that myFilter is initialized only once, then
> serialized and passed to all subsequent workers, and that the job will
> fail if Scalding does not know how to serialize this object. Is this
> correct? MyFilterClass is a regular Scala class (not a case class), and I
> did nothing special to make it serializable, yet the job runs fine. Does
> Scalding know by default how to serialize all Scala objects without any
> effort from me in the code, or does it silently call the constructor every
> time?
>
> Thanks,
> Kostya
>
> --
> You received this message because you are subscribed to the Google Groups
> "Scalding Development" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to scalding-dev+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>



Serialization of internal job vals

2016-09-22 Thread Kostya Salomatin
Hey folks,

I've got a question about serialization of internal Job vals (not the 
values passed in the pipe). Consider this simple example:

class MyJob(args: Args) extends Job(args) {
  val myFilter = new MyFilterClass(...)

  val pipe = ... // read statement
    .filter(myFilter.apply)
}

class MyFilterClass(some data required for filtering) { ... }

My understanding is that myFilter is initialized only once, then
serialized and passed to all subsequent workers, and that the job will
fail if Scalding does not know how to serialize this object. Is this
correct? MyFilterClass is a regular Scala class (not a case class), and I
did nothing special to make it serializable, yet the job runs fine. Does
Scalding know by default how to serialize all Scala objects without any
effort from me in the code, or does it silently call the constructor every
time?

Thanks,
Kostya
