Hi,

I'm trying to write a Crunch job to generate a large amount of simulated
data. To kick the job off, I need some input to feed into a DoFn. These
inputs are essentially dummy values that the DoFn will ignore. To
accomplish this, I'd like to create an in-memory PCollection that can then
be passed into an MRPipeline, but if I do this with
MemPipeline.collectionOf I get an error:

Exception in thread "main" java.lang.IllegalStateException:  named
'null' cannot be serialized
        at org.apache.crunch.impl.mem.collect.MemCollection.verifySerializable(MemCollection.java:110)
        at org.apache.crunch.impl.mem.collect.MemCollection.parallelDo(MemCollection.java:129)
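
My unconfirmed guess is that the check fails because my DoFn is an
anonymous class defined inside a non-serializable driver class, so it
drags a hidden reference to the driver along with it. The sketch below
reproduces that effect with plain Java serialization only. It does not
use Crunch at all: Fn, IgnoreInputFn, Driver and canSerialize are
hypothetical stand-ins I made up for illustration, not Crunch classes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializableCheck {

    // Hypothetical stand-in for a serializable function object (NOT Crunch's DoFn).
    interface Fn extends Serializable {
        String apply(String input);
    }

    // Static nested class: carries no reference to any enclosing instance.
    static class IgnoreInputFn implements Fn {
        @Override
        public String apply(String input) {
            return "simulated"; // the dummy input is ignored
        }
    }

    // Hypothetical driver class that is deliberately NOT Serializable.
    static class Driver {
        private final String label = "simulated";

        Fn anonymousFn() {
            // This anonymous class reads a field of the enclosing Driver,
            // so it holds a hidden reference to that Driver instance.
            return new Fn() {
                @Override
                public String apply(String input) {
                    return label; // the dummy input is ignored
                }
            };
        }
    }

    // Returns true if Java serialization can write the object, false otherwise.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (IOException e) {
            // NotSerializableException is a subclass of IOException.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(new IgnoreInputFn()));        // prints "true"
        System.out.println(canSerialize(new Driver().anonymousFn())); // prints "false"
    }
}
```

If that is what is happening in my job, moving the DoFn into a static
nested or top-level class might get past the serialization check, but I
have not verified this against Crunch itself.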

Is it possible to explicitly declare/instantiate a PCollection to pass
into an MRPipeline?

Thanks!

-Ben
