Actually, the interesting part of Hadoop files is the SequenceFile format, which allows the data to be split into several blocks. Other files in HDFS are single-block, so they do not scale.

An ObjectFile cannot be split naturally. Usually, in Hadoop, when storing a sequence of elements instead of a sequence of (key, value) pairs, the trick is to store (key, null). I don't know the most effective way to do that in Scala/Spark. Actually, it would be a good thing to add to RDD[U].

Guillaume
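
For illustration, the (key, null) trick could be sketched in Scala/Spark roughly as follows. This is only a sketch under assumptions, not an established Spark idiom: the object name, the helper names saveAsKeyOnly and loadKeyOnly, and the /tmp path are hypothetical, the elements are assumed to be strings, and a Spark version where the pair-RDD implicits are in scope by default is assumed (older releases also need import org.apache.spark.SparkContext._).

```scala
import org.apache.hadoop.io.{NullWritable, Text}
import org.apache.hadoop.mapred.SequenceFileOutputFormat
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object KeyNullSequenceFile {

  // Hypothetical helper: write each element as the key of a SequenceFile
  // record, using NullWritable as a zero-byte placeholder value.
  def saveAsKeyOnly(rdd: RDD[String], path: String): Unit =
    rdd
      .map(s => (new Text(s), NullWritable.get()))
      .saveAsHadoopFile(
        path,
        classOf[Text],
        classOf[NullWritable],
        classOf[SequenceFileOutputFormat[Text, NullWritable]])

  // Hypothetical helper: read the keys back and drop the null values.
  // Hadoop reuses Writable instances, so copy each key to a String at once.
  def loadKeyOnly(sc: SparkContext, path: String): RDD[String] =
    sc.sequenceFile(path, classOf[Text], classOf[NullWritable])
      .map { case (key, _) => key.toString }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("key-null-demo").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Assumed output directory; it must not exist before the job runs.
      val path = args.headOption.getOrElse("/tmp/key-null-demo")
      saveAsKeyOnly(sc.parallelize(Seq("a", "b", "c")), path)
      println(loadKeyOnly(sc, path).collect().sorted.mkString(","))
    } finally {
      sc.stop()
    }
  }
}
```

The explicit saveAsHadoopFile call is used here instead of saveAsSequenceFile only to keep the key and value Writable classes visible; for other element types, the elements would first have to be serialized into a Writable such as BytesWritable.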

