You can use mapPartitions to do that.
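
A rough sketch of the idea, under some assumptions: the buffer is allocated once per partition and reused for every record, which is safe only because each record's bytes are written out before the buffer is overwritten. The class and method names (PartitionBufferSketch, encodeInto, processPartition) are made up for illustration, the "calculation" is just a big-endian encoding of a long, and a plain OutputStream stands in for the HDFS stream. In a real job the body of processPartition would sit inside the function you pass to mapPartitions, which hands you the partition's records as an Iterator.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Arrays;
import java.util.Iterator;

public class PartitionBufferSketch {

    // Hypothetical stand-in for "do some calculation to get some byte array":
    // writes the value as 8 big-endian bytes into the caller's buffer and
    // returns the number of bytes written. No new array is allocated here.
    static int encodeInto(long value, byte[] buffer) {
        for (int i = 0; i < 8; i++) {
            buffer[i] = (byte) (value >>> (56 - 8 * i));
        }
        return 8;
    }

    // Processes one partition's records. The iterator plays the role of the
    // Iterator that mapPartitions passes to your function; the OutputStream
    // is an assumption standing in for a stream opened on HDFS.
    static int processPartition(Iterator<Long> records, OutputStream out)
            throws IOException {
        byte[] buffer = new byte[8]; // allocated once per partition, then reused
        int count = 0;
        while (records.hasNext()) {
            int len = encodeInto(records.next(), buffer);
            out.write(buffer, 0, len); // flush before the next record overwrites it
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int n = processPartition(Arrays.asList(1L, 2L, 3L).iterator(), out);
        System.out.println(n + " records, " + out.size() + " bytes");
    }
}
```

One caveat: if you return the shared buffer itself through the result iterator instead of writing it out inside the partition function, anything downstream that holds on to records (caching, collect, a shuffle) may see overwritten data, so copy in that case.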

On Friday, August 14, 2015, 周千昊 <qhz...@apache.org> wrote:

> I am thinking of creating a shared object outside the closure and using
> it to hold the byte array.
> Will this work?
>
> 周千昊 <qhz...@apache.org> wrote on Friday, August 14, 2015 at 4:02 PM:
>
>> Hi,
>>     All I want to do is:
>>     1. read from some source
>>     2. do some calculation to get a byte array
>>     3. write the byte array to HDFS
>>     In Hadoop, I can share an ImmutableBytesWritable and fill it with
>> System.arraycopy, which prevents the application from creating a lot of
>> small objects and so reduces GC latency.
>>     *However, I was wondering whether there is a similar solution in
>> Spark that avoids creating small objects.*
>>
>
