Hi Wang,

This looks like a defect. Are you planning to raise one? If not, I can raise it and fix it.
Regards,
Naga

Huawei Technologies Co., Ltd.
Mobile: +91 9980040283
Email: [email protected]
Bantian, Longgang District, Shenzhen 518129, P.R.China
http://www.huawei.com
________________________________
From: Benyi Wang [[email protected]]
Sent: Wednesday, September 17, 2014 06:37
To: [email protected]; [email protected]
Subject: Is it a bug in CombineFileSplit?

I use Spark's SerializableWritable to wrap CombineFileSplit so I can pass the splits around, but I ran into serialization issues. While researching why my code fails, I found what might be a bug in CombineFileSplit:

CombineFileSplit doesn't serialize locations in write(DataOutput out), and doesn't deserialize them in readFields(DataInput in).

When I create a split in CombineFileInputFormat, locations is an empty array (String[0]), but after deserialization (default constructor, then readFields), locations is null. This leads to an NPE.
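
To make the report easier to reproduce, here is a minimal sketch of the round trip described above. It does not use Hadoop: DemoSplit and FixedSplit are hypothetical stand-ins that only mimic the write/readFields pattern with plain java.io streams, and the length field and roundTrip helper are assumptions made for illustration. FixedSplit shows one possible way to carry locations across the round trip, not necessarily how a fix in Hadoop would look.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class SplitRoundTrip {

    /** Hypothetical stand-in for a Writable-style split; NOT the real CombineFileSplit. */
    static class DemoSplit {
        long length;
        String[] locations; // stays null after the no-arg constructor

        DemoSplit() {}

        DemoSplit(long length, String[] locations) {
            this.length = length;
            this.locations = locations;
        }

        // Mirrors the reported behavior: locations is never written...
        void write(DataOutput out) throws IOException {
            out.writeLong(length);
        }

        // ...and never read back, so it keeps its default value (null).
        void readFields(DataInput in) throws IOException {
            length = in.readLong();
        }
    }

    /** One possible fix (an assumption): write the host count, then each host name. */
    static class FixedSplit extends DemoSplit {
        FixedSplit() {}

        FixedSplit(long length, String[] locations) {
            super(length, locations);
        }

        @Override
        void write(DataOutput out) throws IOException {
            super.write(out);
            out.writeInt(locations.length);
            for (String host : locations) {
                out.writeUTF(host);
            }
        }

        @Override
        void readFields(DataInput in) throws IOException {
            super.readFields(in);
            locations = new String[in.readInt()];
            for (int i = 0; i < locations.length; i++) {
                locations[i] = in.readUTF();
            }
        }
    }

    /** Serializes original to bytes, then fills target (built with the no-arg constructor). */
    static <T extends DemoSplit> T roundTrip(DemoSplit original, T target) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        original.write(new DataOutputStream(bytes));
        target.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        return target;
    }

    public static void main(String[] args) throws IOException {
        DemoSplit lossy = roundTrip(new DemoSplit(42L, new String[0]), new DemoSplit());
        System.out.println("lossy locations is null: " + (lossy.locations == null));

        FixedSplit fixed = roundTrip(new FixedSplit(42L, new String[0]), new FixedSplit());
        System.out.println("fixed locations length: " + fixed.locations.length);
    }
}
```

With the first class, any caller that dereferences locations after deserialization (for example, reading its length) would hit the NPE; with the second, the empty array survives the round trip.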
