I think you’re right: we do give developers the opportunity to make mistakes when implementing a new Data Source.
Here we assume that a new relation MUST NOT extend more than one of the traits CatalystScan, TableScan, PrunedScan, PrunedFilteredScan, etc.; otherwise it will cause the problem you described. We can probably add an additional checking / reporting rule for this kind of misuse.

From: Jeff Zhang [mailto:[email protected]]
Sent: Thursday, November 5, 2015 1:55 PM
To: Cheng, Hao
Cc: [email protected]
Subject: Re: Why LibSVMRelation and CsvRelation don't extends HadoopFsRelation ?

Thanks Hao. I have already made it extend HadoopFsRelation and it works. I will create a JIRA for that.

Besides that, I noticed that in DataSourceStrategy, Spark builds the physical plan by pattern matching on the trait of the BaseRelation (e.g. CatalystScan, TableScan, HadoopFsRelation). That means the order of the cases matters. I think this is risky, because it means one BaseRelation can't extend more than one of these traits, and there seems to be nothing that prevents a relation from doing so. Maybe these traits need to be cleaned up and reorganized; otherwise users may hit some weird issues when developing a new DataSource.

On Thu, Nov 5, 2015 at 1:16 PM, Cheng, Hao <[email protected]> wrote:

Probably 2 reasons:

1. HadoopFsRelation was introduced in 1.4, but it seems CsvRelation was created based on 1.3.
2. HadoopFsRelation introduces the concept of Partition, which is probably not necessary for LibSVMRelation.

But I think it will be easy to change them to extend HadoopFsRelation.

Hao

From: Jeff Zhang [mailto:[email protected]]
Sent: Thursday, November 5, 2015 10:31 AM
To: [email protected]
Subject: Why LibSVMRelation and CsvRelation don't extends HadoopFsRelation ?

I'm not sure of the reason; it seems LibSVMRelation and CsvRelation could extend HadoopFsRelation and leverage its features. Is there any other consideration behind that?

--
Best Regards

Jeff Zhang

--
Best Regards

Jeff Zhang
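
To make the ordering concern discussed in this thread concrete, here is a minimal sketch. It does not use Spark's real classes: the traits below are simplified stand-ins that only borrow the names, and the case order in `plan`, the `AmbiguousRelation` class, and the `validate` check are all assumptions made up for illustration, not Spark's actual DataSourceStrategy code.

```scala
// Simplified stand-ins for illustration only; NOT Spark's real definitions.
trait BaseRelation
trait CatalystScan extends BaseRelation
trait PrunedFilteredScan extends BaseRelation
trait PrunedScan extends BaseRelation
trait TableScan extends BaseRelation

object StrategySketch {
  // Hypothetical planner: the first matching case wins, so when a relation
  // mixes in several scan traits the order of the cases decides the result.
  // The order used here is an assumption for this sketch.
  def plan(relation: BaseRelation): String = relation match {
    case _: CatalystScan       => "CatalystScan"
    case _: PrunedFilteredScan => "PrunedFilteredScan"
    case _: PrunedScan         => "PrunedScan"
    case _: TableScan          => "TableScan"
    case _                     => "unsupported"
  }

  // Lists which of the scan traits a relation mixes in.
  def scanTraits(relation: BaseRelation): Seq[String] = Seq(
    "CatalystScan"       -> relation.isInstanceOf[CatalystScan],
    "PrunedFilteredScan" -> relation.isInstanceOf[PrunedFilteredScan],
    "PrunedScan"         -> relation.isInstanceOf[PrunedScan],
    "TableScan"          -> relation.isInstanceOf[TableScan]
  ).collect { case (name, true) => name }

  // Hypothetical checking / reporting rule: reject relations that mix in
  // more than one scan trait instead of silently picking one of them.
  def validate(relation: BaseRelation): Either[String, BaseRelation] = {
    val found = scanTraits(relation)
    if (found.size > 1)
      Left(s"Relation extends more than one scan trait: ${found.mkString(", ")}")
    else
      Right(relation)
  }
}

// A relation that mixes in two scan traits (hypothetical).
class AmbiguousRelation extends TableScan with PrunedScan

// A relation that follows the one-trait assumption (hypothetical).
class SimpleRelation extends TableScan
```

In this sketch, `StrategySketch.plan(new AmbiguousRelation)` takes the PrunedScan branch and the TableScan implementation is never used, while `StrategySketch.validate(new AmbiguousRelation)` returns a `Left` naming both traits. A check of this kind is one possible shape for the checking / reporting rule mentioned above.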
