I considered adding this to the DataSource API V2 ticket, but I didn't want
to be the first :P Do you think there will be any issues with opening up the
partitioning as well?

On Fri, Jun 16, 2017 at 11:58 AM Reynold Xin <r...@databricks.com> wrote:

> Perhaps we should extend the data source API to support that.
>
>
> On Fri, Jun 16, 2017 at 11:37 AM, Russell Spitzer <
> russell.spit...@gmail.com> wrote:
>
>> I've been trying to make Catalyst aware of Cassandra partitioning. There
>> seem to be two major blockers.
>>
>> The first is that DataSourceScanExec is unable to learn what the
>> underlying partitioning should be from the BaseRelation it comes from. I'm
>> currently able to get around this by using the DataSourceStrategy plan and
>> then transforming the resultant DataSourceScanExec.
>>
>> The second is that the Partitioning trait is sealed. I want to define a
>> new partitioning which is clustered on certain columns but is not
>> hash-based. It would look almost identical to the HashPartitioning class,
>> except that the expression which returns a valid partition ID for the
>> given expressions would be different.
>>
>> Does anyone have any ideas on how to get around the second issue? Would
>> it be worthwhile to make changes to allow BaseRelations to advertise a
>> particular Partitioner?
>>
>
>
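
To make the second issue in the quoted message concrete, here is a minimal
sketch of the shape being asked for. It does not use Spark: every name in it
(PartitioningSketch, SketchPartitioning, SketchHashPartitioning,
SketchSourcePartitioning, the token function) is a hypothetical stand-in, and
rows are plain Maps rather than Catalyst expressions. It only illustrates two
partitionings that cluster on the same columns but compute the partition ID
differently, which is what an unsealed trait would allow.

```scala
// Simplified stand-ins for illustration only; these are NOT Spark classes.
// Spark's own Partitioning trait is sealed, so a subclass like the second
// one below cannot be defined outside Spark's source file.
object PartitioningSketch {
  // A row is modelled as column name -> value (an assumption of this sketch).
  type Row = Map[String, Any]

  trait SketchPartitioning {
    def numPartitions: Int
    def clusterColumns: Seq[String]
    // Maps a row to a partition ID in [0, numPartitions).
    def partitionId(row: Row): Int
  }

  // Hash-based: the partition ID is a non-negative modulus of a hash of the
  // clustering column values.
  final case class SketchHashPartitioning(
      clusterColumns: Seq[String],
      numPartitions: Int) extends SketchPartitioning {
    def partitionId(row: Row): Int =
      java.lang.Math.floorMod(clusterColumns.map(row).hashCode(), numPartitions)
  }

  // Clustered on the same columns, but the ID comes from a function supplied
  // by the data source (hypothetical `token`) instead of a generic hash.
  final case class SketchSourcePartitioning(
      clusterColumns: Seq[String],
      numPartitions: Int,
      token: Seq[Any] => Long) extends SketchPartitioning {
    def partitionId(row: Row): Int =
      java.lang.Math.floorMod(token(clusterColumns.map(row)), numPartitions.toLong).toInt
  }
}
```

The two case classes differ only in partitionId, which mirrors the point
above that the new partitioning would look almost identical to
HashPartitioning apart from the expression that yields the partition ID.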
