Re: Reverse sort on Primary Key

2018-04-24 Thread Dan Burkert
The comparator for ScanTokens is based on the partition key, which isn't
(necessarily) the same as primary key order.  If you have any kind of hash
partitioning you will need to do a merge, and if the range partitioning
doesn't match the PK then it will be completely different.  So, if you set
up the table's partitioning very carefully you could do the reverse sorting
tablet-by-tablet instead of globally.

- Dan
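
For readers of the archive, here is a rough sketch of the client-side merge described above. It assumes each tablet's keys have already been read and arranged in descending order on the client, and it uses plain lists of long keys in place of real scanners. The names DescendingMerge, Cursor and merge are made up for illustration and are not part of the Kudu client API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical helper: merges per-tablet results into one globally descending list. */
final class DescendingMerge {

    /** Pairs the current head key of one tablet's stream with the rest of that stream. */
    private static final class Cursor {
        final long head;
        final Iterator<Long> rest;

        Cursor(long head, Iterator<Long> rest) {
            this.head = head;
            this.rest = rest;
        }
    }

    /**
     * Each inner list stands in for one tablet's keys, assumed to be sorted
     * in descending order before the merge starts.
     */
    static List<Long> merge(List<List<Long>> tablets) {
        // Max-heap on the head key, so the largest remaining key is polled first.
        PriorityQueue<Cursor> heap =
            new PriorityQueue<>(Comparator.comparingLong((Cursor c) -> c.head).reversed());
        for (List<Long> tablet : tablets) {
            Iterator<Long> it = tablet.iterator();
            if (it.hasNext()) {
                heap.add(new Cursor(it.next(), it));
            }
        }
        List<Long> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            Cursor c = heap.poll();
            out.add(c.head);
            // Re-insert the same stream with its next key, if it has one.
            if (c.rest.hasNext()) {
                heap.add(new Cursor(c.rest.next(), c.rest));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Three hash buckets, each holding an interleaved slice of the keys.
        List<List<Long>> tablets = List.of(
            List.of(9L, 6L, 3L),
            List.of(8L, 5L, 2L),
            List.of(7L, 4L, 1L));
        System.out.println(merge(tablets));
    }
}
```

With hash partitioning every bucket can hold keys from the whole time range, which is why a merge like this is needed. If the range partitioning follows the primary key, the tablets do not overlap and they can simply be visited one after another.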


On Mon, Apr 23, 2018 at 9:32 PM, Scott Reynolds <sdrreyno...@gmail.com>
wrote:

> Thanks guys, I thought it should be easy. I am looking to paginate through
> the records, not return the most recent value.
>
> If I need to toggle -- which I might -- I was planning on using the token API.
> The KuduScanToken actually implements Comparable and its implementation is
> around the tablet. I think this means you can sort the list of
> KuduScanTokens and call out to each one sequentially. Does that sound crazy?
>
> On Mon, Apr 23, 2018 at 3:23 PM Dan Burkert <danburk...@apache.org> wrote:
>
>> Hey Scott,
>>
>> Patrick's answer is spot on.  I'm curious, though, is your use case to
>> find the latest value?  Effectively an 'ORDER BY date DESC LIMIT 1', or are
>> you looking for the last n values, or all values?  I ask because we
>> frequently get the 'last value' question, and the solution for that might
>> be more specific (and simpler) than a generalized reverse sort + limit.
>>
>> - Dan
>>
>> On Mon, Apr 23, 2018 at 1:25 PM, Patrick Angeles <patr...@cloudera.com>
>> wrote:
>>
>>> The common technique is to use (MAX_LONG - timestamp). Unfortunately
>>> this won't let you toggle the sort order back-and-forth on the same table.
>>> You could have a duplicate table with the inverse key, effectively using it
>>> as a secondary index.
>>>
>>> As of version 0.98, HBase supports a reverse scan without a 'secondary
>>> index' table (HBASE-4811), so with a bit of work Kudu may be able to
>>> provide something similar.
>>>
>>>
>>> Patrick Angeles
>>> Chief Architect Financial Services
>>> 151 West 26th Street Suite 1002 | New York, NY 10001
>>> +1 (917) 633-4524
>>>
>>> On Sat, Apr 21, 2018 at 10:36 PM, Scott Reynolds <sdrreyno...@gmail.com>
>>> wrote:
>>>
>>>> Today we are using Kudu to store timeseries information and would like
>>>> the ability to toggle the sort direction. It is unclear to me at the moment
>>>> how to achieve this efficiently. I naively assumed Kudu could read the
>>>> primary key in reverse but that doesn't appear to be the case ATM.
>>>>
>>>> If you were tasked with implementing a reverse sort on the primary key
>>>> (Date Desc) how would you go about implementing it?
>>>>
>>>> Thanks!
>>>>
>>>
>>>
>>


Re: Reverse sort on Primary Key

2018-04-23 Thread Scott Reynolds
Thanks guys, I thought it should be easy. I am looking to paginate through the
records, not return the most recent value.

If I need to toggle -- which I might -- I was planning on using the token API.
The KuduScanToken actually implements Comparable and its implementation is
around the tablet. I think this means you can sort the list of
KuduScanTokens and call out to each one sequentially. Does that sound crazy?

On Mon, Apr 23, 2018 at 3:23 PM Dan Burkert <danburk...@apache.org> wrote:

> Hey Scott,
>
> Patrick's answer is spot on.  I'm curious, though, is your use case to find
> the latest value?  Effectively an 'ORDER BY date DESC LIMIT 1', or are you
> looking for the last n values, or all values?  I ask because we frequently
> get the 'last value' question, and the solution for that might be more
> specific (and simpler) than a generalized reverse sort + limit.
>
> - Dan
>
> On Mon, Apr 23, 2018 at 1:25 PM, Patrick Angeles <patr...@cloudera.com>
> wrote:
>
>> The common technique is to use (MAX_LONG - timestamp). Unfortunately this
>> won't let you toggle the sort order back-and-forth on the same table. You
>> could have a duplicate table with the inverse key, effectively using it as
>> a secondary index.
>>
>> As of version 0.98, HBase supports a reverse scan without a 'secondary
>> index' table (HBASE-4811), so with a bit of work Kudu may be able to
>> provide something similar.
>>
>>
>> Patrick Angeles
>> Chief Architect Financial Services
>> 151 West 26th Street Suite 1002 | New York, NY 10001
>> +1 (917) 633-4524
>>
>> On Sat, Apr 21, 2018 at 10:36 PM, Scott Reynolds <sdrreyno...@gmail.com>
>> wrote:
>>
>>> Today we are using Kudu to store timeseries information and would like
>>> the ability to toggle the sort direction. It is unclear to me at the moment
>>> how to achieve this efficiently. I naively assumed Kudu could read the
>>> primary key in reverse but that doesn't appear to be the case ATM.
>>>
>>> If you were tasked with implementing a reverse sort on the primary key
>>> (Date Desc) how would you go about implementing it?
>>>
>>> Thanks!
>>>
>>
>>
>


Re: Reverse sort on Primary Key

2018-04-23 Thread Dan Burkert
Hey Scott,

Patrick's answer is spot on.  I'm curious, though, is your use case to find
the latest value?  Effectively an 'ORDER BY date DESC LIMIT 1', or are you
looking for the last n values, or all values?  I ask because we frequently
get the 'last value' question, and the solution for that might be more
specific (and simpler) than a generalized reverse sort + limit.

- Dan
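
As an illustration of why the "last n values" case can be simpler than a full reverse sort, here is one possible client-side approach. It is not necessarily the solution alluded to above. It assumes the timestamps arrive from the scan in arbitrary order and keeps only the n largest in a bounded min-heap. LastN and lastN are hypothetical names, not Kudu API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical helper: keeps the n largest timestamps seen in a single pass. */
final class LastN {

    /** Returns the n largest timestamps, newest first, without sorting the whole input. */
    static List<Long> lastN(Iterable<Long> timestamps, int n) {
        List<Long> out = new ArrayList<>();
        if (n <= 0) {
            return out;
        }
        // Min-heap: the smallest of the retained timestamps sits at the top.
        PriorityQueue<Long> heap = new PriorityQueue<>();
        for (long ts : timestamps) {
            if (heap.size() < n) {
                heap.add(ts);
            } else if (ts > heap.peek()) {
                // Evict the oldest retained value in favour of the newer one.
                heap.poll();
                heap.add(ts);
            }
        }
        out.addAll(heap);
        out.sort(Collections.reverseOrder());
        return out;
    }
}
```

Memory stays proportional to n rather than to the number of rows scanned, but every row in the scanned range is still read, so a lower-bound predicate on the timestamp would be needed to keep the scan itself small.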

On Mon, Apr 23, 2018 at 1:25 PM, Patrick Angeles <patr...@cloudera.com>
wrote:

> The common technique is to use (MAX_LONG - timestamp). Unfortunately this
> won't let you toggle the sort order back-and-forth on the same table. You
> could have a duplicate table with the inverse key, effectively using it as
> a secondary index.
>
> As of version 0.98, HBase supports a reverse scan without a 'secondary
> index' table (HBASE-4811), so with a bit of work Kudu may be able to
> provide something similar.
>
>
> Patrick Angeles
> Chief Architect Financial Services
> 151 West 26th Street Suite 1002 | New York, NY 10001
> +1 (917) 633-4524
>
> On Sat, Apr 21, 2018 at 10:36 PM, Scott Reynolds <sdrreyno...@gmail.com>
> wrote:
>
>> Today we are using Kudu to store timeseries information and would like
>> the ability to toggle the sort direction. It is unclear to me at the moment
>> how to achieve this efficiently. I naively assumed Kudu could read the
>> primary key in reverse but that doesn't appear to be the case ATM.
>>
>> If you were tasked with implementing a reverse sort on the primary key
>> (Date Desc) how would you go about implementing it?
>>
>> Thanks!
>>
>
>


Re: Reverse sort on Primary Key

2018-04-23 Thread Patrick Angeles
The common technique is to use (MAX_LONG - timestamp). Unfortunately this
won't let you toggle the sort order back-and-forth on the same table. You
could have a duplicate table with the inverse key, effectively using it as
a secondary index.

As of version 0.98, HBase supports a reverse scan without a 'secondary
index' table (HBASE-4811), so with a bit of work Kudu may be able to
provide something similar.


Patrick Angeles
Chief Architect Financial Services
151 West 26th Street Suite 1002 | New York, NY 10001
+1 (917) 633-4524
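
A minimal sketch of the (MAX_LONG - timestamp) technique described above, for readers of the archive. It assumes the timestamp is a non-negative 64-bit integer, such as milliseconds since the epoch, and that MAX_LONG corresponds to Java's Long.MAX_VALUE. InvertedKey and its methods are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper illustrating the (MAX_LONG - timestamp) key inversion. */
final class InvertedKey {

    /** Maps a non-negative timestamp to a key that sorts in the opposite order. */
    static long invert(long timestamp) {
        if (timestamp < 0) {
            // Long.MAX_VALUE minus a negative value would overflow.
            throw new IllegalArgumentException("timestamp must be non-negative");
        }
        return Long.MAX_VALUE - timestamp;
    }

    /** Recovers the original timestamp from a stored key. */
    static long restore(long key) {
        return Long.MAX_VALUE - key;
    }

    /** Returns the timestamps in the order an ascending scan over inverted keys would yield. */
    static List<Long> scanOrder(List<Long> timestamps) {
        List<Long> keys = new ArrayList<>();
        for (long ts : timestamps) {
            keys.add(invert(ts));
        }
        Collections.sort(keys); // ascending key order, as a forward scan would see it
        List<Long> out = new ArrayList<>();
        for (long key : keys) {
            out.add(restore(key));
        }
        return out;
    }
}
```

Because the inversion is baked into the stored key, a forward scan always returns newest-first. That is why toggling the direction would need the second table with the non-inverted key.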

On Sat, Apr 21, 2018 at 10:36 PM, Scott Reynolds <sdrreyno...@gmail.com>
wrote:

> Today we are using Kudu to store timeseries information and would like
> the ability to toggle the sort direction. It is unclear to me at the moment
> how to achieve this efficiently. I naively assumed Kudu could read the
> primary key in reverse but that doesn't appear to be the case ATM.
>
> If you were tasked with implementing a reverse sort on the primary key
> (Date Desc) how would you go about implementing it?
>
> Thanks!
>


Reverse sort on Primary Key

2018-04-21 Thread Scott Reynolds
Today we are using Kudu to store timeseries information and would like the
ability to toggle the sort direction. It is unclear to me at the moment how
to achieve this efficiently. I naively assumed Kudu could read the primary
key in reverse but that doesn't appear to be the case ATM.

If you were tasked with implementing a reverse sort on the primary key
(Date Desc) how would you go about implementing it?

Thanks!