Hi James,
Thanks for your reply. I don't understand the issue fully - do HBase's
Bytes.toBytes() methods not have the same sort order as that of Phoenix? I'd
really appreciate it if you could give more insight on this. Their
documentation doesn't mention the sort order. I also don't see how the order of
negative numbers relative to positive numbers makes it incompatible with Phoenix.
It's interesting because it seems that we can't have columns in Phoenix
views/tables that hold negative long values. In my setup, it is
not feasible to create a new Phoenix table and copy over the data because the
table is very large and we'd need to recreate the Phoenix table with updated
data every time we want to run queries.
Is it feasible to write a UDF (for a SELECT statement) that converts the
byte array in that column to a long value? If it is, would I use the Tuple
object that's passed in to the evaluate() method to get the values in that
column? I tried using Tuple's getValue() method to grab the byte array in the
column, but I'm running into issues. I'm looking at ToNumberFunction for
reference.
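For the conversion step itself, here is a minimal sketch of turning the stored bytes back into a long. It assumes the cell holds exactly the 8 big-endian bytes that Bytes.toBytes(long) writes. The class name RawLongDecoder and the decode() helper are hypothetical, and the Phoenix wiring (how evaluate() hands over the bytes) is deliberately left out. The offset and length parameters are there because byte holders such as ImmutableBytesWritable expose a backing array plus an offset and length rather than a trimmed copy.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Phoenix or HBase.
class RawLongDecoder {
    // Reads 8 big-endian two's-complement bytes starting at `offset`.
    // ByteBuffer is big-endian by default, matching Bytes.toBytes(long).
    static long decode(byte[] buf, int offset, int length) {
        if (length != Long.BYTES) {
            throw new IllegalArgumentException("expected 8 bytes, got " + length);
        }
        return ByteBuffer.wrap(buf, offset, length).getLong();
    }

    public static void main(String[] args) {
        // Simulate a cell value written with the plain big-endian layout.
        byte[] cell = ByteBuffer.allocate(Long.BYTES).putLong(-42L).array();
        System.out.println(decode(cell, 0, cell.length));
    }
}
```

Inside a UDF, this logic would presumably sit in evaluate() once the column's bytes are available; whether they are best read from the Tuple or from the pointer argument is exactly the open question above.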
I really appreciate your help.
- Anchal
On Wednesday, July 29, 2015 8:43 AM, James Taylor <[email protected]>
wrote:
Hi Anchal,
Phoenix depends on the sort order of the serialized bytes to match
the natural sort order of the column value. The HBase Bytes.toBytes() methods
do not meet this requirement: they write plain two's-complement bytes, so
negative numbers (whose leading bit is set) will sort after positive numbers
when the bytes are compared as unsigned values. About the only option you have
in this case is to create a new Phoenix table and copy the data over from your
old table. If the data is being created by some external process, then you'd
need to change it to use the PDataType toBytes() method instead of the HBase
Bytes.toBytes() method.
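To make the sort-order requirement concrete, here is a small self-contained sketch. It compares one negative and one positive value three ways: numerically, as raw big-endian bytes (the layout Bytes.toBytes(long) writes), and as bytes with the sign bit flipped. The sign-bit flip is shown as one common order-preserving scheme; that Phoenix's signed types work this way is an assumption here, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical demo class, not part of Phoenix or HBase.
class SortOrderDemo {
    // Big-endian two's-complement bytes, the layout Bytes.toBytes(long) produces.
    static byte[] rawBytes(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
    }

    // Same layout with the sign bit flipped (assumed order-preserving scheme).
    static byte[] signFlippedBytes(long v) {
        return rawBytes(v ^ Long.MIN_VALUE);
    }

    // Unsigned lexicographic comparison of byte arrays (requires Java 9+).
    static int compareBytes(byte[] a, byte[] b) {
        return Arrays.compareUnsigned(a, b);
    }

    public static void main(String[] args) {
        long neg = -5L, pos = 3L;
        // Numerically, -5 is less than 3.
        System.out.println("numeric:      " + Long.compare(neg, pos));
        // Raw bytes: -5 starts with 0xFF, so it compares greater than 3.
        System.out.println("raw bytes:    "
            + Integer.signum(compareBytes(rawBytes(neg), rawBytes(pos))));
        // Sign-flipped bytes: -5 starts with 0x7F and 3 with 0x80, matching numeric order.
        System.out.println("sign-flipped: "
            + Integer.signum(compareBytes(signFlippedBytes(neg), signFlippedBytes(pos))));
    }
}
```

For mixed-sign values the raw encoding and the numeric order disagree, which is why such a column cannot be used directly wherever the byte order is expected to match the value order.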
It's possible that Phoenix could relax this constraint for columns that are not
part of the primary key constraint - please file a JIRA for this. We'd need to
define a new PDataType (it could share almost all of its implementation with
PUnsignedLong) and handle ORDER BY differently for these types.
Thanks,
James
On Tue, Jul 28, 2015 at 10:40 PM, Anchal Agrawal <[email protected]> wrote:
Hi,
I'm creating a Phoenix view of an existing HBase table on v4.4.0.
Command: CREATE VIEW "table_name" (pk VARBINARY PRIMARY KEY, "cf"."col"
DATA_TYPE_HERE);
The col column has long values that are serialized by Bytes.toBytes(long) but
since some values are negative, I can't use UNSIGNED_LONG. I tried BIGINT
instead since the documentation says that it maps to java.lang.Long, but that
resulted in incorrect column values. The datatype documentation for
UNSIGNED_LONG says "use the regular signed type instead" - which datatype is
this referring to? LONG isn't supported.
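One plausible explanation for the incorrect BIGINT values, sketched below under a stated assumption: if the reader expects the sign bit to have been flipped when the value was written (an assumption about how Phoenix encodes signed longs, not something confirmed in this thread), then bytes written by Bytes.toBytes(long) decode to a value that is off by 2^63. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class, not part of Phoenix or HBase.
class BigintMismatchDemo {
    // Writer side: plain big-endian two's-complement, as Bytes.toBytes(long) does.
    static byte[] rawBytes(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
    }

    // Reader side (assumed): undoes a sign-bit flip that the writer never applied.
    static long decodeSignFlipped(byte[] b) {
        return ByteBuffer.wrap(b).getLong() ^ Long.MIN_VALUE;
    }

    public static void main(String[] args) {
        for (long v : new long[] {5L, -5L}) {
            // The decoded value differs from the stored one in the top bit only.
            System.out.println(v + " -> " + decodeSignFlipped(rawBytes(v)));
        }
    }
}
```

Under this assumption, small positive values come back as large negative ones and vice versa, which would match the symptom of wrong column values without any error being raised.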
I could create the view with the column values as byte arrays and write a UDF to
extract long values, but I think that will add to the latency. Is there a way
around this? I really appreciate your help.
Sincerely,
Anchal Agrawal