You have to remember that HBase doesn't enforce any sort of typing. That's why this can be difficult.
You'd have to write a coprocessor to enforce a schema on a table.
Even then YMMV if you're writing JSON structures to a column because while the contents of the structures could be the same, the actual strings could differ.

HTH

-Mike

On Jun 27, 2013, at 4:41 PM, Kristoffer Sjögren <[email protected]> wrote:

> I realize standard comparators cannot solve this.
>
> However I do know the type of each column so writing custom list
> comparators for boolean, char, byte, short, int, long, float, double seems
> quite straightforward.
>
> Long arrays, for example, are stored as a byte array with 8 bytes per item
> so a comparator might look like this.
>
> public class LongsComparator extends WritableByteArrayComparable {
>     public int compareTo(byte[] value, int offset, int length) {
>         long[] values = BytesUtils.toLongs(value, offset, length);
>         for (long longValue : values) {
>             if (longValue == val) {
>                 return 0;
>             }
>         }
>         return 1;
>     }
> }
>
> public static long[] toLongs(byte[] value, int offset, int length) {
>     int num = (length - offset) / 8;
>     long[] values = new long[num];
>     for (int i = offset; i < num; i++) {
>         values[i] = getLong(value, i * 8);
>     }
>     return values;
> }
>
>
> Strings are similar but would require charset and length for each string.
>
> public class StringsComparator extends WritableByteArrayComparable {
>     public int compareTo(byte[] value, int offset, int length) {
>         String[] values = BytesUtils.toStrings(value, offset, length);
>         for (String stringValue : values) {
>             if (val.equals(stringValue)) {
>                 return 0;
>             }
>         }
>         return 1;
>     }
> }
>
> public static String[] toStrings(byte[] value, int offset, int length) {
>     ArrayList<String> values = new ArrayList<String>();
>     int idx = 0;
>     ByteBuffer buffer = ByteBuffer.wrap(value, offset, length);
>     while (idx < length) {
>         int size = buffer.getInt();
>         byte[] bytes = new byte[size];
>         buffer.get(bytes);
>         values.add(new String(bytes));
>         idx += 4 + size;
>     }
>     return values.toArray(new String[values.size()]);
> }
>
>
> Am I on the right track or maybe overlooking some implementation details?
> Not really sure how to target each comparator to a specific column value?
>
>
> On Thu, Jun 27, 2013 at 9:21 PM, Michael Segel
> <[email protected]> wrote:
>
>> Not an easy task.
>>
>> You first need to determine how you want to store the data within a column
>> and/or apply a type constraint to a column.
>>
>> Even if you use JSON records to store your data within a column, does an
>> equality comparator exist? If not, you would have to write one.
>> (I kinda think that one may already exist...)
>>
>>
>> On Jun 27, 2013, at 12:59 PM, Kristoffer Sjögren <[email protected]> wrote:
>>
>>> Hi
>>>
>>> Working with the standard filtering mechanism to scan rows that have
>>> columns matching certain criteria.
>>>
>>> There are columns of numeric (integer and decimal) and string types. These
>>> columns are single or multi-valued like "1", "2", "1,2,3", "a", "b" or
>>> "a,b,c" - not sure what the separator would be in the case of list types.
>>> Maybe none?
>>>
>>> I would like to compose the following queries to filter out rows that do
>>> not match.
>>>
>>> - contains(String column, String value)
>>>   Single valued column that String.contains() provided value.
>>>
>>> - equal(String column, Object value)
>>>   Single valued column that Object.equals() provided value.
>>>   Value is either string or numeric type.
>>>
>>> - greaterThan(String column, java.lang.Number value)
>>>   Single valued column that > provided numeric value.
>>>
>>> - in(String column, Object value...)
>>>   Multi-valued column has values that Object.equals() all provided values.
>>>   Values are of string or numeric type.
>>>
>>> How would I design a schema that can take advantage of the already existing
>>> filters and comparators to accomplish this?
>>>
>>> Already looked at the string and binary comparators but fail to see how to
>>> solve this in a clean way for multi-valued column values.
>>>
>>> I'm aware of custom filters but would like to avoid it if possible.
>>>
>>> Cheers,
>>> -Kristoffer
>>
>>
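
For reference, here is a minimal standalone sketch of the two decoders quoted above, with no HBase dependency so it can be run on its own. It assumes big-endian 8-byte longs and strings stored as a 4-byte length prefix followed by UTF-8 bytes; the class name ListCodecSketch and the containsLong/containsString helpers are made up for illustration and are not part of HBase. Unlike the quoted toLongs, it derives the item count from length alone and fills the result from index 0, and it decodes strings with an explicit charset instead of the platform default.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, for illustration only (not an HBase API).
final class ListCodecSketch {

    // Decode `length` bytes starting at `offset` as big-endian 8-byte longs.
    // Assumption: length is a multiple of 8; trailing bytes are ignored.
    public static long[] toLongs(byte[] value, int offset, int length) {
        int num = length / Long.BYTES;           // item count depends on length only
        long[] values = new long[num];
        ByteBuffer buffer = ByteBuffer.wrap(value, offset, length);
        for (int i = 0; i < num; i++) {          // output index starts at 0, not offset
            values[i] = buffer.getLong();
        }
        return values;
    }

    // Decode a sequence of [4-byte length][UTF-8 bytes] records.
    // Malformed input (bad length prefix) throws a runtime exception.
    public static String[] toStrings(byte[] value, int offset, int length) {
        List<String> values = new ArrayList<>();
        ByteBuffer buffer = ByteBuffer.wrap(value, offset, length);
        while (buffer.remaining() >= Integer.BYTES) {
            int size = buffer.getInt();
            byte[] bytes = new byte[size];
            buffer.get(bytes);
            values.add(new String(bytes, StandardCharsets.UTF_8));
        }
        return values.toArray(new String[0]);
    }

    // Membership test a list comparator could delegate to.
    public static boolean containsLong(byte[] value, int offset, int length, long wanted) {
        for (long candidate : toLongs(value, offset, length)) {
            if (candidate == wanted) {
                return true;
            }
        }
        return false;
    }

    public static boolean containsString(byte[] value, int offset, int length, String wanted) {
        return Arrays.asList(toStrings(value, offset, length)).contains(wanted);
    }

    public static void main(String[] args) {
        // Two leading padding bytes exercise a non-zero offset.
        byte[] raw = ByteBuffer.allocate(2 + 2 * Long.BYTES)
                .putShort((short) 7).putLong(1L).putLong(42L).array();
        System.out.println(Arrays.toString(toLongs(raw, 2, 2 * Long.BYTES)));
        System.out.println(containsLong(raw, 2, 2 * Long.BYTES, 42L));
    }
}
```

A comparator's compareTo could then return 0 when containsLong or containsString is true and a non-zero value otherwise. As for targeting one column: in the HBase releases of that period, I believe the usual way is to pass the comparator to a SingleColumnValueFilter together with the column family and qualifier, but check that against the API docs for the version in use.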
