Ok... If you want to do type checking and schema enforcement, you will need to do this as a coprocessor.

The quick and dirty way (not recommended) would be to hard-code the schema into the coprocessor code.

A better way: at startup, load the set of known table schemas, managed in ZK, where each schema is a map of column qualifier to data type. (If the column holds JSON, you need a separate lookup to get the record's schema.) Then a single Java class does the lookup and handles the comparators for the known data types.

Does this make sense? (Sorry, I was kind of thinking this out as I typed the response, but it should work.)
At least it would be a design approach I would take. YMMV.

Having said that, I expect someone to say it's a bad idea and that they have a better solution.

HTH

-Mike

On Jun 27, 2013, at 5:13 PM, Kristoffer Sjögren <sto...@gmail.com> wrote:

> I see your point. Everything is just bytes.
>
> However, the schema is known and every row is formatted according to this
> schema, although some columns may not exist, that is, no value exists for
> this property on this row.
>
> So if I'm able to apply these "typed comparators" to the right cell values
> it may be possible? But I can't find a filter that targets specific columns?
>
> Seems like all filters scan every column/qualifier and there is no way of
> knowing what column is currently being evaluated?
>
>
> On Thu, Jun 27, 2013 at 11:51 PM, Michael Segel
> <michael_se...@hotmail.com> wrote:
>
>> You have to remember that HBase doesn't enforce any sort of typing.
>> That's why this can be difficult.
>>
>> You'd have to write a coprocessor to enforce a schema on a table.
>> Even then YMMV if you're writing JSON structures to a column because while
>> the contents of the structures could be the same, the actual strings could
>> differ.
>>
>> HTH
>>
>> -Mike
>>
>> On Jun 27, 2013, at 4:41 PM, Kristoffer Sjögren <sto...@gmail.com> wrote:
>>
>>> I realize standard comparators cannot solve this.
>>>
>>> However I do know the type of each column so writing custom list
>>> comparators for boolean, char, byte, short, int, long, float, double seems
>>> quite straightforward.
>>>
>>> Long arrays, for example, are stored as a byte array with 8 bytes per item
>>> so a comparator might look like this.
>>>
>>> public class LongsComparator extends WritableByteArrayComparable {
>>>     // val is the long being searched for (field and constructor not shown)
>>>     public int compareTo(byte[] value, int offset, int length) {
>>>         long[] values = BytesUtils.toLongs(value, offset, length);
>>>         for (long longValue : values) {
>>>             if (longValue == val) {
>>>                 return 0;
>>>             }
>>>         }
>>>         return 1;
>>>     }
>>> }
>>>
>>> public static long[] toLongs(byte[] value, int offset, int length) {
>>>     // length is the number of bytes in the cell value, so it alone sets the count
>>>     int num = length / 8;
>>>     long[] values = new long[num];
>>>     for (int i = 0; i < num; i++) {
>>>         // read the i-th long, starting from offset
>>>         values[i] = getLong(value, offset + i * 8);
>>>     }
>>>     return values;
>>> }
>>>
>>>
>>> Strings are similar but would require charset and length for each string.
>>>
>>> public class StringsComparator extends WritableByteArrayComparable {
>>>     // val is the string being searched for (field and constructor not shown)
>>>     public int compareTo(byte[] value, int offset, int length) {
>>>         String[] values = BytesUtils.toStrings(value, offset, length);
>>>         for (String stringValue : values) {
>>>             if (val.equals(stringValue)) {
>>>                 return 0;
>>>             }
>>>         }
>>>         return 1;
>>>     }
>>> }
>>>
>>> public static String[] toStrings(byte[] value, int offset, int length) {
>>>     ArrayList<String> values = new ArrayList<String>();
>>>     int idx = 0;
>>>     ByteBuffer buffer = ByteBuffer.wrap(value, offset, length);
>>>     while (idx < length) {
>>>         // each entry is a 4-byte size followed by that many bytes
>>>         int size = buffer.getInt();
>>>         byte[] bytes = new byte[size];
>>>         buffer.get(bytes);
>>>         // uses the platform default charset; an explicit charset would be safer
>>>         values.add(new String(bytes));
>>>         idx += 4 + size;
>>>     }
>>>     return values.toArray(new String[values.size()]);
>>> }
>>>
>>>
>>> Am I on the right track or maybe overlooking some implementation details?
>>> Not really sure how to target each comparator to a specific column value?
>>>
>>>
>>> On Thu, Jun 27, 2013 at 9:21 PM, Michael Segel <michael_se...@hotmail.com> wrote:
>>>
>>>> Not an easy task.
>>>>
>>>> You first need to determine how you want to store the data within a column
>>>> and/or apply a type constraint to a column.
>>>>
>>>> Even if you use JSON records to store your data within a column, does an
>>>> equality comparator exist? If not, you would have to write one.
>>>> (I kinda think that one may already exist...)
>>>>
>>>>
>>>> On Jun 27, 2013, at 12:59 PM, Kristoffer Sjögren <sto...@gmail.com> wrote:
>>>>
>>>>> Hi
>>>>>
>>>>> Working with the standard filtering mechanism to scan rows that have
>>>>> columns matching certain criteria.
>>>>>
>>>>> There are columns of numeric (integer and decimal) and string types. These
>>>>> columns are single or multi-valued like "1", "2", "1,2,3", "a", "b" or
>>>>> "a,b,c" - not sure what the separator would be in the case of list types.
>>>>> Maybe none?
>>>>>
>>>>> I would like to compose the following queries to filter out rows that do
>>>>> not match.
>>>>>
>>>>> - contains(String column, String value)
>>>>>   Single valued column that String.contains() provided value.
>>>>>
>>>>> - equal(String column, Object value)
>>>>>   Single valued column that Object.equals() provided value.
>>>>>   Value is either string or numeric type.
>>>>>
>>>>> - greaterThan(String column, java.lang.Number value)
>>>>>   Single valued column that > provided numeric value.
>>>>>
>>>>> - in(String column, Object value...)
>>>>>   Multi-valued column have values that Object.equals() all provided values.
>>>>>   Values are of string or numeric type.
>>>>>
>>>>> How would I design a schema that can take advantage of the already existing
>>>>> filters and comparators to accomplish this?
>>>>>
>>>>> Already looked at the string and binary comparators but fail to see how to
>>>>> solve this in a clean way for multi-valued column values.
>>>>>
>>>>> I'm aware of custom filters but would like to avoid them if possible.
>>>>>
>>>>> Cheers,
>>>>> -Kristoffer
>>>>
>>>>
>>
>>
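For what it's worth, the lookup-plus-typed-comparator idea at the top of this message could be sketched roughly as below. This is plain Java with no HBase or ZooKeeper dependency, so it only shows the shape of the design, not a working coprocessor. All the names here (TypedColumns, ColumnType, register, contains) are made up for illustration, and the schema map is filled in by hand where the real thing would load it from ZK. It also assumes big-endian 8-byte longs and strings stored as a 4-byte length followed by UTF-8 bytes; the UTF-8 part is an assumption, since the thread never settles on a charset.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: column qualifier -> type lookup plus typed "contains" checks. */
public class TypedColumns {

    /** Hypothetical set of supported multi-valued column types. */
    public enum ColumnType { LONG_LIST, STRING_LIST }

    // Column qualifier -> data type. Filled in by hand here; in the design
    // described above this map would be loaded from ZK at startup.
    private final Map<String, ColumnType> schema = new HashMap<>();

    public void register(String qualifier, ColumnType type) {
        schema.put(qualifier, type);
    }

    /** Decodes length/8 big-endian longs starting at offset. */
    public static long[] toLongs(byte[] value, int offset, int length) {
        ByteBuffer buffer = ByteBuffer.wrap(value, offset, length);
        long[] values = new long[length / Long.BYTES];
        for (int i = 0; i < values.length; i++) {
            values[i] = buffer.getLong();
        }
        return values;
    }

    /** Decodes entries laid out as a 4-byte size followed by that many UTF-8 bytes. */
    public static List<String> toStrings(byte[] value, int offset, int length) {
        ByteBuffer buffer = ByteBuffer.wrap(value, offset, length);
        List<String> values = new ArrayList<>();
        while (buffer.remaining() >= Integer.BYTES) {
            byte[] bytes = new byte[buffer.getInt()];
            buffer.get(bytes);
            values.add(new String(bytes, StandardCharsets.UTF_8));
        }
        return values;
    }

    /** Returns true if the cell, decoded per the column's registered type, holds wanted. */
    public boolean contains(String qualifier, byte[] cell, Object wanted) {
        ColumnType type = schema.get(qualifier);
        if (type == null || cell == null) {
            return false; // unknown column or missing value: no match
        }
        switch (type) {
            case LONG_LIST:
                if (!(wanted instanceof Long)) {
                    return false;
                }
                long target = (Long) wanted;
                for (long v : toLongs(cell, 0, cell.length)) {
                    if (v == target) {
                        return true;
                    }
                }
                return false;
            case STRING_LIST:
                return toStrings(cell, 0, cell.length).contains(wanted);
            default:
                return false;
        }
    }
}
```

The point of keying on the qualifier is that the caller (a filter or coprocessor hook that knows which column it is looking at) picks the decoder, rather than every comparator being applied to every cell. How that qualifier gets passed in on the HBase side is left out here.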