Gustavo Anatoly helped me a lot here :) Sorry for chatting off the mailing list...
Summarizing: James Taylor answered why there's a (non-intuitive) _0 column qualifier in the default column family here:
https://groups.google.com/forum/#!topic/phoenix-hbase-user/wCeljAvLekc.
Thus, if I understood correctly, I can avoid caring about that field (unless I perform inserts bypassing Phoenix).

The other doubt I had was about how to read arrays from byte[] in mapreduce jobs. Fortunately, that is not very difficult if I know the type in advance (e.g. a VARCHAR array). For example:

    public void map(final ImmutableBytesWritable rowKey, final Result columns,
        final Context context) throws IOException, InterruptedException {
      ...
      final byte[] bytes =
          columns.getValue(Bytes.toBytes(columnFamily), Bytes.toBytes(columnQual));
      final PhoenixArray resultArr =
          (PhoenixArray) PDataType.VARCHAR_ARRAY.toObject(bytes, 0, bytes.length);
      ...
    }

Thanks again to Gustavo for the help,
Flavio

On Thu, Sep 11, 2014 at 4:31 PM, Krishna <research...@gmail.com> wrote:

> I assume you are referring to the bulk loader. The "-a" option allows you
> to pass an array delimiter.
>
> On Thursday, September 11, 2014, Flavio Pompermaier <pomperma...@okkam.it>
> wrote:
>
>> Any help about this?
>> What if I save a field as an array? How could I read it from a mapreduce
>> job? Is there a separator char to use for splitting, or what?
>>
>> On Tue, Sep 9, 2014 at 10:36 AM, Flavio Pompermaier
>> <pomperma...@okkam.it> wrote:
>>
>>> Hi to all,
>>>
>>> I'd like to know which is the correct way to run a mapreduce job on a
>>> table managed by Phoenix to put data in another table (also managed by
>>> Phoenix).
>>> Is it sufficient to read data contained in column family 0 (like 0:id,
>>> 0:value) and create insert statements in the reducer to put things
>>> correctly in the output table?
>>> Should I filter rows containing some special value for column 0:_0?
>>>
>>> Best,
>>> FP
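P.S. A note on Krishna's point, for anyone finding this thread later: the bulk loader's "-a" delimiter applies to array fields in the *input* CSV, which is a separate concern from reading cells already stored by Phoenix (for those, PDataType.toObject as in the snippet in the thread is the way to go). A minimal plain-Java sketch of splitting such an input field, assuming a ":" delimiter purely for illustration (the class and method names here are made up, not Phoenix API):

```java
import java.util.regex.Pattern;

public class ArrayFieldSplit {

    /**
     * Splits one CSV field into its array elements, given the same
     * delimiter that was passed to the bulk loader's -a option.
     * The delimiter value is whatever you chose; ":" below is an example.
     */
    public static String[] splitArrayField(String field, String arrayDelimiter) {
        // Pattern.quote so delimiters like "|" are not read as regex syntax;
        // limit -1 keeps trailing empty elements
        return field.split(Pattern.quote(arrayDelimiter), -1);
    }

    public static void main(String[] args) {
        String[] elems = splitArrayField("red:green:blue", ":");
        System.out.println(elems.length);  // 3
        System.out.println(elems[1]);      // green
    }
}
```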