The best way is to go through a supported interface, as the existing map-reduce integration does. You can define your schema through a DDL statement on a connectionless connection (see ConnectionlessTest.testConnectionlessUpsert), upsert values into it, and use the semi-public APIs in PhoenixRuntime to get the resulting List<KeyValue>.
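For example, a rough sketch (untested; the table, columns, and values are just placeholders, and I'm assuming PhoenixRuntime.getUncommittedDataIterator(Connection) as the semi-public API):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.util.Iterator;
    import java.util.List;
    import org.apache.hadoop.hbase.KeyValue;
    import org.apache.hadoop.hbase.util.Pair;
    import org.apache.phoenix.util.PhoenixRuntime;

    public class ConnectionlessKeyValues {
        public static void main(String[] args) throws Exception {
            // "none" in place of a quorum gives a connectionless
            // connection, so no live cluster is needed
            Connection conn = DriverManager.getConnection("jdbc:phoenix:none");
            conn.createStatement().execute(
                "CREATE TABLE t (host VARCHAR NOT NULL, ts BIGINT NOT NULL, v DOUBLE" +
                " CONSTRAINT pk PRIMARY KEY (host, ts))");
            PreparedStatement upsert = conn.prepareStatement("UPSERT INTO t VALUES(?,?,?)");
            upsert.setString(1, "foo");
            upsert.setLong(2, 42L);
            upsert.setDouble(3, 1.5d);
            upsert.execute();
            // Before committing, pull out the KeyValues Phoenix would write
            Iterator<Pair<byte[], List<KeyValue>>> dataIterator =
                PhoenixRuntime.getUncommittedDataIterator(conn);
            while (dataIterator.hasNext()) {
                for (KeyValue kv : dataIterator.next().getSecond()) {
                    System.out.println(kv); // row key + column + value
                }
            }
            conn.rollback(); // discard the pending mutations
        }
    }

The rollback at the end discards the pending mutations, so nothing is ever written anywhere.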
Our serialization format is described here: http://phoenix.apache.org/language/datatypes.html. Primitive types are combined in the row key by concatenating their serialized values. A zero byte is used as the terminator of variable-length types, with any trailing zero bytes stripped; fixed-length types have no terminator byte.

Thanks,
James
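For example, a row key for a hypothetical table with PRIMARY KEY (host VARCHAR, ts BIGINT) could be pulled apart on the map-reduce side roughly like this (an untested sketch; per the datatypes page above, BIGINT is serialized as 8 bytes with the sign bit flipped so that the byte order matches the numeric order):

    import java.nio.charset.StandardCharsets;
    import org.apache.hadoop.hbase.util.Bytes;

    public class RowKeyDecoder {
        // Decode a row key for a hypothetical table defined as:
        //   CREATE TABLE t (host VARCHAR NOT NULL, ts BIGINT NOT NULL, ...
        //                   CONSTRAINT pk PRIMARY KEY (host, ts))
        public static void decode(byte[] row) {
            // Variable-length VARCHAR: scan to its zero-byte terminator
            int sep = 0;
            while (sep < row.length && row[sep] != 0) sep++;
            String host = new String(row, 0, sep, StandardCharsets.UTF_8);
            // Fixed-length BIGINT follows, with no terminator of its own;
            // undo the sign-bit flip to recover the long value
            long ts = Bytes.toLong(row, sep + 1) ^ Long.MIN_VALUE;
            System.out.println(host + " / " + ts);
        }
    }

Note the VARCHAR needs its terminator only because another column follows it; a trailing variable-length column just runs to the end of the row key.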
On Sun, Jan 25, 2015 at 7:38 PM, Lin Feng <[email protected]> wrote:

> Thanks Eli, James for your feedback!
>
> @James, regarding your second point, can you point me to some
> documentation/example of interpreting a Phoenix-generated row key?
>
> Thanks again!
>
> On Sun, Jan 25, 2015 at 12:49 PM, James Taylor <[email protected]> wrote:
>>
>> You can also use our map reduce integration, which will use secondary
>> indexes automatically/transparently, just as is done when using SQL APIs.
>>
>> If you use map reduce outside of this against secondary indexes, then
>> you'll need to interpret the row key correctly.
>>
>> On Sunday, January 25, 2015, Eli Levine <[email protected]> wrote:
>>>
>>> Yes, should be possible, since secondary indexes are themselves Phoenix
>>> tables.
>>>
>>> > On Jan 25, 2015, at 5:42 AM, Lin Feng <[email protected]> wrote:
>>> >
>>> > We have been using home-grown secondary indexes on HBase tables in our
>>> > application. I am wondering: if we switch to Phoenix and let Phoenix
>>> > create and maintain new secondary indexes, can a MapReduce job take
>>> > advantage of these indexes without using the Phoenix API?