Over in HBASE-20334 I'm trying to include a simple example program
that does a series of Puts to a test table.
I figured I'd try to use the Cell API to do this, but I've gotten confused.
Let's say I just want to insert SOME_NUMBER rows, with as little
specified as possible. Using just public API:
    final CellBuilder builder =
        CellBuilderFactory.create(CellBuilderType.SHALLOW_COPY);
    for (int i = 0; i < SOME_NUMBER; i++) {
      builder.clear();
      final byte[] row = Bytes.toBytes(i);
      final Put put = new Put(row);
      builder.setRow(row);
      builder.setFamily(FAMILY_BYTES);
      put.add(builder.build());
      table.put(put);
    }
The above fails with an IllegalArgumentException:
    Exception in thread "main" java.lang.IllegalArgumentException: The type can't be NULL
        at org.apache.hadoop.hbase.ExtendedCellBuilderImpl.checkBeforeBuild(ExtendedCellBuilderImpl.java:143)
        at org.apache.hadoop.hbase.ExtendedCellBuilderImpl.build(ExtendedCellBuilderImpl.java:151)
        at org.apache.hadoop.hbase.ExtendedCellBuilderImpl.build(ExtendedCellBuilderImpl.java:25)
Looking at the public API javadocs, I don't know what I'm supposed to
pass to setType. The version that takes a byte gives no explanation,
and the Cell.Type enum is IA.Private, so it isn't in the javadocs.
I thought maybe the ref guide would have an example I could use, but
all the Put examples there use the method that takes a bunch of byte
arrays instead of a cell.[1]
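
For comparison, here is a minimal sketch of the byte-array form, assuming
the HBase 2.x client jars are on the classpath. The class name
AddColumnSketch, the method buildPut and its String parameters are made
up for illustration. It only builds the Put on the client side, so no
cluster is needed to run it.

```java
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical helper for illustration; not part of HBase.
public final class AddColumnSketch {

  private AddColumnSketch() {}

  // Builds a Put for one row holding a single family:qualifier = value cell.
  public static Put buildPut(String row, String family, String qualifier, String value) {
    final Put put = new Put(Bytes.toBytes(row));
    // addColumn(byte[], byte[], byte[]) takes family, qualifier, value.
    put.addColumn(Bytes.toBytes(family), Bytes.toBytes(qualifier), Bytes.toBytes(value));
    return put;
  }
}
```

The resulting Put can be handed to Table#put(Put) as usual.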
If I update the example to use Cell.Type.Put then it works.
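
For reference, a minimal sketch of that working variant, assuming the
HBase 2.x client jars are on the classpath. The class name CellPutSketch
and the method buildPuts are made up for illustration, and the sketch
still leans on Cell.Type.Put, the enum whose audience is in question. It
collects the Puts instead of writing them, so it runs without a cluster.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellBuilder;
import org.apache.hadoop.hbase.CellBuilderFactory;
import org.apache.hadoop.hbase.CellBuilderType;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical helper for illustration; not part of HBase.
public final class CellPutSketch {

  private CellPutSketch() {}

  // Builds `count` Puts, one per row, each carrying a single cell that
  // has only row, family and type set.
  public static List<Put> buildPuts(byte[] family, int count) {
    final CellBuilder builder =
        CellBuilderFactory.create(CellBuilderType.SHALLOW_COPY);
    final List<Put> puts = new ArrayList<>(count);
    for (int i = 0; i < count; i++) {
      builder.clear();
      final byte[] row = Bytes.toBytes(i);
      final Put put = new Put(row);
      builder.setRow(row);
      builder.setFamily(family);
      // Without this call, build() throws "The type can't be NULL".
      builder.setType(Cell.Type.Put);
      try {
        put.add(builder.build());
      } catch (IOException e) {
        // Put#add(Cell) declares IOException; rethrow unchecked for brevity.
        throw new UncheckedIOException(e);
      }
      puts.add(put);
    }
    return puts;
  }
}
```

The returned list can be written in one call with Table#put(List<Put>),
or each Put can be sent individually as in the loop above.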
Before I go to update the docs, can someone give me some pointers? How
am I supposed to get a Cell.Type through public APIs?
Should we have folks avoid using the Put#add(Cell) method if they
don't already have a Cell instance?
If Cell is fine to use, could we update this to be less repetitive?
-busbey
[1]: Additionally, with the exception of the Spark integration chapter,
the Put examples also appear to be wrong, since they refer to the
method as "add(byte[],byte[],byte[])" when it's now
"addColumn(byte[],byte[],byte[])".