Re: Why do we need an empty column when doing upsert?

Gabriel Reid Mon, 21 Dec 2015 06:47:21 -0800

Hi William,

The empty column is needed to ensure that one known column is stored
for every row, whatever the values of its other columns.

As you may know, data is stored in HBase as KeyValues, meaning that
the full row key is stored for each column value. This also implies
that the row key is not stored at all unless there is at least one
column stored.

Now consider a JDBC row which has an integer primary key and several
columns which are all null. In order to be able to store the primary
key, a KeyValue needs to be stored to show that the row is present at
all. That KeyValue is the empty column that you've noticed. It allows
doing a "SELECT * FROM TABLE" and receiving records for all rows, even
those whose non-pk columns are null.
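
To make this concrete, here is a toy model in Python. It is only a
sketch, not Phoenix or HBase code: the dict-based store and the helper
names (upsert, row_keys) are made up for illustration, and the marker
is written with an empty value as an assumption. The only thing taken
from the real system is the '_0' qualifier mentioned in the question.

```python
# Toy model of an HBase table: a cell exists only when a value is stored.
# Keys are (row_key, column_qualifier); row keys are not recorded anywhere else.
EMPTY_COLUMN = "_0"  # qualifier mentioned in the original question


def upsert(store, row_key, columns, add_empty_column=True):
    """Store non-null columns as cells; optionally add the empty marker cell."""
    for qualifier, value in columns.items():
        if value is not None:  # nulls are simply not stored
            store[(row_key, qualifier)] = value
    if add_empty_column:
        # Assumption for this sketch: the marker cell carries an empty value.
        store[(row_key, EMPTY_COLUMN)] = b""


def row_keys(store):
    """Rows that a full scan can see: those with at least one cell."""
    return sorted({row_key for row_key, _ in store})


without_marker = {}
upsert(without_marker, 1, {"a": None, "b": None}, add_empty_column=False)
upsert(without_marker, 2, {"a": "x", "b": None}, add_empty_column=False)

with_marker = {}
upsert(with_marker, 1, {"a": None, "b": None})
upsert(with_marker, 2, {"a": "x", "b": None})

print(row_keys(without_marker))  # row 1 has no cells, so it does not appear
print(row_keys(with_marker))     # both rows appear
```

Without the marker, row 1 leaves no trace in the store, so nothing can
report that its primary key was ever written.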

The same issue comes up even if only one column is null for some (or
all) records. A scan over Phoenix will include the empty column to
ensure that rows that only consist of the primary key (and have null
for all non-key columns) will be included in a scan result.
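
The scan side can be sketched the same way. Again this is a toy model
under stated assumptions, not what Phoenix actually executes: scan()
and select() are hypothetical helpers, and the three sample rows are
invented. It shows a projection of one column losing rows when the
marker is left out of the scan.

```python
EMPTY_COLUMN = "_0"  # qualifier mentioned in the original question

# Cells as (row_key, qualifier) -> value. Nulls are absent; every row has the marker.
cells = {
    (1, EMPTY_COLUMN): b"",
    (2, EMPTY_COLUMN): b"", (2, "a"): "x",
    (3, EMPTY_COLUMN): b"", (3, "a"): "y", (3, "b"): "z",
}


def scan(cells, qualifiers):
    """Return {row_key: {qualifier: value}} for rows having any requested cell."""
    result = {}
    for (row_key, qualifier), value in cells.items():
        if qualifier in qualifiers:
            result.setdefault(row_key, {})[qualifier] = value
    return result


def select(cells, column):
    """Mimic 'SELECT pk, column': scan the column plus the marker."""
    rows = scan(cells, {column, EMPTY_COLUMN})
    # A missing cell is reported as None, i.e. SQL NULL.
    return [(row_key, found.get(column)) for row_key, found in sorted(rows.items())]


print(sorted(scan(cells, {"b"})))  # only rows that have a cell for "b"
print(select(cells, "b"))          # every row, with None where "b" is null
```

Scanning only "b" returns row 3 alone, while adding the marker to the
scan brings back rows 1 and 2 with a null for "b".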

- Gabriel

On Mon, Dec 21, 2015 at 2:58 PM, 杨晗 <yhxx...@163.com> wrote:
> hi all:
>      I've been reading the Phoenix source code recently, and I found that
> PRowImpl.toRowMutations() always adds an empty column named '_0' for
> non-delete upserts. Why?
>      I read the comment but I didn't quite understand it. Could someone give
> me an example that illustrates why an empty column is ALWAYS necessary?
>
>
>     Furthermore, in some cases I have to access a Phoenix table through both
> the Phoenix client and the HBase API. If I do not add this empty column
> explicitly through the HBase API, is it OK to read this row with Phoenix?
>
>
> Thanks
> -William
