FYI, here are a couple of timestamp-related features that Phoenix supports today:
- specify/filter on timestamp of a row:
- query as of a past timestamp:
These were determined to be a good fit with SQL, and they surface some of the
power of HBase. Exposing per-cell timestamp control and multi-version
queries is difficult in the SQL model, but we're open to suggestions if it
can be done in a standard, general way.
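
For illustration only, here is a minimal sketch of how those two features are typically used from JDBC. It assumes the features in question are the ROW_TIMESTAMP column designation and the "CurrentSCN" connection property. The table EVENTS, its columns, and the helper asOf are hypothetical names, not something from this thread. The sketch only builds the connection properties and SQL strings, so it runs without a cluster.

```java
import java.util.Properties;

// A sketch under stated assumptions, not an official example.
class PhoenixTimestampSketch {
    // Connection property Phoenix reads for point-in-time ("as of") queries.
    static final String CURRENT_SCN = "CurrentSCN";

    // Hypothetical table: EVENT_TIME is tied to the HBase cell timestamp by
    // marking it ROW_TIMESTAMP inside the primary key constraint.
    static final String DDL =
        "CREATE TABLE EVENTS ("
        + " ID VARCHAR NOT NULL,"
        + " EVENT_TIME DATE NOT NULL,"
        + " PAYLOAD VARCHAR"
        + " CONSTRAINT PK PRIMARY KEY (ID, EVENT_TIME ROW_TIMESTAMP))";

    // Filtering on the row timestamp is then ordinary SQL on that column.
    static final String FILTER =
        "SELECT ID, PAYLOAD FROM EVENTS WHERE EVENT_TIME > ?";

    // Hypothetical helper: builds properties for a connection that reads
    // data as of the given epoch-millisecond timestamp.
    static Properties asOf(long epochMillis) {
        Properties props = new Properties();
        props.setProperty(CURRENT_SCN, Long.toString(epochMillis));
        return props;
    }

    public static void main(String[] args) {
        // An arbitrary epoch-millisecond value, used only as sample input.
        Properties props = asOf(1476316800000L);
        System.out.println(CURRENT_SCN + "=" + props.getProperty(CURRENT_SCN));
        System.out.println(DDL);
        System.out.println(FILTER);
        // With a real cluster you would pass props to the driver, e.g.
        // DriverManager.getConnection("jdbc:phoenix:<zk-quorum>", props);
        // the URL here is a placeholder.
    }
}
```

Both features work at row granularity, which is why they map cleanly onto SQL. Neither one returns several versions of the same cell.
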
On Sunday, October 16, 2016, William <yhxx...@163.com> wrote:
> Hi, Zhang Yang,
> I've implemented the multi-version feature in my own Phoenix branch.
> But this implementation only works in very limited scenarios, because
> there were so many things to think about when designing it. Here are
> some of the primary problems that we must solve:
> * Add new syntax to support SELECT with timestamps. We should support
> selecting a single version, multiple versions within a time range, and a
> limited number of versions. For example:
>     select * from test timestamps min, max;
>         // select all versions within the specified time range
>     select * from test timestamps ts;
>         // select a specified
>     select * from test version number;
>         // select specified number of versions
>     select * from test version number timestamps min, max;
>         // select specified number of versions within a specified time range
> Note that this is not standard SQL syntax, which is not recommended.
> * Timestamp is a Cell-level property in HBase, so we should support the
> same thing in Phoenix. But how can we allow different timestamps for
> different columns in the same row? I modified the ResultSet class and added
> some methods like 'public Map<Long, T> getAllT(index)' to return all
> selected versions for a single column. One can call this method on
> different columns of the same row to retrieve everything that is needed.
> Users must use PhoenixResultSet instead of ResultSet, which is not
> recommended either.
> * How do we handle index updates/selects for multi-version? This is a
> messy problem, so my implementation did not support multi-version for index
> * GROUP BY, ORDER BY, and nested queries/upserts are not supported.
> * For batch commits, when you upsert the same row with different
> timestamps, Phoenix only commits the last timestamp you set, so doing
> this is meaningless. I simply forbade this scenario.
> * Phoenix encodes the KVs into one Cell on the RegionServer side, but if
> we want to return multiple versions for different columns, especially
> different timestamps for different columns, we must not do the encoding. So
> we must modify the internals of Phoenix to support a brand new read path.
> Besides the huge implementation effort, IMHO, the primary problem is that
> it's not easy to implement this feature properly, as everyone may have
> different requirements. You can implement this feature yourself in your
> own branch, but I don't know the best way to support this in an
> official Phoenix release. What do you think of this? Any suggested design?
> At 2016-10-13 18:12:56, "Yang Zhang" <zhang.yang...@gmail.com
> Hello everyone
> I saw that we can create a Phoenix table from an existing HBase table,(for
> My question is whether Phoenix can support the history versions of my row?
> I am trying to use Phoenix to store some info which have a lot of common
> such as a table "T1 ( c1, c2, c3, c4 )", where many rows share the same
> c1, c2, c3, and the variable column is c4.
> Using HBase we can put 'T1', 'key1', 'f:c4', 'new value', timestamp,
> and I can get previous versions of this row. They all share the same
> c1, c2, c3, which HBase only stores once.
> Does Phoenix support querying history versions of my row?
> I found this JIRA link <https://issues.apache.org/jira/browse/PHOENIX-590>,
> which is the same as my question.
> Hadoop is used for big data, and multiple versions can help us reduce
> unnecessary data.
> I think Phoenix should support this feature too.
> If Phoenix shouldn't support multiple versions, please tell me the reason.
> Anyway, thanks for your help, First