One snag is that org.apache.hadoop.hbase.regionserver.InternalScanner does
not return a Result, so we need some other mechanism to pass the Cursor
internally.
Options I can think of:
* Keep the current Dummy cell
* Create a new Cursor Cell type based on EmptyCell for the same purpose
  (which would be slightly faster)
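
To make the second option a bit more concrete, here is a minimal,
self-contained sketch of the idea. None of the types below are HBase or
Phoenix classes: SketchCell, DataCell, CursorMarkerCell, innerNext and
extractCursorRow are hypothetical names made up for illustration, and the
real Cell / EmptyCell / InternalScanner interfaces are much larger. The only
point is that a scanner which fills a cell list can still hand a cursor row
to the layer above via a marker cell that is recognized by its type.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Illustrative model only; these are NOT HBase or Phoenix types. */
public class CursorMarkerSketch {

  /** Hypothetical stand-in for a cell; only the row key matters here. */
  interface SketchCell {
    byte[] row();
  }

  /** An ordinary data cell. */
  static final class DataCell implements SketchCell {
    private final byte[] row;
    DataCell(byte[] row) { this.row = row; }
    public byte[] row() { return row; }
  }

  /** Hypothetical marker cell: carries only a row key, identified by type. */
  static final class CursorMarkerCell implements SketchCell {
    private final byte[] row;
    CursorMarkerCell(byte[] row) { this.row = row; }
    public byte[] row() { return row; }
  }

  /**
   * Inner scanner step. Like a list-filling scanner API, it cannot return a
   * result object, so on a page timeout it adds a marker cell instead of data.
   */
  static boolean innerNext(List<SketchCell> out, byte[] currentRow, boolean timedOut) {
    out.add(timedOut ? new CursorMarkerCell(currentRow) : new DataCell(currentRow));
    return true; // assumption for the sketch: more rows always remain
  }

  /** Outer layer: a batch holding a lone marker cell is treated as a cursor. */
  static Optional<byte[]> extractCursorRow(List<SketchCell> batch) {
    if (batch.size() == 1 && batch.get(0) instanceof CursorMarkerCell) {
      return Optional.of(batch.get(0).row());
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    byte[] row = "row-42".getBytes(StandardCharsets.UTF_8);

    List<SketchCell> paged = new ArrayList<>();
    innerNext(paged, row, true);
    System.out.println("timed out: cursor present = " + extractCursorRow(paged).isPresent());

    List<SketchCell> normal = new ArrayList<>();
    innerNext(normal, row, false);
    System.out.println("normal:    cursor present = " + extractCursorRow(normal).isPresent());
  }
}
```

Checking the type rather than comparing cell contents is what would make this
cheaper than the current dummy cell check, assuming the marker never needs to
leave the region server.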



On Fri, Nov 21, 2025 at 9:04 AM Istvan Toth <[email protected]> wrote:

> That doesn't help us with the timeout setting.
>
> My plan is to remove the time field and logic from PhoenixScannerContext
> and rely on the existing one in ScannerContext as the heartbeat logic uses
> that one.
>
> The lack of a setter is not a showstopper; I'm pretty sure we can set that
> via reflection if all else fails.
>
> Istvan
>
>
> On Fri, Nov 21, 2025 at 8:04 AM Tanuj Khurana <[email protected]> wrote:
>
>> Hi Istvan,
>>
>> As part of PHOENIX-7707, Phoenix extends the ScannerContext so that we can
>> add custom fields to it.
>>
>> On Fri, 21 Nov 2025 at 10:53, Istvan Toth <[email protected]> wrote:
>>
>> > Thanks for these points, Kadir.
>> >
>> > On Thu, Nov 20, 2025 at 8:59 PM Kadir Ozdemir <
>> > [email protected]>
>> > wrote:
>> >
>> > > The row key of the dummy result is not simply the last row that was
>> > scanned
>> > > by RegionScannerImpl. It is computed by Phoenix coprocs based on the
>> > query.
>> > > For example, for an ordered group by query, it should be the last row
>> of
>> > > the last group computed. For an unordered group by query, it never
>> > changes
>> > > until the entire region is processed.  For Phoenix to be able to use
>> the
>> > > HBase cursor, the coprocs need to be able to change the cursor value.
>> > > Otherwise, there will be data integrity issues.
>> > >
>> >
>> > We can create a new synthetic Cursor Result and return that the same
>> way we
>> > create and return a new dummy Cell now.
>> > In this regard I see no difference between the two.
>> >
>> >
>> > >
>> > > Another reason for the dummy result is to provide an end-to-end fair
>> > > scheduling for Phoenix in future. Without a Phoenix level signal (the
>> > dummy
>> > > result), the Phoenix client would not know if the server already spent
>> > the
>> > > page time for a given query. I was thinking that we may be able to
>> > leverage
>> > > this to decide if the current blocked thread should be released. This
>> is
>> > a
>> > > secondary concern but I want to make sure we all understand the
>> > > implications of replacing this Phoenix level concept.
>> > >
>> >
>> > Good point.
>> >
>> > My current understanding is that if we set the *needCursorResult* flag
>> on
>> > the scan,
>> > then HBase will return all cursor results to the client, and we can use
>> > those the same way
>> > we use the dummy cells, so I see no problem here either.
>> >
>> > In fact the more I look at the HBase heartbeat/cursor implementation, the
>> > more it feels like it was
>> > tailor-made for implementing Phoenix paging (even though it was not
>> > coming from Phoenix developers)
>> >
>> > The only snag I've found so far is that HBase creates the default
>> > ScannerContext and there is no easy way
>> > to set a custom paging time on it.
>> >
>> >
>> > > On Wed, Nov 19, 2025 at 9:27 PM Istvan Toth
>> <[email protected]>
>> > > wrote:
>> > >
>> > > > I'm glad that you, as the original designer of the feature, have joined
>> > > > the discussion, Kadir.
>> > > >
>> > > > On Wed, Nov 19, 2025 at 10:56 PM Kadir Ozdemir <[email protected]>
>> > wrote:
>> > > >
>> > > > > Istvan,
>> > > > >
>> > > > > When I introduced server paging and the dummy result, Phoenix did
>> not
>> > > > > support ScannerContext. Now that Phoenix supports ScannerContext,
>> we
>> > > can
>> > > > > think about leveraging it better for server paging.
>> > > > >
>> > > >
>> > > > I realize that it was necessary for HBase 1.x. This was a good
>> design
>> > > when
>> > > > HBase 1
>> > > > support was a requirement, but specifically the dummy cell
>> > implementation
>> > > > detail
>> > > > is redundant now that HBase 2+ has native support for the same
>> > > > functionality.
>> > > >
>> > > >
>> > > > >
>> > > > > "HBase is not aware that these are dummy cells, and is considering
>> > the
>> > > > rows
>> > > > > as already processed when retrying scans after the region goes
>> away
>> > > from
>> > > > > under the scan, i.e. it restarts the scan from AFTER the dummy
>> cell's
>> > > > > rowkey, leading to the scan skipping rows."
>> > > > >
>> > > >
>> > > > This assumption is no longer true in HBase 3.
>> > > >
>> > > > The client side heartbeat logic in HBase 3 is thrown off by the
>> dummy
>> > > cells
>> > > > generated by Phoenix.
>> > > >
>> > > > I had to add this hack to get some tests in Phoenix to pass:
>> > > > https://github.com/stoty/hbase/tree/PHOENIX_DUMMY_CELL_WORKAROUND
>> > > >
>> > > >
>> > > > >
>> > > > > That is the whole purpose of the dummy result, that is, not to
>> scan
>> > the
>> > > > > rows that have been scanned already. This allows Phoenix to make
>> > > progress
>> > > > > in the presence of table region movements, otherwise every time a
>> > > region
>> > > > > moves or splits, Phoenix has to scan the region from the row key
>> of
>> > the
>> > > > > last valid result from this region instead of the last scanned
>> row.
>> > > What
>> > > > is
>> > > > > the problem with this? Consider a large region and a scan with a
>> very
>> > > > > selective filter such that a large number of rows need to be
>> scanned
>> > > > before
>> > > > > returning a valid row. One can create a sequence of region
>> movements
>> > > that
>> > > > > prevents Phoenix from making any progress for this scan
>> > > >
>> > > >
>> > > > Thanks for the explanation.
>> > > >
>> > > > I'm not questioning the usefulness of the paging design.
>> > > > The HBase community also agrees, so they have added this feature
>> > natively
>> > > > in
>> > > > HBase 2 in the form of the heartbeat/cursor feature.
>> > > >
>> > > > .
>> > > > >
>> > > > > Please note that Phoenix has some complex logic on the server side
>> > for
>> > > > > handling various SQL language features including grouping,
>> > aggregating,
>> > > > > sorting and joining. Implementing paging is much more complex in
>> > > Phoenix
>> > > > > than implementing keep alive and ScannerContext in HBase. Either
>> you
>> > > > > discovered an issue in Phoenix paging or a compatibility issue
>> > between
>> > > > > HBase 2 and HBase 3. I suggest that we understand what the issue
>> is
>> > > first
>> > > > > before replacing the dummy result.
>> > > > >
>> > > >
>> > > > It is the latter.
>> > > >
>> > > > The internal heartbeat retry logic in the HBase 3 client sees the
>> dummy
>> > > row
>> > > > and concludes that
>> > > > it should continue after an error (i.e. region move) from AFTER that
>> > row.
>> > > > (see my HBase hack above)
>> > > >
>> > > > This is different from the HBase 2 logic, which does not do this.
>> > > >
>> > > > In a way, this is related to, and sometimes caused by another
>> Phoenix
>> > > > change I have made for HBase 3:
>> > > > PHOENIX-7728 <https://issues.apache.org/jira/browse/PHOENIX-7728>
>> > > >
>> > > >
>> > >
>> >
>> https://github.com/stoty/phoenix/blob/62112097bc1f050a760225663001fc0f084d4fb4/phoenix-core-server/src/main/java/org/apache/phoenix/coprocessor/GroupedAggregateRegionObserver.java#L482
>> > > >
>> > > > However, without removing the plus/minus row logic there, even more
>> > > > tests were failing, so
>> > > > HBase 3 doesn't work with the current Phoenix dummy row logic
>> > > > either.
>> > > >
>> > > > I agree that Phoenix
>> > > > still needs to be aware of paging, and will need logic to convert
>> > > > the Cursor rowkeys returned from inner scanners into rowkeys that make
>> > > > sense for the outer scanners and client, but
>> > > > my expectation is that we can simply? convert the current Dummy cell
>> > > > logic that handles this to work with the
>> > > > cursor value instead on the server side.
>> > > >
>> > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > > On Wed, Nov 19, 2025 at 7:23 AM Tanuj Khurana <
>> [email protected]>
>> > > > wrote:
>> > > > >
>> > > > > > Hi Istvan,
>> > > > > >
>> > > > > > I agree that instead of using dummy cells, we should rely on
>> > > > > > keepalive/cursor mechanics. We have been working towards that.
>> As
>> > > part
>> > > > of
>> > > > > > PHOENIX-7707, I  propagated the scanner context all the way
>> down to
>> > > > > phoenix
>> > > > > > scanners. We can leverage that.
>> > > > > >
>> > > > > > Tanuj
>> > > > > >
>> > > > > > On Wed, 19 Nov 2025 at 20:09, Istvan Toth
>> > <[email protected]
>> > > >
>> > > > > > wrote:
>> > > > > >
>> > > > > > > Thanks for your thoughtful response, Viraj.
>> > > > > > >
>> > > > > > > I have added my thoughts below.
>> > > > > > >
>> > > > > > > On Wed, Nov 19, 2025 at 2:38 PM Viraj Jasani <
>> [email protected]
>> > >
>> > > > > wrote:
>> > > > > > >
>> > > > > > > > We also need to understand: what happens when hbase client
>> gets
>> > > > > > heartbeat
>> > > > > > > > and the region moves?
>> > > > > > > >
>> > > > > > > > I have checked that code in HBase, and the HBase client
>> seems
>> > to
>> > > > > handle
>> > > > > > > this case transparently.
>> > > > > > > We may of course find bugs, but handling that is part of the
>> > > design.
>> > > > > > >
>> > > > > > >
>> > > > > > > >
>> > > > > > > > On Wed, Nov 19, 2025 at 7:05 PM Viraj Jasani <
>> > [email protected]
>> > > >
>> > > > > > wrote:
>> > > > > > > >
>> > > > > > > > > Istvan, I think we should also involve dev@hbase and see
>> > what
>> > > > > > > guidelines
>> > > > > > > > > we are recommending so far for coprocs that would like to
>> > > > implement
>> > > > > > > > timeout
>> > > > > > > > > features for long running scans, wdyt?
>> > > > > > > >
>> > > > > > >
>> > > > > > > Based on my current understanding, if the Scan /
>> ScannerContext
>> > is
>> > > > > > > correctly set up (allows partial rows, sets the time limit and
>> > > > > requests a
>> > > > > > > cursor),
>> > > > > > > HBase will honor that and the Scan will return a heartbeat
>> result
>> > > > when
>> > > > > it
>> > > > > > > times out.
>> > > > > > >
>> > > > > > > I THINK that's all we need. Of course if we get stuck we
>> should
>> > ask
>> > > > for
>> > > > > > > help.
>> > > > > > >
>> > > > > > >
>> > > > > > > > >
>> > > > > > > > > On Wed, Nov 19, 2025 at 6:51 PM Viraj Jasani <
>> > > [email protected]
>> > > > >
>> > > > > > > wrote:
>> > > > > > > > >
>> > > > > > > > >> Thank you for starting this thread, Istvan!
>> > > > > > > > >>
>> > > > > > > > >> This is an important issue. I have recently come across
>> data
>> > > > > > > correctness
>> > > > > > > > >> issues with PHOENIX-7733, to be fixed by HBASE-29722.
>> This
>> > > also
>> > > > > got
>> > > > > > me
>> > > > > > > > >> thinking about the heartbeat and dummy cell overlap
>> leading
>> > to
>> > > > > > > possible
>> > > > > > > > >> data correctness issues.
>> > > > > > > > >>
>> > > > > > > > >> > I propose dropping the dummy cell mechanics from
>> Phoenix,
>> > > and
>> > > > > > using
>> > > > > > > > the
>> > > > > > > > >> > HBase keepalive/cursor mechanics instead (we may not
>> even
>> > > need
>> > > > > the
>> > > > > > > > >> cursors).
>> > > > > > > > >>
>> > > > > > > > >> +1
>> > > > > > > > >>
>> > > > > > > > >> > If we cannot find a better way to shortcut some
>> processing
>> > > in
>> > > > > > > Phoenix
>> > > > > > > > we
>> > > > > > > > >> > may need to keep dummy cells internally, but we have to
>> > make
>> > > > > sure
>> > > > > > > that
>> > > > > > > > >> they
>> > > > > > > > >> > never appear on the wire and reach the client.
>> > > > > > > > >>
>> > > > > > > > >> I don't think it is possible for Phoenix to ensure a
>> dummy
>> > > cell
>> > > > > > never
>> > > > > > > > >> reaches the HBase client.
>> > > > > > > >
>> > > > > > >
>> > > > > > > I think if nothing else works, we can still catch and
>> > > filter/convert
>> > > > > them
>> > > > > > > in RegionObserver.postScannerNext().
>> > > > > > > Of course ideally we would never generate any Dummy cells in
>> the
>> > > > first
>> > > > > > > place.
>> > > > > > >
>> > > > > > >
>> > > > > > > > >>
>> > > > > > > > >> > in that case we'd need
>> > > > > > > > >> > to check and convert to a heartbeat scan result somehow
>> > > > > > > > >>
>> > > > > > > > >> This needs changes in HBase only, which I don't think
>> HBase
>> > > > would
>> > > > > > > > >> (should) allow.
>> > > > > > > > >>
>> > > > > > > > >> > Is Hbase 2/3 wire compatible enough that connecting
>> with
>> > > HBase
>> > > > > 2.x
>> > > > > > > > >> clients
>> > > > > > > > >> > to Hbase 3 even a possibility ?
>> > > > > > > > >>
>> > > > > > > > >> Yes, wire compatibility is important. When this happens,
>> the
>> > > > only
>> > > > > > > thing
>> > > > > > > > >> we can do is set the page timeout high enough that we
>> never
>> > > have
>> > > > > to
>> > > > > > > send
>> > > > > > > > >> the dummy result to the client, or disable the paging
>> > feature.
>> > > > > > > > >>
>> > > > > > > > >>
>> > > > > > > > >> On Thu, Nov 13, 2025 at 11:22 PM Istvan Toth <
>> > > [email protected]>
>> > > > > > > wrote:
>> > > > > > > > >>
>> > > > > > > > >>> I've been struggling with errors on the region moving
>> tests
>> > > on
>> > > > my
>> > > > > > > HBase
>> > > > > > > > >>> 3.0
>> > > > > > > > >>> WIP branch and have finally tracked the problems down to
>> > > > > Phoenix's
>> > > > > > > > dummy
>> > > > > > > > >>> Cells (as well as some built-in assumptions in Phoenix
>> > which
>> > > > are
>> > > > > > not
>> > > > > > > > true
>> > > > > > > > >>> for Hbase 3, see PHOENIX-7728
>> > > > > > > > >>> <https://issues.apache.org/jira/browse/PHOENIX-7728>)
>> > > > > > > > >>>
>> > > > > > > > >>> HBase is not aware that these are dummy cells, and is
>> > > > considering
>> > > > > > the
>> > > > > > > > >>> rows
>> > > > > > > > >>> as already processed when retrying scans after the
>> region
>> > > goes
>> > > > > away
>> > > > > > > > from
>> > > > > > > > >>> under the scan, i.e. it restarts the scan from AFTER the
>> > > dummy
>> > > > > > cell's
>> > > > > > > > >>> rowkey, leading to the scan skipping rows.
>> > > > > > > > >>>
>> > > > > > > > >>> I have been able to fix the tests by hacking Hbase to
>> > ignore
>> > > > > these
>> > > > > > > > dummy
>> > > > > > > > >>> cells (and fixing the phoenix side problems described in
>> > > > > > PHOENIX-7728
>> > > > > > > > >>> <https://issues.apache.org/jira/browse/PHOENIX-7728>),
>> > > > > > > > >>> but I don't think
>> > > > > > > > >>> that hacking HBase to work with dummy cells is the way to go
>> > > > > > > > >>> (or that it would even be accepted by HBase).
>> > > > > > > > >>>
>> > > > > > > > >>> AFAIU the dummy cells were added back in the HBase 1.x days,
>> > > > > > > > >>> when there was no
>> > > > > > > > >>> other way to ensure timely responses from the server.
>> > > > > > > > >>>
>> > > > > > > > >>> HBase 2 has introduced the keepalive/cursor mechanics,
>> > > > > > > > >>> which IIUC serve the
>> > > > > > > > >>> exact same purpose as the Phoenix dummy cells.
>> > > > > > > > >>>
>> > > > > > > > >>> I propose dropping the dummy cell mechanics from
>> Phoenix,
>> > and
>> > > > > using
>> > > > > > > the
>> > > > > > > > >>> HBase keepalive/cursor mechanics instead (we may not
>> even
>> > > need
>> > > > > the
>> > > > > > > > >>> cursors).
>> > > > > > > > >>>
>> > > > > > > > >>> If we cannot find a better way to shortcut some
>> processing
>> > in
>> > > > > > Phoenix
>> > > > > > > > we
>> > > > > > > > >>> may need to keep dummy cells internally, but we have to
>> > make
>> > > > sure
>> > > > > > > that
>> > > > > > > > >>> they
>> > > > > > > > >>> never appear on the wire and reach the client. (i.e. in
>> > that
>> > > > case
>> > > > > > > we'd
>> > > > > > > > >>> need
>> > > > > > > > >>> to check and convert to a heartbeat scan result somehow)
>> > > > > > > > >>>
>> > > > > > > > >>> We will also need to consider backwards compatibility.
>> > > > > > > > >>>
>> > > > > > > > >>> Is Hbase 2/3 wire compatible enough that connecting with
>> > > HBase
>> > > > > 2.x
>> > > > > > > > >>> clients
>> > > > > > > > >>> to Hbase 3 even a possibility ?
>> > > > > > > > >>>
>> > > > > > > > >>> Do we want to support that ?
>> > > > > > > > >>>
>> > > > > > > > >>> When using Hbase 2.x, if Phoenix starts to use the HBase
>> > > > > keepalive
>> > > > > > > > >>> mechanics, will old clients work with that without
>> changes,
>> > > or
>> > > > do
>> > > > > > we
>> > > > > > > > need
>> > > > > > > > >>> to keep sending Dummy cells for older clients ?
>> > > > > > > > >>>
>> > > > > > > > >>> Looking forward to hearing your take,
>> > > > > > > > >>>
>> > > > > > > > >>> Istvan
>> > > > > > > > >>>
>> > > > > > > > >>
>> > > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > > --
>> > > > > > > *István Tóth* | Sr. Staff Software Engineer
>> > > > > > > *Email*: [email protected]
>> > > > > > > cloudera.com <https://www.cloudera.com>
>> > > > > > > [image: Cloudera] <https://www.cloudera.com/>
>> > > > > > > [image: Cloudera on Twitter] <https://twitter.com/cloudera>
>> > > [image:
>> > > > > > > Cloudera on Facebook] <https://www.facebook.com/cloudera>
>> > [image:
>> > > > > > Cloudera
>> > > > > > > on LinkedIn] <https://www.linkedin.com/company/cloudera>
>> > > > > > > ------------------------------
>> > > > > > > ------------------------------
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > > >
>> > > >
>> > >
>> >
>> >
>> >
>>
>
>
>


-- 
*István Tóth* | Sr. Staff Software Engineer
*Email*: [email protected]
cloudera.com <https://www.cloudera.com>
