[
https://issues.apache.org/jira/browse/PHOENIX-3649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15890793#comment-15890793
]
James Taylor commented on PHOENIX-3649:
---------------------------------------
We set the cell timestamp in MutationState (based on the return value of
MutationState.validate()) so that all of the mutations for an UPSERT SELECT
have a consistent timestamp. Since the server-side execution bypasses
MutationState, that step is skipped (and for the same reason, you're right, we
can't run it server side when an immutable table has indexes).
There's code in MetaDataClient.buildIndex() that attempts to handle this case
of an UPSERT SELECT having started but not yet completed when a CREATE INDEX is
executed (i.e. the statements are overlapping). The code executes a second pass
to pick up any data table rows that may have been in the process of being
written *before* the index was created (the UPSERT SELECT would not know of the
index, hence the incremental maintenance would not have been done). This second
pass is time bounded: it runs from 1) the start of the index build minus some
"play" to 2) the start of the index build. If the server side runs the UPSERT
SELECT with the latest timestamp, this second pass won't pick up the rows. This
isn't a perfect solution, but it's the best we could come up with.
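To make the window concrete, here is a minimal sketch of the time-bound check.
It is not the code in MetaDataClient.buildIndex(); the class, the method name,
the sample values and the half-open bound are all assumptions for illustration.

```java
// Hypothetical illustration of the second-pass window; not Phoenix code.
class SecondPassWindow {
    // True if a cell timestamp falls in [indexBuildStart - playMs, indexBuildStart).
    // The half-open upper bound is an assumption made for this sketch.
    static boolean inSecondPass(long cellTs, long indexBuildStart, long playMs) {
        return cellTs >= indexBuildStart - playMs && cellTs < indexBuildStart;
    }

    public static void main(String[] args) {
        long indexBuildStart = 10_000L; // made-up clock values
        long playMs = 2_000L;

        // Row stamped with the time the UPSERT SELECT was compiled (before the index existed).
        long compileTimeTs = 9_000L;
        // Same row stamped with the latest server time, after the index build started.
        long latestServerTs = 10_500L;

        System.out.println(inSecondPass(compileTimeTs, indexBuildStart, playMs));
        System.out.println(inSecondPass(latestServerTs, indexBuildStart, playMs));
    }
}
```

With these made-up values the first row is inside the window and the second is
not, which is the case where the index would miss the row.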
I think short term, the easiest fix would be to use
StatementContext.getCurrentTime() to get the timestamp at which the statement
was compiled and pass it through to the server side. This will fix
ImmutableIndexIT#testCreateIndexDuringUpsertSelect (for mutable and immutable
tables).
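A rough sketch of the idea, assuming the timestamp travels as a request
attribute. The attribute name, the Map standing in for the request attributes,
and both method names are hypothetical; this is not the actual patch.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: carry the compile-time timestamp from client to server.
class CompileTimeTimestamp {
    // Made-up attribute name, used only in this example.
    static final String ATTR = "_UpsertSelectTs";

    // Client side: serialize the timestamp captured when the statement was compiled.
    static Map<String, byte[]> clientRequest(long compileTimeTs) {
        Map<String, byte[]> attrs = new HashMap<>();
        attrs.put(ATTR, ByteBuffer.allocate(Long.BYTES).putLong(compileTimeTs).array());
        return attrs;
    }

    // Server side: stamp cells with the client's timestamp if one was sent,
    // otherwise fall back to the server's current time.
    static long serverCellTimestamp(Map<String, byte[]> attrs, long serverNow) {
        byte[] raw = attrs.get(ATTR);
        return raw == null ? serverNow : ByteBuffer.wrap(raw).getLong();
    }
}
```

The point of the design is that every cell written server side gets the same
timestamp the client-side path would have used, so the second pass of an
overlapping CREATE INDEX can still find the rows.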
Longer term, it'd be good to go through the MutationState API on the
server-side so we can execute an UPSERT SELECT on an immutable table with
indexes. Perhaps we can send over the PTable of the target table from the
client?
For PHOENIX-3583, I think we should give it more thought and target any changes
for 4.11.
> After PHOENIX-3271 higher memory consumption on RS leading to OOM/abort on
> immutable index creation with multiple regions on single RS
> --------------------------------------------------------------------------------------------------------------------------------------
>
> Key: PHOENIX-3649
> URL: https://issues.apache.org/jira/browse/PHOENIX-3649
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.9.0
> Reporter: Mujtaba Chohan
> Assignee: Ankit Singhal
> Priority: Blocker
> Fix For: 4.9.1, 4.10.0
>
> Attachments: PHOENIX-3649.patch, PHOENIX-3649_v1.patch
>
>
> *Configuration*
> hbase-0.98.23 standalone
> Heap 5GB
> *When*
> Verified that this happens after PHOENIX-3271 Distribute UPSERT SELECT across
> cluster.
> https://git-wip-us.apache.org/repos/asf?p=phoenix.git;a=commitdiff;h=accd4a276d1085e5d1069caf93798d8f301e4ed6
> To repro
> {noformat}
> CREATE TABLE INDEXED_TABLE (HOST CHAR(2) NOT NULL,DOMAIN VARCHAR NOT NULL,
> FEATURE VARCHAR NOT NULL,DATE DATE NOT NULL,USAGE.CORE BIGINT,USAGE.DB
> BIGINT,STATS.ACTIVE_VISITOR INTEGER CONSTRAINT PK PRIMARY KEY (HOST, DOMAIN,
> FEATURE, DATE)) IMMUTABLE_ROWS=true,MAX_FILESIZE=30485760
> {noformat}
> Upsert 2M rows (CSV is available at https://goo.gl/OsTSKB) that will create
> ~4 regions on a single RS and then create index with data present
> {noformat}
> CREATE INDEX idx5 ON INDEXED_TABLE (CORE) INCLUDE (DB,ACTIVE_VISITOR)
> {noformat}
> From RS log
> {noformat}
> 2017-02-02 13:29:06,899 WARN [rs,51371,1486070044538-HeapMemoryChore]
> regionserver.HeapMemoryManager: heapOccupancyPercent 0.97875696 is above heap
> occupancy alarm watermark (0.95)
> 2017-02-02 13:29:18,198 INFO [SessionTracker] server.ZooKeeperServer:
> Expiring session 0x15a00ad4f300001, timeout of 10000ms exceeded
> 2017-02-02 13:29:18,231 WARN [JvmPauseMonitor] util.JvmPauseMonitor:
> Detected pause in JVM or host machine (eg GC): pause of approximately 10581ms
> GC pool 'ParNew' had collection(s): count=4 time=139ms
> 2017-02-02 13:29:19,669 FATAL [RS:0;rs:51371-EventThread]
> regionserver.HRegionServer: ABORTING region server rs,51371,1486070044538:
> regionserver:51371-0x15a00ad4f300001, quorum=localhost:2181, baseZNode=/hbase
> regionserver:51371-0x15a00ad4f300001 received expired from ZooKeeper, aborting
> {noformat}
> Prior to the change index creation succeeds with as little as 2GB heap.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)