[ https://issues.apache.org/jira/browse/PHOENIX-3649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892131#comment-15892131 ]
Ankit Singhal commented on PHOENIX-3649:
----------------------------------------

Thanks [~giacomotaylor] for explaining it in detail. That approach works for new upserts that arrive at LATEST_TIMESTAMP during the index-building phase, but the case is different with UPSERT SELECT.

Consider an UPSERT SELECT that started at timestamp t0 and writes all of its data at t0, while the index is created in parallel at timestamp t1 (t1 > t0). With the current index-building logic, the first pass builds the index from 0 to t2 (t1 plus some more seconds), and the second pass builds from (t1 minus the build time of the first pass) to t2, which may not include t0. The UPSERT SELECT is still running and writing data at t0, so no mutation for the new index is added on the server, and there is no further run to build the index for the new data written at t0.

So, in the new patch, I'm building the new index at t0 to include that data, which fixes ImmutableIndexIT#testCreateIndexDuringUpsertSelect. Let me know if it is fine now or if we should do anything else.

bq. We set the cell timestamp in MutationState (based on a return of MutationState.validate()) so that all of the mutations for an UPSERT SELECT have a consistent timestamp. Since the server-side execution is bypassing MutationState, we're skipping that (and for the same reason, you're right, we can't run it server side when an immutable table has indexes).

Sorry for the confusion. In the last patch too, we were doing the upsert at the compile time of the statement only: we were using the scan max time, which is capped at the compile time of the statement.
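To make the timing argument in this comment concrete, here is a toy model in Python. It is only a sketch and not Phoenix code: the names Row, seen_by_pass and indexed are hypothetical, each build pass is assumed to be an instantaneous snapshot, the first pass is assumed to run at t1, and the second pass is assumed to run right after the first one finishes.

```python
from typing import NamedTuple


class Row(NamedTuple):
    cell_ts: int     # timestamp stamped on the cell (t0 for the UPSERT SELECT)
    arrives_at: int  # wall-clock time at which the row is actually written


def seen_by_pass(row, scan_start, scan_end, runs_at):
    """A pass sees a row only if the row already exists when the pass runs
    and its cell timestamp lies inside the pass's scan range."""
    return row.arrives_at <= runs_at and scan_start <= row.cell_ts <= scan_end


def indexed(row, t1, t2, first_pass_duration):
    """True if either build pass picks the row up (toy model, see assumptions)."""
    first_runs_at = t1                         # assumption: first pass runs at t1
    second_runs_at = t1 + first_pass_duration  # assumption: second pass follows it
    first = seen_by_pass(row, 0, t2, first_runs_at)
    second = seen_by_pass(row, t1 - first_pass_duration, t2, second_runs_at)
    return first or second


t0, t1 = 100, 200
t2 = t1 + 10

early = Row(cell_ts=t0, arrives_at=150)    # written at t0 before the first pass
late = Row(cell_ts=t0, arrives_at=205)     # written at t0 after the first pass
regular = Row(cell_ts=205, arrives_at=205) # ordinary upsert stamped on arrival

print(indexed(early, t1, t2, 5))    # covered by the first pass
print(indexed(late, t1, t2, 5))     # missed: too late for pass 1, t0 outside pass 2
print(indexed(regular, t1, t2, 5))  # covered by the second pass
```

In this model the late row is the problematic one: it carries the old timestamp t0 but lands after the first pass, and the second pass only scans from (t1 minus the first-pass build time) onward.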
> After PHOENIX-3271 higher memory consumption on RS leading to OOM/abort on
> immutable index creation with multiple regions on single RS
> --------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-3649
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3649
>             Project: Phoenix
>          Issue Type: Bug
>   Affects Versions: 4.9.0
>            Reporter: Mujtaba Chohan
>            Assignee: Ankit Singhal
>            Priority: Blocker
>             Fix For: 4.9.1, 4.10.0
>
>         Attachments: PHOENIX-3649.patch, PHOENIX-3649_v1.patch
>
>
> *Configuration*
> hbase-0.98.23 standalone
> Heap 5GB
> *When*
> Verified that this happens after PHOENIX-3271 Distribute UPSERT SELECT across
> cluster.
> https://git-wip-us.apache.org/repos/asf?p=phoenix.git;a=commitdiff;h=accd4a276d1085e5d1069caf93798d8f301e4ed6
> To repro
> {noformat}
> CREATE TABLE INDEXED_TABLE (HOST CHAR(2) NOT NULL,DOMAIN VARCHAR NOT NULL,
> FEATURE VARCHAR NOT NULL,DATE DATE NOT NULL,USAGE.CORE BIGINT,USAGE.DB
> BIGINT,STATS.ACTIVE_VISITOR INTEGER CONSTRAINT PK PRIMARY KEY (HOST, DOMAIN,
> FEATURE, DATE)) IMMUTABLE_ROWS=true,MAX_FILESIZE=30485760
> {noformat}
> Upsert 2M rows (CSV is available at https://goo.gl/OsTSKB) that will create
> ~4 regions on a single RS and then create index with data present
> {noformat}
> CREATE INDEX idx5 ON INDEXED_TABLE (CORE) INCLUDE (DB,ACTIVE_VISITOR)
> {noformat}
> From RS log
> {noformat}
> 2017-02-02 13:29:06,899 WARN [rs,51371,1486070044538-HeapMemoryChore]
> regionserver.HeapMemoryManager: heapOccupancyPercent 0.97875696 is above heap
> occupancy alarm watermark (0.95)
> 2017-02-02 13:29:18,198 INFO [SessionTracker] server.ZooKeeperServer:
> Expiring session 0x15a00ad4f300001, timeout of 10000ms exceeded
> 2017-02-02 13:29:18,231 WARN [JvmPauseMonitor] util.JvmPauseMonitor:
> Detected pause in JVM or host machine (eg GC): pause of approximately 10581ms
> GC pool 'ParNew' had collection(s): count=4 time=139ms
> 2017-02-02 13:29:19,669 FATAL [RS:0;rs:51371-EventThread]
> regionserver.HRegionServer: ABORTING region server rs,51371,1486070044538:
> regionserver:51371-0x15a00ad4f300001, quorum=localhost:2181, baseZNode=/hbase
> regionserver:51371-0x15a00ad4f300001 received expired from ZooKeeper, aborting
> {noformat}
> Prior to the change index creation succeeds with as little as 2GB heap.
> [~an...@apache.org]

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)