[
https://issues.apache.org/jira/browse/PHOENIX-3649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892131#comment-15892131
]
Ankit Singhal commented on PHOENIX-3649:
----------------------------------------
Thanks [~giacomotaylor] for explaining it in detail. That works when new
upserts arrive at LATEST_TIMESTAMP during the index building phase, but
the case is different with UPSERT SELECT. Consider an UPSERT SELECT that
started at timestamp t0 and writes all of its data at t0, while the index is
created in parallel at timestamp t1 (t1 > t0). With the current index building
logic, the first pass builds the index from 0 to t2 (t1 + some more seconds),
and the second pass builds from (t1 - building time of the first pass) to t2,
which may not include timestamp t0. The UPSERT SELECT is still running and
writing data at t0, but no mutations for the new index are added on the server,
and there is no later pass that builds the data newly written at t0. So, in the
new patch, I'm building the new index at t0 to include that data, which fixes
ImmutableIndexIT#testCreateIndexDuringUpsertSelect. Let me know if it is fine
now or if we should do anything else.
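To make the timeline above concrete, here is a rough sketch of the two build windows. It is only a model of the reasoning in this comment, not Phoenix code: the function names and all numeric values are hypothetical, and it assumes the scan time range is half-open, [start, end).

```python
# Illustrative model only; names and values are hypothetical, not Phoenix APIs.

def in_window(ts, start, end):
    """Return True if timestamp ts falls inside the half-open range [start, end)."""
    return start <= ts < end


def build_windows(t1, first_pass_duration, slack):
    """Return the (start, end) time ranges of the two index build passes.

    t1 is the index creation time, slack is the "some more seconds" added
    to get t2, and first_pass_duration is how long the first pass took.
    """
    t2 = t1 + slack
    first_pass = (0, t2)
    second_pass = (t1 - first_pass_duration, t2)
    return first_pass, second_pass


# Hypothetical timeline, in arbitrary time units.
t0 = 1000  # UPSERT SELECT starts and stamps every cell with t0
t1 = 5000  # CREATE INDEX is issued while the UPSERT SELECT is still running

first_pass, second_pass = build_windows(t1, first_pass_duration=2000, slack=500)

# The first pass covers t0, but only for rows already written when it scanned.
covered_by_first = in_window(t0, *first_pass)

# Rows written at t0 after the first pass has scanned them can only be picked
# up by the second pass, and its window starts after t0.
missed_by_second = not in_window(t0, *second_pass)

print(first_pass, second_pass, covered_by_first, missed_by_second)
```

With these made-up numbers the second window starts at 3000, so rows that the UPSERT SELECT writes at t0 = 1000 after the first pass has gone by are never indexed. Starting the build at t0 instead closes that gap.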
bq. We set the cell timestamp in MutationState (based on a return of
MutationState.validate()) so that all of the mutations for an UPSERT SELECT
have a consistent timestamp. Since the server-side execution is bypassing
MutationState, we're skipping that (and for the same reason, you're right, we
can't run it server side when an immutable table has indexes).
Sorry for the confusion. In the last patch too, we were doing the upsert at the
compile time of the statement only: we were using the scan max time, which is
capped at the compile time of the statement.
> After PHOENIX-3271 higher memory consumption on RS leading to OOM/abort on
> immutable index creation with multiple regions on single RS
> --------------------------------------------------------------------------------------------------------------------------------------
>
> Key: PHOENIX-3649
> URL: https://issues.apache.org/jira/browse/PHOENIX-3649
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.9.0
> Reporter: Mujtaba Chohan
> Assignee: Ankit Singhal
> Priority: Blocker
> Fix For: 4.9.1, 4.10.0
>
> Attachments: PHOENIX-3649.patch, PHOENIX-3649_v1.patch
>
>
> *Configuration*
> hbase-0.98.23 standalone
> Heap 5GB
> *When*
> Verified that this happens after PHOENIX-3271 Distribute UPSERT SELECT across
> cluster.
> https://git-wip-us.apache.org/repos/asf?p=phoenix.git;a=commitdiff;h=accd4a276d1085e5d1069caf93798d8f301e4ed6
> To repro
> {noformat}
> CREATE TABLE INDEXED_TABLE (HOST CHAR(2) NOT NULL,DOMAIN VARCHAR NOT NULL,
> FEATURE VARCHAR NOT NULL,DATE DATE NOT NULL,USAGE.CORE BIGINT,USAGE.DB
> BIGINT,STATS.ACTIVE_VISITOR INTEGER CONSTRAINT PK PRIMARY KEY (HOST, DOMAIN,
> FEATURE, DATE)) IMMUTABLE_ROWS=true,MAX_FILESIZE=30485760
> {noformat}
> Upsert 2M rows (CSV is available at https://goo.gl/OsTSKB) that will create
> ~4 regions on a single RS and then create index with data present
> {noformat}
> CREATE INDEX idx5 ON INDEXED_TABLE (CORE) INCLUDE (DB,ACTIVE_VISITOR)
> {noformat}
> From RS log
> {noformat}
> 2017-02-02 13:29:06,899 WARN [rs,51371,1486070044538-HeapMemoryChore]
> regionserver.HeapMemoryManager: heapOccupancyPercent 0.97875696 is above heap
> occupancy alarm watermark (0.95)
> 2017-02-02 13:29:18,198 INFO [SessionTracker] server.ZooKeeperServer:
> Expiring session 0x15a00ad4f300001, timeout of 10000ms exceeded
> 2017-02-02 13:29:18,231 WARN [JvmPauseMonitor] util.JvmPauseMonitor:
> Detected pause in JVM or host machine (eg GC): pause of approximately 10581ms
> GC pool 'ParNew' had collection(s): count=4 time=139ms
> 2017-02-02 13:29:19,669 FATAL [RS:0;rs:51371-EventThread]
> regionserver.HRegionServer: ABORTING region server rs,51371,1486070044538:
> regionserver:51371-0x15a00ad4f300001, quorum=localhost:2181, baseZNode=/hbase
> regionserver:51371-0x15a00ad4f300001 received expired from ZooKeeper, aborting
> {noformat}
> Prior to the change index creation succeeds with as little as 2GB heap.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)