[ 
https://issues.apache.org/jira/browse/PHOENIX-3649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892131#comment-15892131
 ] 

Ankit Singhal commented on PHOENIX-3649:
----------------------------------------

Thanks [~giacomotaylor] for explaining it in detail. That works for new upserts 
that arrive at LATEST_TIMESTAMP during the index building phase, but the case 
is different with UPSERT SELECT. Consider an UPSERT SELECT that started at 
timestamp t0 and writes all of its data at t0, while the index is created in 
parallel at timestamp t1 (t1 > t0). With the current index building logic, the 
first pass builds the index from 0 to t2 (t1 plus some more seconds), and the 
second pass builds from (t1 minus the build time of the first pass) to t2, a 
range that may not include t0. The UPSERT SELECT is still running and writing 
data at t0, but no mutations for the new index are added on the server, and no 
later run builds the data newly written at t0. So, in the new patch, I'm 
building the new index at t0 to include that data, which fixes 
ImmutableIndexIT#testCreateIndexDuringUpsertSelect. Let me know if it is fine 
now or whether we should do anything else.
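
To make the timing argument above concrete, here is a rough sketch of the 
second-pass window check. It is only a toy model, not Phoenix code: the 
function names and the millisecond values are hypothetical, and it assumes 
scan time ranges are half-open, [start, end).

```python
def covers(start, end, ts):
    """Return True if cell timestamp ts falls in the half-open range [start, end)."""
    return start <= ts < end


def second_pass_range(t1, t2, first_pass_duration):
    """Second pass as described above: from (t1 - first pass build time) to t2."""
    return (t1 - first_pass_duration, t2)


def late_write_is_indexed(t0, t1, t2, first_pass_duration):
    """Would a row written at t0 after the first pass be picked up by the second pass?"""
    start, end = second_pass_range(t1, t2, first_pass_duration)
    return covers(start, end, t0)


# Hypothetical values in milliseconds, chosen only for illustration.
t0 = 1_000                   # UPSERT SELECT start; all its cells carry this timestamp
t1 = 10_000                  # index creation time (t1 > t0)
t2 = t1 + 5_000              # upper bound of both build passes
first_pass_duration = 3_000  # time taken by the first pass

# The second pass scans [7000, 15000), so a row written late at t0 is missed.
print(late_write_is_indexed(t0, t1, t2, first_pass_duration))
```

With these numbers the check is false whenever t0 is earlier than 
(t1 - first_pass_duration), which is the gap described above.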

bq. We set the cell timestamp in MutationState (based on a return of 
MutationState.validate()) so that all of the mutations for an UPSERT SELECT 
have a consistent timestamp. Since the server-side execution is bypassing 
MutationState, we're skipping that (and for the same reason, you're right, we 
can't run it server side when an immutable table has indexes).
Sorry for the confusion. In the last patch too, we were doing the upsert at 
the statement's compile time only: we were using the scan max time, which is 
capped at the compile time of the statement.



> After PHOENIX-3271 higher memory consumption on RS leading to OOM/abort on 
> immutable index creation with multiple regions on single RS
> --------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-3649
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3649
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.9.0
>            Reporter: Mujtaba Chohan
>            Assignee: Ankit Singhal
>            Priority: Blocker
>             Fix For: 4.9.1, 4.10.0
>
>         Attachments: PHOENIX-3649.patch, PHOENIX-3649_v1.patch
>
>
> *Configuration*
> hbase-0.98.23 standalone
> Heap 5GB
> *When*
> Verified that this happens after PHOENIX-3271 Distribute UPSERT SELECT across 
> cluster. 
> https://git-wip-us.apache.org/repos/asf?p=phoenix.git;a=commitdiff;h=accd4a276d1085e5d1069caf93798d8f301e4ed6
> To repro
> {noformat}
> CREATE TABLE INDEXED_TABLE (HOST CHAR(2) NOT NULL,DOMAIN VARCHAR NOT NULL, 
> FEATURE VARCHAR NOT NULL,DATE DATE NOT NULL,USAGE.CORE BIGINT,USAGE.DB 
> BIGINT,STATS.ACTIVE_VISITOR INTEGER CONSTRAINT PK PRIMARY KEY (HOST, DOMAIN, 
> FEATURE, DATE)) IMMUTABLE_ROWS=true,MAX_FILESIZE=30485760
> {noformat}
> Upsert 2M rows (CSV is available at https://goo.gl/OsTSKB) that will create 
> ~4 regions on a single RS and then create index with data present
> {noformat}
> CREATE INDEX idx5 ON INDEXED_TABLE (CORE) INCLUDE (DB,ACTIVE_VISITOR)
> {noformat}
> From RS log
> {noformat}
> 2017-02-02 13:29:06,899 WARN  [rs,51371,1486070044538-HeapMemoryChore] 
> regionserver.HeapMemoryManager: heapOccupancyPercent 0.97875696 is above heap 
> occupancy alarm watermark (0.95)
> 2017-02-02 13:29:18,198 INFO  [SessionTracker] server.ZooKeeperServer: 
> Expiring session 0x15a00ad4f300001, timeout of 10000ms exceeded
> 2017-02-02 13:29:18,231 WARN  [JvmPauseMonitor] util.JvmPauseMonitor: 
> Detected pause in JVM or host machine (eg GC): pause of approximately 10581ms
> GC pool 'ParNew' had collection(s): count=4 time=139ms
> 2017-02-02 13:29:19,669 FATAL [RS:0;rs:51371-EventThread] 
> regionserver.HRegionServer: ABORTING region server rs,51371,1486070044538: 
> regionserver:51371-0x15a00ad4f300001, quorum=localhost:2181, baseZNode=/hbase 
> regionserver:51371-0x15a00ad4f300001 received expired from ZooKeeper, aborting
> {noformat}
> Prior to the change index creation succeeds with as little as 2GB heap.
> [~an...@apache.org]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
