[ https://issues.apache.org/jira/browse/PHOENIX-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14614494#comment-14614494 ]
Jan Fernando commented on PHOENIX-1954:
---------------------------------------

In the following method in Sequence.java:
{code}
public long incrementValue(long timestamp, ValueOp op, long numToAllocate)
{code}
I added the following code:
{code}
if (SequenceUtil.isBulkAllocation(numToAllocate)) {
    if (op == ValueOp.INCREMENT_SEQUENCE) {
        // On calling rs.next() for NEXT <n> VALUES FOR <seq>, we need to calc value to return
        // as currentValue was not adjusted for Bulk Allocation Flow, since we essentially reserve
        // and use them at the same time
        return value.currentValue - (numToAllocate * value.incrementBy);
    } else {
        if (op == ValueOp.RESERVE_SEQUENCE) {
            throw EMPTY_SEQUENCE_CACHE_EXCEPTION;
        } else {
            return value.currentValue;
        }
    }
{code}
This handles returning the correct value from the expression NEXT <n> VALUES FOR SEQUENCE <seq> when rs.next() is called after we bulk allocate a set number of slots.

I was thinking about this a bit and I wonder whether there is a race condition here. When other clients concurrently get values via NEXT VALUE FOR on the same sequence, can they move currentValue forward and cause the wrong value to be returned? I'll look into this more deeply tomorrow, but I'm curious whether you guys have any thoughts on whether this is the case [~jamestaylor] [~tdsilva]?

> Reserve chunks of numbers for a sequence
> ----------------------------------------
>
>                 Key: PHOENIX-1954
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1954
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: Lars Hofhansl
>            Assignee: Jan Fernando
>         Attachments: PHOENIX-1954-wip.patch, PHOENIX-1954-wip2.patch.txt, PHOENIX-1954-wip3.patch
>
>
> In order to be able to generate many ids in bulk (for example in map reduce jobs) we need a way to generate or reserve large sets of ids. We also need to mix ids reserved with incrementally generated ids from other clients.
> For this we need to atomically increment the sequence and return the value it had when the increment happened.
> If we're OK to throw the current cached set of values away we can do {{NEXT VALUE FOR <seq>(,<N>)}}, which needs to increment the value and return the value it incremented from (i.e. it has to throw the current cache away, and return the next value it found at the server).
> Or we can invent a new syntax {{RESERVE VALUES FOR <seq>, <N>}} that does the same, but does not invalidate the cache.
> Note that in either case we won't retrieve the reserved set of values via {{NEXT VALUE FOR}} because we'd need to be idempotent in our case. All we need to guarantee is that after a call to {{RESERVE VALUES FOR <seq>, <N>}}, which returns a value <M>, the range [M, M+N) won't be used by any other user of the sequence. We might need to reserve 1bn ids this way ahead of a map reduce run.
> Any better ideas?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
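
To make the race-condition question and the [M, M+N) guarantee concrete, here is a minimal sketch using a toy in-memory model. It is not Phoenix's Sequence.java: {{ToySequence}}, {{reserve}}, {{advance}} and {{current}} are hypothetical names introduced only for illustration, and the model assumes every client shares one counter. It contrasts returning the pre-increment value atomically with deriving the start afterwards as currentValue - (numToAllocate * incrementBy), which goes wrong if another client has moved the counter in between. Whether the real client-side cached value can be advanced this way is exactly the open question above.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Toy in-memory stand-in for a sequence (hypothetical, not Phoenix code). */
final class ToySequence {
    private final AtomicLong nextValue; // next value that has not been handed out
    private final long incrementBy;

    ToySequence(long startWith, long incrementBy) {
        this.nextValue = new AtomicLong(startWith);
        this.incrementBy = incrementBy;
    }

    /** Atomically reserves n slots and returns M, the first value of [M, M + n * incrementBy). */
    long reserve(long n) {
        return nextValue.getAndAdd(n * incrementBy);
    }

    /** Single-value allocation, as a client issuing NEXT VALUE FOR would do. */
    long next() {
        return reserve(1);
    }

    /** Advances the counter by n slots without returning the start of the range. */
    void advance(long n) {
        nextValue.addAndGet(n * incrementBy);
    }

    /** Reads the shared counter; the result may already include other clients' increments. */
    long current() {
        return nextValue.get();
    }

    public static void main(String[] args) {
        // Atomic variant: the start of the range is captured in the same step as the increment.
        ToySequence safe = new ToySequence(1, 1);
        long start = safe.reserve(100);
        long other = safe.next();
        System.out.println("atomic start=" + start + ", other client got " + other);

        // Derived variant: increment first, compute the start later from the shared counter.
        ToySequence racy = new ToySequence(1, 1);
        racy.advance(100);                        // bulk allocation moves the counter
        racy.next();                              // another client slips in here
        long derived = racy.current() - 100 * 1;  // currentValue - numToAllocate * incrementBy
        System.out.println("derived start=" + derived + " (the range really started at 1)");
    }
}
```

In this toy model the atomic variant hands the bulk caller 1 and the other client 101, while the derived variant computes 2 for a range that actually began at 1. If the value read in incrementValue is private to one connection and only changes under a lock, the derived computation would be safe; the sketch only shows what happens when that assumption does not hold.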