[ https://issues.apache.org/jira/browse/PHOENIX-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621116#comment-14621116 ]
Jan Fernando commented on PHOENIX-1954:
---------------------------------------

[~jamestaylor] Re (1) This is turning out to be a bit more complex than it seems on the surface. The complexity comes from handling overflows, underflows, and cycles for standard sequences. Here's what I have so far, but it doesn't work on sequences that over/underflow with cycles enabled (tests caught this). I'm still looking for a solution - the issue stems from the fact that on overflow the old check of value.currentValue == value.nextValue no longer behaves as expected: with currentValue at Long.MAX_VALUE, nextValue has already overflowed, so the check now returns false.

{code}
public long incrementValue(long timestamp, ValueOp op, long numToAllocate) throws SQLException {
    SequenceValue value = findSequenceValue(timestamp);
    if (value == null) {
        throw EMPTY_SEQUENCE_CACHE_EXCEPTION;
    }
    if (isSequenceCacheExhausted(numToAllocate, value)) {
        if (op == ValueOp.VALIDATE_SEQUENCE) {
            return value.currentValue;
        }
        throw EMPTY_SEQUENCE_CACHE_EXCEPTION;
    }
    return increment(value, op, numToAllocate);
}

/**
 * Checks whether there are sufficient values in the SequenceValue cached on
 * the client to allocate the requested number of slots. Handles increasing
 * and decreasing sequences, as well as overflow/underflow and cycles.
 */
private boolean isSequenceCacheExhausted(final long numToAllocate, final SequenceValue value) {
    // Use checked arithmetic so that over/underflow surfaces as an ArithmeticException
    // (the multiply is checked too, so the product can't wrap silently before the add).
    // We don't treat over/underflow as the cache being exhausted, since the current
    // value may still be valid in the no-cycle case; the logic in increment() will
    // take care of detecting that we've hit the limit of the sequence.
    long targetSequenceValue;
    try {
        targetSequenceValue = LongMath.checkedAdd(value.currentValue,
                LongMath.checkedMultiply(numToAllocate, value.incrementBy));
    } catch (ArithmeticException e) {
        return false;
    }
    if (value.incrementBy > 0) {
        return targetSequenceValue > value.nextValue;
    } else {
        return targetSequenceValue < value.nextValue;
    }
}
{code}

I am concerned that moving away from the equality check for non-bulk allocations introduces a lot more complexity, and therefore risk of regressions, into the code. I think we could still fix the problem of allowing the cache to be used for bulk allocations when it holds enough values by applying the new logic to bulk allocations only. Since we don't allow cycles for bulk allocations, this simplifies things:

{code}
public long incrementValue(long timestamp, ValueOp op, long numToAllocate) throws SQLException {
    SequenceValue value = findSequenceValue(timestamp);
    if (value == null) {
        throw EMPTY_SEQUENCE_CACHE_EXCEPTION;
    }
    if (SequenceUtil.isBulkAllocation(numToAllocate) && isSequenceCacheExhausted(numToAllocate, value)) {
        return unlessValidateSequenceThrowEmptySequenceCacheException(op, value);
    } else if (value.currentValue == value.nextValue) {
        // Non-bulk allocations keep the original equality check.
        return unlessValidateSequenceThrowEmptySequenceCacheException(op, value);
    }
    return increment(value, op, numToAllocate);
}

private long unlessValidateSequenceThrowEmptySequenceCacheException(ValueOp op, SequenceValue value)
        throws EmptySequenceCacheException {
    if (op == ValueOp.VALIDATE_SEQUENCE) {
        return value.currentValue;
    }
    throw EMPTY_SEQUENCE_CACHE_EXCEPTION;
}

/**
 * Checks whether there are sufficient values in the SequenceValue cached on
 * the client to allocate the requested number of slots. Handles increasing
 * and decreasing sequences, as well as overflow/underflow and cycles.
 */
private boolean isSequenceCacheExhausted(final long numToAllocate, final SequenceValue value) {
    long targetSequenceValue;
    try {
        targetSequenceValue = LongMath.checkedAdd(value.currentValue,
                LongMath.checkedMultiply(numToAllocate, value.incrementBy));
    } catch (ArithmeticException e) {
        // Checked arithmetic surfaces over/underflow here rather than wrapping silently.
        // We don't treat that as the cache being exhausted, since the current value may
        // still be valid in the no-cycle case; the logic in increment() will take care
        // of detecting that we've hit the limit of the sequence.
        return false;
    }
    if (value.incrementBy > 0) {
        return targetSequenceValue > value.nextValue;
    } else {
        return targetSequenceValue < value.nextValue;
    }
}
{code}
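To make the over/underflow handling concrete, here's a minimal throwaway sketch (just Guava's {{LongMath}} with made-up values, no Phoenix types) showing why both the multiply and the add need to be checked - plain long arithmetic wraps silently instead of failing:

{code}
import com.google.common.math.LongMath;

public class CheckedOverflowDemo {
    public static void main(String[] args) {
        long currentValue = Long.MAX_VALUE - 10; // near the top of the sequence
        long incrementBy = 5;
        long numToAllocate = 100;

        // Plain arithmetic wraps around silently: this prints a large negative number,
        // and a naive comparison against nextValue would then be meaningless.
        System.out.println("unchecked: " + (currentValue + numToAllocate * incrementBy));

        // Checked arithmetic surfaces the overflow as an ArithmeticException instead,
        // which isSequenceCacheExhausted() maps to "not exhausted; let increment() decide".
        try {
            long target = LongMath.checkedAdd(currentValue,
                    LongMath.checkedMultiply(numToAllocate, incrementBy));
            System.out.println("checked: " + target);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected; deferring to increment()");
        }
    }
}
{code}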
I'm still digging into the cycle and overflow issue, but as I said, I am concerned about the regression risk of introducing this complexity into the flow for NEXT VALUE FOR. Thoughts?

> Reserve chunks of numbers for a sequence
> ----------------------------------------
>
>                 Key: PHOENIX-1954
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1954
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: Lars Hofhansl
>            Assignee: Jan Fernando
>         Attachments: PHOENIX-1954-rebased.patch, PHOENIX-1954-wip.patch,
> PHOENIX-1954-wip2.patch.txt, PHOENIX-1954-wip3.patch,
> PHOENIX-1954-wip4.patch, PHOENIX-1954-wip5-rebased.patch
>
>
> In order to be able to generate many ids in bulk (for example in map reduce
> jobs) we need a way to generate or reserve large sets of ids. We also need to
> mix ids reserved with incrementally generated ids from other clients.
> For this we need to atomically increment the sequence and return the value it
> had when the increment happened.
> If we're OK to throw the current cached set of values away we can do
> {{NEXT VALUE FOR <seq>(,<N>)}}, which needs to increment the value and return
> the value it incremented from (i.e. it has to throw the current cache away and
> return the next value it found at the server).
> Or we can invent a new syntax {{RESERVE VALUES FOR <seq>, <N>}} that does the
> same, but does not invalidate the cache.
> Note that in either case we won't retrieve the reserved set of values via
> {{NEXT VALUE FOR}} because we'd need to be idempotent. In our case, all we
> need to guarantee is that after a call to {{RESERVE VALUES FOR <seq>, <N>}}
> which returns a value <M>, the range [M, M+N) won't be used by any other user
> of the sequence. We might need to reserve 1bn ids this way ahead of a map
> reduce run.
> Any better ideas?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)