[ https://issues.apache.org/jira/browse/PHOENIX-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538769#comment-14538769 ]
Jesse Yates commented on PHOENIX-1954:
--------------------------------------

I think it's more that we want to reserve those values.

{code}
statement specifies NEXT 1000 VALUES FOR seq and executes it 1001 times
{code}

To me, that would imply that they want 1001*1000 values.

bq. is the caller committing to allocate no more than 1000 values?

I think it's more that it is asking for a contiguous allocation of 1000 values, of which it may use some part. Any overrun should be expected to have duplicates outside the range.

However, it is possible that they could, interleaved with the reserved batch they just claimed, be asking for the next unique sequence value. In that case, with {{NEXT VALUE FOR}} they could be asking for:
* the next value in the range
* the next unreserved value

depending on how the semantics are defined. We don't necessarily want that ambiguity, especially in an M/R context, if we are integrating with the current {{NEXT VALUE FOR}} naming and semantics; it's not clear which should take place (and both are valid).

Maybe we could do something like {{NEXT <n> VALUE FOR <seq>}} and then a {{SKIP || RESERVE || ALLOCATE <n> VALUE FOR <seq>}} where:
* the first has the semantics of allocating a larger batch, within which {{NEXT VALUE FOR}} gives you the next value in the range
* the second skips ahead in the batch, and {{NEXT VALUE FOR}} gives you the next unreserved number.

> Reserve chunks of numbers for a sequence
> ----------------------------------------
>
>                 Key: PHOENIX-1954
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1954
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>
> In order to be able to generate many ids in bulk (for example in map reduce
> jobs) we need a way to generate or reserve large sets of ids. We also need to
> mix ids reserved with incrementally generated ids from other clients.
> For this we need to atomically increment the sequence and return the value it
> had when the increment happened.
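As a rough illustration of the atomic "increment and return the value it had" operation described in the issue text above, here is a minimal in-memory Java sketch. It is not Phoenix code: {{SequenceSketch}}, {{reserve}} and {{nextValue}} are hypothetical names, and an {{AtomicLong}} stands in for the server-side sequence state. It assumes an increment of 1 and ignores client-side caching and overflow.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical in-memory stand-in for a sequence; not the Phoenix implementation.
class SequenceSketch {
    // Next value that has not been handed out to anyone yet.
    private final AtomicLong next;

    SequenceSketch(long startWith) {
        this.next = new AtomicLong(startWith);
    }

    /**
     * Atomically reserves n contiguous values and returns the first one, M.
     * The caller then owns M .. M+n-1; no other caller can receive them.
     */
    long reserve(long n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        // getAndAdd returns the value held before the increment.
        return next.getAndAdd(n);
    }

    /** An incrementally generated id is treated as a reservation of size 1. */
    long nextValue() {
        return reserve(1);
    }

    public static void main(String[] args) {
        SequenceSketch seq = new SequenceSketch(1);
        long a = seq.nextValue();    // single id from one client
        long m = seq.reserve(1000);  // bulk reservation, e.g. ahead of an M/R job
        long b = seq.nextValue();    // single id from another client, after the batch
        System.out.println(a + " " + m + " " + b); // prints "1 2 1002"
    }
}
```

In this sketch a single-value request made after a bulk reservation always lands past the reserved batch, which corresponds to the "next unreserved value" reading; handing out values from inside the batch would need separate per-client state.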
> If we're OK to throw the current cached set of values away we can do
> {{NEXT VALUE FOR <seq>(,<N>)}}, which needs to increment the value and return the
> value it incremented from (i.e. it has to throw the current cache away, and
> return the next value it found at the server).
> Or we can invent a new syntax {{RESERVE VALUES FOR <seq>, <N>}} that does the
> same, but does not invalidate the cache.
> Note that in either case we won't retrieve the reserved set of values via
> {{NEXT VALUE FOR}} because we'd need to be idempotent in our case. All we
> need to guarantee is that after a call to {{RESERVE VALUES FOR <seq>, <N>}},
> which returns a value <M>, the range [M, M+N) won't be used by any
> other user of the sequence. We might need to reserve 1bn ids this way ahead of a
> map reduce run.
> Any better ideas?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)