[
https://issues.apache.org/jira/browse/PHOENIX-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14257992#comment-14257992
]
James Taylor commented on PHOENIX-1516:
---------------------------------------
Wanted to ask you too, [~lhofhansl] - do you think it makes sense for us to
preserve the same order of reset calls on the client side as on the server side? I
didn't really explain this well, but on the server side the order (i.e. whether
*reset* is called after a row has been processed or *before* the next row is
processed) is determined by when HBase calls Filter.reset(), as that's our
callback.
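To make that callback ordering concrete, here's a minimal, hypothetical sketch (the filter and
field names are made up, and it assumes HBase's FilterBase API) - the per-row state gets
cleared whenever the region server decides to invoke reset(), so the server side doesn't get
to choose between resetting after a row versus before the next one:
{code}
import java.io.IOException;

import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.filter.FilterBase;

// Hypothetical filter, only to illustrate when HBase drives the reset:
// the region server calls reset() between rows, so any per-row state is
// cleared at a point chosen by HBase, not by the filter itself.
public class PerRowStateFilter extends FilterBase {
    private int cellsSeenInCurrentRow;

    @Override
    public void reset() throws IOException {
        // Our only hook for per-row cleanup on the server side
        cellsSeenInCurrentRow = 0;
    }

    @Override
    public ReturnCode filterKeyValue(Cell cell) throws IOException {
        cellsSeenInCurrentRow++;
        return ReturnCode.INCLUDE;
    }
}
{code}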
For the issue you discovered with UPSERT SELECT, I'd vote for cloning the
RowProjector for each thread that processes the SELECT and runs the UPSERT, so that
we can keep our optimization. Here's the code that would go in UpsertCompiler:
{code}
private static class UpsertingParallelIteratorFactory extends MutatingParallelIteratorFactory {
    private final RowProjector projector;
    private int[] columnIndexes;
    private int[] pkSlotIndexes;
    private final TableRef tableRef;
    private final byte[][] rowExpressions;

    private UpsertingParallelIteratorFactory(PhoenixConnection connection, TableRef tableRef,
            RowProjector projector) {
        super(connection);
        this.tableRef = tableRef;
        this.projector = projector;
        // Create bytes for a serialized version of each projected Expression with which to clone
        this.rowExpressions = new byte[projector.getColumnCount()][];
        for (int i = 0; i < rowExpressions.length; i++) {
            TrustedByteArrayOutputStream bytesOut = new TrustedByteArrayOutputStream(1024);
            DataOutputStream output = new DataOutputStream(bytesOut);
            Expression expression = projector.getColumnProjector(i).getExpression();
            try {
                WritableUtils.writeVInt(output, ExpressionType.valueOf(expression).ordinal());
                expression.write(output);
            } catch (IOException e) {
                throw new RuntimeException(e); // not possible over a byte array stream
            }
            rowExpressions[i] = bytesOut.getBuffer();
        }
    }

    // Deserialize an independent copy of each projected expression so that every
    // thread running the SELECT side of the UPSERT gets its own RowProjector.
    private RowProjector cloneRowProjector() {
        List<ColumnProjector> colProjectors =
                Lists.newArrayListWithExpectedSize(this.rowExpressions.length);
        for (int i = 0; i < rowExpressions.length; i++) {
            ByteArrayInputStream bytesIn = new ByteArrayInputStream(rowExpressions[i]);
            DataInputStream input = new DataInputStream(bytesIn);
            Expression expression;
            try {
                expression = ExpressionType.values()[WritableUtils.readVInt(input)].newInstance();
                expression.readFields(input);
            } catch (IOException e) {
                throw new RuntimeException(e); // not possible over a byte array stream
            }
            ColumnProjector colProjector = projector.getColumnProjector(i);
            colProjectors.add(new ExpressionProjector(colProjector.getName(),
                    colProjector.getTableName(),
                    expression,
                    colProjector.isCaseSensitive()));
        }
        return new RowProjector(colProjectors,
                projector.getEstimatedRowByteSize(),
                projector.isProjectEmptyKeyValue());
    }

    @Override
    protected MutationState mutate(StatementContext context, ResultIterator iterator,
            PhoenixConnection connection) throws SQLException {
        PhoenixStatement statement = new PhoenixStatement(connection);
        if (context.getSequenceManager().getSequenceCount() > 0) {
            throw new IllegalStateException("Cannot pipeline upsert when sequence is referenced");
        }
        // Clone per thread so the projection state isn't shared across threads
        RowProjector clonedProjector = cloneRowProjector();
        return upsertSelect(statement, tableRef, clonedProjector, iterator,
                columnIndexes, pkSlotIndexes);
    }
}
{code}
> Add RANDOM built-in function
> ----------------------------
>
> Key: PHOENIX-1516
> URL: https://issues.apache.org/jira/browse/PHOENIX-1516
> Project: Phoenix
> Issue Type: Bug
> Reporter: Lars Hofhansl
> Assignee: Lars Hofhansl
> Attachments: 1516-v2.txt, 1516-v3.txt, 1516.txt
>
>
> I often find it useful to generate some rows with random data.
> Here's a simple RANDOM() function that we could use for that.