[
https://issues.apache.org/jira/browse/PHOENIX-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13895002#comment-13895002
]
James Taylor edited comment on PHOENIX-10 at 2/7/14 8:41 PM:
-------------------------------------------------------------
Thanks, [~ram_krish]. You're doing a fantastic job coming up to speed on this.
Appreciate all the contributions.
Would you mind spinning up a pull request on our github mirror
(https://github.com/apache/incubator-phoenix) for this change? We're trying to
follow the same methodology as the apache jcloud guys documented here:
http://wiki.apache.org/jclouds/Committers%20Guide
Essentially, you just need to clone the github mirror, apply your patch to your
cloned version, and then send a pull request back to our github mirror. This
will generate an email to our dev list. Then I can make more detailed comments
on your patch.
For now, here's some quick feedback:
- good point for GROUP BY queries. We don't want to do this for aggregate
queries at all, because they execute on the server side already so there's no
need. In your visitLeave(FunctionParseNode) call, add a SelectStatement member
variable to ProjectionCompiler and add this check:
if (!select.isAggregate() &&
ArrayIndexFunction.NAME.equals(node.getName())) {
- you shouldn't need that instanceof check for if(func instanceof
ArrayIndexFunction) a few lines below this
- on the server-side changes in ScanRegionObserver, create the KeyValueSchema
once in the deserializeArrayPostionalExpressionInfoFromScan and add a member
variable for it. Also, add a member variable for the ValueBitSet created to go
with it and an ImmutableBytesWritable ptr so you don't need to create a new one
each time.
- change you replaceArrayIndexElement only remove the KVs in the loop (and use
remove(idx) as you have the list element position - I don't think there's a
need to allocate a new list of KVs).
- at the end of the loop, evaluate the arrayFuncRefs like this (make
arrayFuncRefs an array instead of a list in
deserializeArrayPostionalExpressionInfoFromScan):
{code}
Tuple tuple = new MultiKeyValueTuple(result);
// Evaluate the array index func references against the current tuple
byte[] value = keyValueSchema.toBytes(tuple, arrayFuncRefs, valueBitSet,
ptr);
result.add(new KeyValue(QueryConstants.ARRAY_DUMMY_ROW,
ScanProjector.VALUE_COLUMN_FAMILY,
ScanProjector.VALUE_COLUMN_QUALIFIER,
value));
{code}
was (Author: jamestaylor):
Thanks, [~ram_krish]. You're doing a fantastic job coming up to speed on this.
Appreciate all the contributions.
Would you mind spinning up a pull request on our github mirror
(https://github.com/apache/incubator-phoenix) for this change? We're trying to
follow the same methodology as the apache jcloud guys documented here:
http://wiki.apache.org/jclouds/Committers%20Guide
Essentially, you just need to clone the github mirror, apply your patch to your
cloned version, and then send a pull request back to our github mirror. This
will generate an email to our dev list. Then I can make more detailed comments
on your patch.
For now, here's some quick feedback:
- good point for GROUP BY queries. We don't want to do this for aggregate
queries at all, because they execute on the server side already so there's no
need. In your visitLeave(FunctionParseNode) call, add a SelectStatement member
variable to ProjectionCompiler and add this check:
if (!select.isAggregate() &&
ArrayIndexFunction.NAME.equals(node.getName())) {
- you shouldn't need that instanceof check for if(func instanceof
ArrayIndexFunction) a few lines below this
- on the server-side changes in ScanRegionObserver, create the KeyValueSchema
once in the deserializeArrayPostionalExpressionInfoFromScan and add a member
variable for it. Also, add a member variable for the ValueBitSet created to go
with it and an ImmutableBytesWritable ptr so you don't need to create a new one
each time.
- change you replaceArrayIndexElement only remove the KVs in the loop (and use
remove(idx) as you have the list element position - I don't think there's a
need to allocate a new list of KVs).
- at the end of the loop, evaluate the arrayFuncRefs like this:
{code}
Tuple tuple = new MultiKeyValueTuple(result);
// Evaluate the array index func references against the current tuple
byte[] value = keyValueSchema.toBytes(tuple, arrayFuncRefs, valueBitSet,
ptr);
result.add(new KeyValue(QueryConstants.ARRAY_DUMMY_ROW,
ScanProjector.VALUE_COLUMN_FAMILY,
ScanProjector.VALUE_COLUMN_QUALIFIER,
value));
{code}
> Push projection of a single ARRAY element to the server
> -------------------------------------------------------
>
> Key: PHOENIX-10
> URL: https://issues.apache.org/jira/browse/PHOENIX-10
> Project: Phoenix
> Issue Type: Improvement
> Reporter: James Taylor
> Assignee: ramkrishna.s.vasudevan
> Attachments: Phoenix-10_1.patch, Phoenix-10_2.patch
>
>
> If only a single array element is selected, we'll still return the entire
> array back to the client. Instead, we should push this to the server and only
> return the single array element. The same goes for the reference to an ARRAY
> in the WHERE clause. There's a general HBase fix for this (i.e. the ability
> to define a separate set of key values that will be returned versus key
> values available to filters) that has a patch here, but is deemed not
> possible to pull into the 0.94 branch by @lhofhansl.
> My thought is that we can add a Filter at the end our our filter chain that
> filters out any KeyValues that aren't in the SELECT expressions (i.e. filter
> out if a column is referenced in the WHERE clause, but not in the SELECT
> expressions). This same Filter could handle returning only the elements of
> the array that are referenced in the SELECT expression rather than the entire
> array.
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)