Github user JamesRTaylor commented on a diff in the pull request:
https://github.com/apache/incubator-phoenix/pull/8#discussion_r9957239
--- Diff:
phoenix-core/src/main/java/org/apache/phoenix/expression/ArrayConstructorExpression.java
---
@@ -62,27 +63,54 @@ public void reset() {
position = 0;
Arrays.fill(elements, null);
}
-
+
@Override
public boolean evaluate(Tuple tuple, ImmutableBytesWritable ptr) {
- for (int i = position >= 0 ? position : 0; i < elements.length; i++) {
- Expression child = children.get(i);
- if (!child.evaluate(tuple, ptr)) {
- if (tuple != null && !tuple.isImmutable()) {
- if (position >= 0) position = i;
- return false;
+ try {
+ int offset = 0;
+ // track the elementlength for variable array
+ int noOfElements = children.size();
+ int elementLength = 0;
+ byteStream = new TrustedByteArrayOutputStream(estimatedSize);
+ oStream = new DataOutputStream(byteStream);
+ for (int i = position >= 0 ? position : 0; i < elements.length; i++) {
+ Expression child = children.get(i);
+ if (!child.evaluate(tuple, ptr)) {
+ if (tuple != null && !tuple.isImmutable()) {
+ if (position >= 0) position = i;
+ return false;
+ }
+ } else {
+ // track the offset position here from the size of the byteStream
+ if (!baseType.isFixedWidth()) {
+ offset = byteStream.size();
+ offsetPos[i] = offset;
--- End diff --
When you come out of the loop, just subtract nNulls from i, so you'd have
the same array 0 4 4 10 10 16, but you'd pass through how many values are
in the array to your serialization function (instead of assuming that the
entire array of positions is being used).
We can't store trailing nulls; it'll mess things up.
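
A minimal sketch of that suggestion, assuming nNulls counts the run of
nulls at the end of the array (it resets whenever a non-null value is
seen). The class name TrailingNulls and the helper usedCount are
hypothetical and are not part of the patch; the offsets are the ones from
the example in this thread.

```java
import java.util.Arrays;

public class TrailingNulls {
    // Hypothetical helper: returns how many leading elements should be
    // serialized, i.e. everything up to and including the last non-null value.
    static int usedCount(Object[] values) {
        int nNulls = 0; // length of the current run of consecutive nulls
        int i = 0;
        for (; i < values.length; i++) {
            // A non-null value ends the run, so only trailing nulls remain counted.
            nNulls = (values[i] == null) ? nNulls + 1 : 0;
        }
        return i - nNulls;
    }

    public static void main(String[] args) {
        String[] values = {"abc", null, "bcd", null, "ced", null};
        int[] offsetPos = {0, 4, 4, 10, 10, 16}; // offsets quoted in the thread

        int n = usedCount(values);
        // Only the first n offset positions would be handed to serialization.
        int[] used = Arrays.copyOf(offsetPos, n);
        System.out.println(n + " " + Arrays.toString(used)); // prints "5 [0, 4, 4, 10, 10]"
    }
}
```

With the first example (abc, null, bcd, null, null, b) nothing is trimmed,
since the last element is not null and the count stays at 6.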
On Fri, Feb 21, 2014 at 10:54 AM, ramkrish86
<[email protected]>wrote:
> In phoenix-core/src/main/java/org/apache/phoenix/expression/ArrayConstructorExpression.java:
>
> > + int noOfElements = children.size();
> > + int elementLength = 0;
> > + byteStream = new TrustedByteArrayOutputStream(estimatedSize);
> > + oStream = new DataOutputStream(byteStream);
> > + for (int i = position >= 0 ? position : 0; i < elements.length; i++) {
> > + Expression child = children.get(i);
> > + if (!child.evaluate(tuple, ptr)) {
> > + if (tuple != null && !tuple.isImmutable()) {
> > + if (position >= 0) position = i;
> > + return false;
> > + }
> > + } else {
> > + // track the offset position here from the size of the byteStream
> > + if (!baseType.isFixedWidth()) {
> > + offset = byteStream.size();
> > + offsetPos[i] = offset;
>
> Take this case
> abc, null, bcd, null, null, b
> With the above logic, where we record an offset for both nulls and
> non-nulls, the offsets for this would be
> 0 4 4 10 10 10
> During deserialization I know there are 6 elements, and we always need
> to compare two successive offsets to know the length.
> For the first element it would be 4 - 0 = 4
> The next element is a null: 4 - 4 = 0 (fine, no problem)
> For the next element it is now 10 - 4 = 6 (but this is wrong), because we
> have already added the separator byte and the null value counter along
> with a separator. So we need some logic to adjust for this. I have done
> that. Not a problem. So I would track nulls here and add two bytes to
> currOff while reading the next element.
> The same happens for the last element: because we cannot compare the last
> element with any other element, we only know its offset, and if there was
> a null before it then we need to adjust that offset to read the exact
> element.
> Now take a case where there are trailing nulls
>
> abc, null, bcd, null, ced, null. The offset array would be
> 0 4 4 10 10 16
> In this case:
> 1st element = 4 - 0 = 4
> 2nd element = 4 - 4 = 0 (so null)
> 3rd element = 10 - 4 = 6 (but I have applied logic to skip the separator
> byte, so I am able to read this)
> 4th element = 10 - 10 = 0
> 5th element = 16 - 10 = 6 (adjust offset)
> The 6th element is actually a null. But how do I know that? We have only
> the last element's offset in hand, and from that alone we cannot infer
> the presence of a null. Am I missing something here?
>
> I tried changing the logic of how we add the offset, but that again is
> not easy to handle in deserialization. That is why I thought it better to
> write the number of trailing nulls.
>
> The logic that deals with the byte buffer should actually know how many
> nulls there are, and how many of them repeat, so that we can fix the exact
> byte buffer size. That again needs some tweaking, but I have used this
> logic to find the trailing nulls.
>
> --
> Reply to this email directly or view it on
GitHub<https://github.com/apache/incubator-phoenix/pull/8/files#r9956836>
> .
>