[
https://issues.apache.org/jira/browse/PHOENIX-1570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265085#comment-14265085
]
Maryann Xue commented on PHOENIX-1570:
--------------------------------------
Thanks again [~bdifn] for your analysis, which was mostly correct. I'll
restate the problem to hopefully make it even clearer.
Everything else worked correctly except for the final expression
evaluation in column projectors. I think the root cause lies in the
ExpressionCompiler for LocalIndexDataColumnRef, which compiles a column reference
into a ProjectedColumnExpression. The ProjectedColumnExpression object holds a
KeyValueSchema object that evolves as compilation proceeds, which means the
schema is different when compiling "a" and "s". The "columns" passed to
ProjectedColumnExpression by LocalIndexDataColumnRef is not "final" but just a
"snapshot" (LocalIndexDataColumnRef.java:70).
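To illustrate the snapshot issue in isolation, here is a minimal sketch. The classes SnapshotSchemaDemo and CompiledRef are hypothetical stand-ins, not Phoenix's actual ExpressionCompiler or ProjectedColumnExpression; the sketch only shows how references compiled one at a time against a growing column list each end up seeing a different field count.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: these classes are hypothetical stand-ins, not Phoenix's real API.
final class SnapshotSchemaDemo {

    /** Stand-in for a compiled column reference that keeps the schema size it saw at compile time. */
    static final class CompiledRef {
        final String name;
        final int position;
        final int fieldCountSeen; // size of the snapshot, frozen at compile time

        CompiledRef(String name, int position, List<String> columnsSoFar) {
            this.name = name;
            this.position = position;
            // Copying here mimics taking a "snapshot" rather than sharing the final schema.
            this.fieldCountSeen = new ArrayList<>(columnsSoFar).size();
        }
    }

    /** Compiles references one by one while the projected column list keeps growing. */
    static List<CompiledRef> compile(List<String> selectList) {
        List<String> columns = new ArrayList<>();
        List<CompiledRef> refs = new ArrayList<>();
        for (String name : selectList) {
            columns.add(name);
            refs.add(new CompiledRef(name, columns.size() - 1, columns));
        }
        return refs;
    }

    public static void main(String[] args) {
        for (CompiledRef ref : compile(Arrays.asList("a", "b", "c"))) {
            System.out.println(ref.name + " compiled against " + ref.fieldCountSeen + " field(s)");
        }
    }
}
```

In this toy, "a" is compiled against 1 field while "c" is compiled against 3, even though all three references are later evaluated against the same rows.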
Luckily, in many cases this won't produce wrong results. But in
[~bdifn]'s case, the field count of the schema just crossed the boundary
between non-variable-length and variable-length. The interpretation of earlier
columns (from "a" to "r") still treated the schema as a non-variable-length one
and thus read the wrong bit information.
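A rough sketch of why a stale field count can lead to wrong bits being read. Everything here is an assumption made for illustration: the class BitLayoutDemo, the threshold of 16, the byte layout, and the field counts 19 and 16 are hypothetical and are not taken from Phoenix's actual ValueBitSet.

```java
import java.nio.ByteBuffer;

// Toy model only: layout and threshold are assumptions, not Phoenix's real ValueBitSet.
final class BitLayoutDemo {
    static final int FIXED_LIMIT = 16; // assumed boundary between the two layouts

    /** Encodes presence bits using the layout implied by fieldCount (at most 64 fields here). */
    static byte[] encode(long presentMask, int fieldCount) {
        if (fieldCount <= FIXED_LIMIT) {
            // "Non-variable-length": a single 2-byte mask.
            return ByteBuffer.allocate(2).putShort((short) presentMask).array();
        }
        // "Variable-length": one 8-byte word followed by a 2-byte word count.
        return ByteBuffer.allocate(10).putLong(presentMask).putShort((short) 1).array();
    }

    /** Reads the presence bit for one field, trusting the caller's idea of fieldCount. */
    static boolean isPresent(byte[] encoded, int fieldCount, int position) {
        ByteBuffer buf = ByteBuffer.wrap(encoded);
        long mask = fieldCount <= FIXED_LIMIT ? (buf.getShort() & 0xFFFFL) : buf.getLong();
        return (mask & (1L << position)) != 0;
    }

    public static void main(String[] args) {
        int finalCount = 19; // illustrative schema size when the row was written
        int staleCount = 16; // illustrative schema size seen by an early-compiled expression
        byte[] row = encode((1L << finalCount) - 1, finalCount); // every field present

        System.out.println(isPresent(row, finalCount, 2)); // true: reader and writer agree
        System.out.println(isPresent(row, staleCount, 2)); // false: stale count misreads the bits
    }
}
```

In this toy, the reader with the stale count takes the first two bytes of the 8-byte word as its mask, finds them zero, and reports a present field as missing, which would surface as NULL.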
I think the fix may not be a complete solution for the problem. But on the
other hand, a complete one might require adding a whole new pass over the AST
before expression compilation (incl. group-by, order-by as well as select) in
order to get a final schema for the local index. I can think of a workaround
though, which is to remove the "non-variable-length" optimization from
ValueBitSet and make everything "variable-length". What do you think,
[~jamestaylor]?
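As a sketch of the idea behind that workaround: if only one layout exists, decoding no longer depends on the field count the reader believes the schema has. The class SingleLayoutDemo and its byte layout are hypothetical and are not Phoenix's actual ValueBitSet.

```java
import java.nio.ByteBuffer;

// Toy model only: a hypothetical single-layout encoding, not Phoenix's actual ValueBitSet.
final class SingleLayoutDemo {

    /** Always writes an 8-byte word plus a 2-byte word count, whatever the field count. */
    static byte[] encode(long presentMask) {
        return ByteBuffer.allocate(10).putLong(presentMask).putShort((short) 1).array();
    }

    /** Decoding takes no field count, so a stale schema snapshot cannot change the result. */
    static boolean isPresent(byte[] encoded, int position) {
        long mask = ByteBuffer.wrap(encoded).getLong();
        return (mask & (1L << position)) != 0;
    }

    public static void main(String[] args) {
        byte[] row = encode(0b101L); // fields 0 and 2 present
        System.out.println(isPresent(row, 0)); // true
        System.out.println(isPresent(row, 1)); // false
        System.out.println(isPresent(row, 2)); // true
    }
}
```

The cost in this toy is size: every row carries 10 bytes of bit information, even when a 2-byte mask would have been enough.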
> Data missing when using local index
> -----------------------------------
>
> Key: PHOENIX-1570
> URL: https://issues.apache.org/jira/browse/PHOENIX-1570
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.2.1, 4.2.2
> Environment: ubuntu
> HBase 0.98.7
> Hadoop 2.5.1
> OS: ubuntu
> Reporter: wuchengzhi
> Priority: Critical
>
> 1. create a table with the schema below:
> CREATE TABLE IF NOT EXISTS Miss_data_table(
> a BIGINT NOT NULL,
> b VARCHAR,
> c INTEGER,
> d INTEGER,
> e INTEGER,
> f INTEGER,
> g VARCHAR,
> h VARCHAR,
> i INTEGER,
> j VARCHAR,
> k INTEGER,
> l VARCHAR,
> m VARCHAR,
> n INTEGER,
> o INTEGER,
> p VARCHAR,
> q VARCHAR,
> r INTEGER,
> s BIGINT,
> t VARCHAR CONSTRAINT pk PRIMARY KEY(a))
> 2. create a local index on the table for column q:
> create local index idx_q on Miss_data_table (q);
> 3. upsert data into the table:
> upsert into Miss_data_table
> values(96660688,'hello/TEST-0',156,-1,-1,0,'2013-02-14 18:34:05.0','TEST-1',0,'495839182',0,'50','',0,0,'1818378','102218',0,26,'20141201')
> 4. execute the queries:
> select a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t from Miss_data_table where q = '102218';
> +----------+--------------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+--------+------+------+----------+
> | A        | B            | C    | D    | E    | F    | G    | H    | I    | J    | K    | L    | M    | N    | O    | P    | Q      | R    | S    | T        |
> +----------+--------------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+--------+------+------+----------+
> | 96660688 | hello/TEST-0 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 102218 | NULL | 26   | 20141201 |
> +----------+--------------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+--------+------+------+----------+
> select a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t from Miss_data_table where a=96660688;
> +----------+--------------+------+------+------+------+-----------------------+--------+------+-----------+------+------+------+------+------+---------+--------+------+------+----------+
> | A        | B            | C    | D    | E    | F    | G                     | H      | I    | J         | K    | L    | M    | N    | O    | P       | Q      | R    | S    | T        |
> +----------+--------------+------+------+------+------+-----------------------+--------+------+-----------+------+------+------+------+------+---------+--------+------+------+----------+
> | 96660688 | hello/TEST-0 | 156  | -1   | -1   | 0    | 2013-02-14 18:34:05.0 | TEST-1 | 0    | 495839182 | 0    | 50   | NULL | 0    | 0    | 1818378 | 102218 | 0    | 26   | 20141201 |
> +----------+--------------+------+------+------+------+-----------------------+--------+------+-----------+------+------+------+------+------+---------+--------+------+------+----------+
> // The explain plan shows that the data is fetched via the local index.
> explain select a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t from Miss_data_table where q = '102218';
> +------------------------------------------+
> | PLAN |
> +------------------------------------------+
> | CLIENT 1-CHUNK PARALLEL 1-WAY RANGE SCAN OVER _LOCAL_IDX_TEST.MISS_DATA_TABLE [-32768,'102218'] |
> | CLIENT MERGE SORT |
> +------------------------------------------+