[
https://issues.apache.org/jira/browse/DRILL-5830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Paul Rogers updated DRILL-5830:
-------------------------------
Description:
DRILL-5546 added a number of fixes for empty batches. One part of the fix was
for HBase. Key changes:
* Add code to expand wildcards (i.e., SELECT *) in the planner.
* Remove support for wildcards in the HBase record reader.
As noted in DRILL-5775, this change had the effect of breaking support for
MapR-DB binary (which is API-compatible with HBase). DRILL-5775 addresses this
by expanding wildcards in the planner for MapR-DB, as was done for HBase in
DRILL-5546.
Unfortunately, this change introduced other regressions into the code as
described by DRILL-5706.
Investigation of those issues revealed that we should back out the original
DRILL-5546 changes and go down a different route.
As it turns out, HBase already had a project push-down rule that expanded
wildcards. However, that rule did not always work correctly.
DRILL-5546 fixed that bug, ensuring that wildcards are expanded (at least in
the cases tested for this ticket).
The actual issue turned out to be a bug in the {{RecordBatchLoader}} class,
which did not consider map contents when detecting a schema change. As a
result, batches with schemas like (row_key, cf\{}) were treated the same as
(row_key, cf\{mycol}), and the actual data columns were discarded
intermittently, depending on batch arrival order.
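To illustrate the kind of comparison involved, here is a minimal sketch. It is
not the actual {{RecordBatchLoader}} code: the {{Field}} class and both
comparison methods are hypothetical stand-ins, and data types are omitted for
brevity. It only shows why a check that looks at top-level column names alone
cannot tell (row_key, cf\{}) apart from (row_key, cf\{mycol}), while a check
that recurses into map members can.

```java
import java.util.Arrays;
import java.util.List;

public class SchemaChangeSketch {

    /** Hypothetical column: a name plus child columns (non-empty only for maps). */
    static final class Field {
        final String name;
        final List<Field> children;

        Field(String name, Field... children) {
            this.name = name;
            this.children = Arrays.asList(children);
        }
    }

    /** Shallow check: compares top-level names only and ignores map contents. */
    static boolean sameSchemaShallow(List<Field> a, List<Field> b) {
        if (a.size() != b.size()) {
            return false;
        }
        for (int i = 0; i < a.size(); i++) {
            if (!a.get(i).name.equals(b.get(i).name)) {
                return false;
            }
        }
        return true;
    }

    /** Deep check: compares names and then recurses into each map's members. */
    static boolean sameSchemaDeep(List<Field> a, List<Field> b) {
        if (a.size() != b.size()) {
            return false;
        }
        for (int i = 0; i < a.size(); i++) {
            Field x = a.get(i);
            Field y = b.get(i);
            if (!x.name.equals(y.name)) {
                return false;
            }
            if (!sameSchemaDeep(x.children, y.children)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // (row_key, cf{}): the column family map has no members.
        List<Field> emptyMap = Arrays.asList(new Field("row_key"), new Field("cf"));
        // (row_key, cf{mycol}): the column family map holds one data column.
        List<Field> withCol = Arrays.asList(
            new Field("row_key"), new Field("cf", new Field("mycol")));

        System.out.println("shallow: " + sameSchemaShallow(emptyMap, withCol)); // shallow: true
        System.out.println("deep: " + sameSchemaDeep(emptyMap, withCol));       // deep: false
    }
}
```

With the shallow check, whichever batch arrives first fixes the schema and the
other is treated as having the same layout, which is consistent with the
order-dependent loss of columns described above.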
was:
DRILL-5546 added a number of fixes for empty batches. One part of the fix was
for HBase. Key changes:
* Add code to expand wildcards (i.e., SELECT *) in the planner.
* Remove support for wildcards in the HBase record reader.
As noted in DRILL-5775, this change had the effect of breaking support for
MapR-DB binary (which is API-compatible with HBase). DRILL-5775 addresses this
by expanding wildcards in the planner for MapR-DB, as was done for HBase in
DRILL-5546.
Unfortunately, this change introduced other regressions into the code as
described by DRILL-5706.
Investigation of those issues revealed that we should back out the original
DRILL-5546 changes and go down a different route.
> Resolve regressions to MapR DB from DRILL-5546
> ----------------------------------------------
>
> Key: DRILL-5830
> URL: https://issues.apache.org/jira/browse/DRILL-5830
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.12.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Fix For: 1.12.0