[GitHub] [incubator-doris] imay commented on a change in pull request #2319: Add Bitmap index reader

2019-12-03 Thread GitBox
imay commented on a change in pull request #2319: Add Bitmap index reader
URL: https://github.com/apache/incubator-doris/pull/2319#discussion_r353027079
 
 

 ##
 File path: be/src/olap/rowset/segment_v2/column_reader.cpp
 ##
 @@ -536,17 +518,17 @@ Status DefaultValueColumnIterator::init(const 
ColumnIteratorOptions& opts) {
 return Status::OK();
 }
 
-Status DefaultValueColumnIterator::next_batch(size_t* n, ColumnBlock* dst) {
+Status DefaultValueColumnIterator::next_batch(size_t* n, ColumnBlockView* dst) 
{
+if (dst->is_nullable()) {
+dst->set_null_bits(*n, _is_default_value_null);
+}
+
 if (_is_default_value_null) {
-for (int i = 0; i < *n; ++i) {
-dst->set_is_null(i, true);
-}
+dst->advance(*n);
 } else {
 for (int i = 0; i < *n; ++i) {
-memcpy(dst->mutable_cell_ptr(i), _mem_value.data(), _value_size);
-if (dst->is_nullable()) {
-dst->set_is_null(i, false);
-}
+memcpy(dst->data(), _mem_value.data(), _value_size);
+dst->advance(1);
 
 Review comment:
   OK


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] imay commented on a change in pull request #2319: Add Bitmap index reader

2019-12-02 Thread GitBox
imay commented on a change in pull request #2319: Add Bitmap index reader
URL: https://github.com/apache/incubator-doris/pull/2319#discussion_r352645052
 
 

 ##
 File path: be/src/olap/rowset/segment_v2/bitmap_index_reader.cpp
 ##
 @@ -17,10 +17,61 @@
 
 #include "olap/rowset/segment_v2/bitmap_index_reader.h"
 
+#include "olap/types.h"
+
 namespace doris {
 namespace segment_v2 {
 
+Status BitmapIndexReader::load() {
+const IndexedColumnMetaPB& dict_meta = _bitmap_index_meta.dict_column();
+const IndexedColumnMetaPB& bitmap_meta = 
_bitmap_index_meta.bitmap_column();
+_has_null = _bitmap_index_meta.has_null();
+
+_dict_column_reader.reset(new IndexedColumnReader(_file, dict_meta));
+_bitmap_column_reader.reset(new IndexedColumnReader(_file, bitmap_meta));
+RETURN_IF_ERROR(_dict_column_reader->load());
+RETURN_IF_ERROR(_bitmap_column_reader->load());
+return Status::OK();
+}
+
+Status BitmapIndexReader::new_iterator(BitmapIndexIterator** iterator) {
+*iterator = new BitmapIndexIterator(this);
+return Status::OK();
+}
+
+Status BitmapIndexIterator::seek_dictionary(const void* value, bool* 
exact_match) {
+RETURN_IF_ERROR(_dict_column_iter.seek_at_or_after(value, exact_match));
+_current_rowid = _dict_column_iter.get_current_ordinal();
+return Status::OK();
+}
+
+Status BitmapIndexIterator::read_bitmap(rowid_t ordinal, Roaring* result) {
+DCHECK(0 <= ordinal && ordinal < _reader->bitmap_nums());
+
+Slice value;
+uint8_t nullmap;
+size_t num_to_read = 1;
+ColumnBlock block(_reader->type_info(), (uint8_t*)&value, &nullmap, num_to_read, _pool.get());
 
 Review comment:
   Should call _pool.clear() here to avoid an excessive memory footprint.
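
A minimal sketch of why clearing a reused pool matters. `ToyPool` and `ToyIterator` are hypothetical stand-ins invented for illustration, not Doris's `MemPool` or `BitmapIndexIterator`; the assumption is only that the pool keeps every allocation alive until it is cleared.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical arena standing in for a memory pool: it only grows until clear().
class ToyPool {
public:
    char* allocate(std::size_t n) {
        _chunks.push_back(std::make_unique<char[]>(n));
        _bytes += n;
        return _chunks.back().get();
    }
    void clear() {
        _chunks.clear();
        _bytes = 0;
    }
    std::size_t bytes_in_use() const { return _bytes; }

private:
    std::vector<std::unique_ptr<char[]>> _chunks;
    std::size_t _bytes = 0;
};

// Hypothetical iterator that owns its pool as a member and reuses it per read.
class ToyIterator {
public:
    // Reads one value of `size` bytes and returns the pool footprint afterwards.
    // Clearing first bounds the footprint to one value instead of the sum of
    // all values read so far.
    std::size_t read(std::size_t size, bool clear_first) {
        if (clear_first) _pool.clear();
        char* buf = _pool.allocate(size);
        if (size > 0) buf[0] = 1;  // pretend to decode something into the buffer
        return _pool.bytes_in_use();
    }

private:
    ToyPool _pool;
};
```

With `clear_first == false` the footprint grows with every call; with `clear_first == true` it stays at the size of the most recent value.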





[GitHub] [incubator-doris] imay commented on a change in pull request #2319: Add Bitmap index reader

2019-12-02 Thread GitBox
imay commented on a change in pull request #2319: Add Bitmap index reader
URL: https://github.com/apache/incubator-doris/pull/2319#discussion_r352652541
 
 

 ##
 File path: be/src/olap/rowset/segment_v2/column_reader.cpp
 ##
 @@ -278,6 +250,17 @@ Status ColumnReader::_load_zone_map_index() {
 return Status::OK();
 }
 
+Status ColumnReader::_load_bitmap_index() {
+if (_meta.has_bitmap_index()) {
+const BitmapIndexColumnPB bitmap_index_meta = _meta.bitmap_index();
 
 Review comment:
   ```suggestion
   const BitmapIndexColumnPB& bitmap_index_meta = _meta.bitmap_index();
   ```
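
To illustrate the suggestion: protobuf accessors for message fields return a `const` reference, so binding the result by value makes a full copy while binding by `const&` does not. A minimal sketch, assuming hypothetical `Meta` and `Holder` types in place of the generated classes; the global copy counter exists only to make the difference observable.

```cpp
#include <cassert>
#include <string>

// Counts copy-constructor calls; for illustration only.
int g_copies = 0;

// Hypothetical stand-in for a protobuf message.
struct Meta {
    std::string payload;
    Meta() = default;
    Meta(const Meta& other) : payload(other.payload) { ++g_copies; }
};

// Hypothetical holder whose accessor returns a const reference.
struct Holder {
    Meta meta;
    const Meta& bitmap_index() const { return meta; }
};

// Number of copies made when the result is bound by value.
inline int copies_by_value(const Holder& h) {
    g_copies = 0;
    const Meta m = h.bitmap_index();  // copy constructor runs
    (void)m;
    return g_copies;
}

// Number of copies made when the result is bound by const reference.
inline int copies_by_reference(const Holder& h) {
    g_copies = 0;
    const Meta& m = h.bitmap_index();  // no copy
    (void)m;
    return g_copies;
}
```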





[GitHub] [incubator-doris] imay commented on a change in pull request #2319: Add Bitmap index reader

2019-12-02 Thread GitBox
imay commented on a change in pull request #2319: Add Bitmap index reader
URL: https://github.com/apache/incubator-doris/pull/2319#discussion_r352637982
 
 

 ##
 File path: be/src/olap/comparison_predicate.cpp
 ##
 @@ -138,6 +140,84 @@ COMPARISON_PRED_COLUMN_BLOCK_EVALUATE(LessEqualPredicate, 
<=)
 COMPARISON_PRED_COLUMN_BLOCK_EVALUATE(GreaterPredicate, >)
 COMPARISON_PRED_COLUMN_BLOCK_EVALUATE(GreaterEqualPredicate, >=)
 
+#define BITMAP_COMPARE_EqualPredicate(s, exact_match, seeked_ordinal, 
iterator, bitmap, roaring) \
+if (!s.is_not_found()) { \
+if (!s.ok()) { return s; } \
+if (exact_match) { \
+RETURN_IF_ERROR(iterator->read_bitmap(seeked_ordinal, 
)); \
+} \
+} \
 
 Review comment:
   Better to remove the trailing \ on the last line.





[GitHub] [incubator-doris] imay commented on a change in pull request #2319: Add Bitmap index reader

2019-12-02 Thread GitBox
imay commented on a change in pull request #2319: Add Bitmap index reader
URL: https://github.com/apache/incubator-doris/pull/2319#discussion_r352639264
 
 

 ##
 File path: be/src/olap/in_list_predicate.cpp
 ##
 @@ -109,6 +109,39 @@ IN_LIST_PRED_EVALUATE(NotInListPredicate, ==)
 IN_LIST_PRED_COLUMN_BLOCK_EVALUATE(InListPredicate, !=)
 IN_LIST_PRED_COLUMN_BLOCK_EVALUATE(NotInListPredicate, ==)
 
+#define IN_LIST_PRED_BITMAP_EVALUATE(CLASS, OP) \
+template \
+Status CLASS::evaluate(const Schema& schema, const 
vector& iterators, uint32_t num_rows, Roaring* bitmap) 
const { \
+BitmapIndexIterator *iterator = iterators[_column_id]; \
+if (iterator == nullptr) { \
+return Status::OK(); \
+} \
+if (iterator->has_null_bitmap()) { \
+Roaring null_bitmap; \
+RETURN_IF_ERROR(iterator->read_null_bitmap(&null_bitmap)); \
+*bitmap -= null_bitmap; \
+} \
+Roaring roaring; \
+for (auto value:_values) { \
+bool exact_match; \
+Status s = iterator->seek_dictionary(&value, &exact_match); \
+rowid_t seeked_ordinal = iterator->current_ordinal(); \
+if (!s.is_not_found()) { \
+if (!s.ok()) { return s; } \
+if (exact_match) { \
+Roaring bitmap; \
 
 Review comment:
   Rename it; it duplicates the name of the input param.
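
A small sketch of the naming issue, using a hypothetical `RowSet` (a `std::set<int>`) in place of `Roaring`. Both functions below compute the same result; the first compiles because a nested block may shadow a parameter, but the reader has to track which `rows` is meant on each line.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-in for a bitmap: a set of row ids.
using RowSet = std::set<int>;

// Shadowed version: inside the loop body `rows` is a local object,
// outside it is the output parameter.
inline void collect_shadowed(const RowSet& candidates, RowSet* rows) {
    RowSet acc;
    for (int c : candidates) {
        RowSet rows;  // shadows the parameter within this block
        rows.insert(c);
        acc.insert(rows.begin(), rows.end());
    }
    *rows = acc;  // here `rows` is the parameter again
}

// Renamed version: same behavior, but each name has one meaning.
inline void collect_renamed(const RowSet& candidates, RowSet* rows) {
    RowSet acc;
    for (int c : candidates) {
        RowSet matched;
        matched.insert(c);
        acc.insert(matched.begin(), matched.end());
    }
    *rows = acc;
}
```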





[GitHub] [incubator-doris] imay commented on a change in pull request #2319: Add Bitmap index reader

2019-12-02 Thread GitBox
imay commented on a change in pull request #2319: Add Bitmap index reader
URL: https://github.com/apache/incubator-doris/pull/2319#discussion_r352654301
 
 

 ##
 File path: be/src/olap/rowset/segment_v2/column_reader.cpp
 ##
 @@ -536,17 +518,17 @@ Status DefaultValueColumnIterator::init(const 
ColumnIteratorOptions& opts) {
 return Status::OK();
 }
 
-Status DefaultValueColumnIterator::next_batch(size_t* n, ColumnBlock* dst) {
+Status DefaultValueColumnIterator::next_batch(size_t* n, ColumnBlockView* dst) 
{
+if (dst->is_nullable()) {
+dst->set_null_bits(*n, _is_default_value_null);
+}
+
 if (_is_default_value_null) {
-for (int i = 0; i < *n; ++i) {
-dst->set_is_null(i, true);
-}
+dst->advance(*n);
 } else {
 for (int i = 0; i < *n; ++i) {
-memcpy(dst->mutable_cell_ptr(i), _mem_value.data(), _value_size);
-if (dst->is_nullable()) {
-dst->set_is_null(i, false);
-}
+memcpy(dst->data(), _mem_value.data(), _value_size);
+dst->advance(1);
 
 Review comment:
   why 1?





[GitHub] [incubator-doris] imay commented on a change in pull request #2319: Add Bitmap index reader

2019-12-02 Thread GitBox
imay commented on a change in pull request #2319: Add Bitmap index reader
URL: https://github.com/apache/incubator-doris/pull/2319#discussion_r352658221
 
 

 ##
 File path: be/src/olap/rowset/segment_v2/index_page.h
 ##
 @@ -78,64 +78,80 @@ class IndexPageBuilder {
 // is the builder currently between finish() and reset()?
 bool _finished = false;
 faststring _buffer;
-std::vector _entry_offsets;
+uint32_t _count = 0;
 
 Review comment:
   Should update the comment to keep it consistent.





[GitHub] [incubator-doris] imay commented on a change in pull request #2319: Add Bitmap index reader

2019-12-02 Thread GitBox
imay commented on a change in pull request #2319: Add Bitmap index reader
URL: https://github.com/apache/incubator-doris/pull/2319#discussion_r352652744
 
 

 ##
 File path: be/src/olap/rowset/segment_v2/column_reader.cpp
 ##
 @@ -278,6 +250,17 @@ Status ColumnReader::_load_zone_map_index() {
 return Status::OK();
 }
 
+Status ColumnReader::_load_bitmap_index() {
+if (_meta.has_bitmap_index()) {
+const BitmapIndexColumnPB bitmap_index_meta = _meta.bitmap_index();
+_bitmap_index_reader.reset(new 
BitmapIndexReader(_file,bitmap_index_meta));
 
 Review comment:
   ```suggestion
   _bitmap_index_reader.reset(new BitmapIndexReader(_file, 
bitmap_index_meta));
   ```





[GitHub] [incubator-doris] imay commented on a change in pull request #2319: Add Bitmap index reader

2019-12-02 Thread GitBox
imay commented on a change in pull request #2319: Add Bitmap index reader
URL: https://github.com/apache/incubator-doris/pull/2319#discussion_r352637846
 
 

 ##
 File path: be/src/olap/comparison_predicate.cpp
 ##
 @@ -138,6 +140,84 @@ COMPARISON_PRED_COLUMN_BLOCK_EVALUATE(LessEqualPredicate, 
<=)
 COMPARISON_PRED_COLUMN_BLOCK_EVALUATE(GreaterPredicate, >)
 COMPARISON_PRED_COLUMN_BLOCK_EVALUATE(GreaterEqualPredicate, >=)
 
+#define BITMAP_COMPARE_EqualPredicate(s, exact_match, seeked_ordinal, 
iterator, bitmap, roaring) \
+if (!s.is_not_found()) { \
 
 Review comment:
   Better to wrap the macro body in do { } while (0).
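
The `do { } while (0)` idiom makes a multi-statement macro behave like a single statement, so it composes with `if`/`else` and takes a trailing semicolon. A minimal sketch with hypothetical macros (`BUMP_TWICE_UNSAFE`, `BUMP_TWICE_SAFE`), not the macros from the PR:

```cpp
#include <cassert>

// Hypothetical two-statement macro, written both ways for comparison.
#define BUMP_TWICE_UNSAFE(x) ++(x); ++(x)
#define BUMP_TWICE_SAFE(x) do { ++(x); ++(x); } while (0)

// With the unsafe form only the first statement is guarded by the `if`;
// the second increment always runs.
inline int bump_unsafe(bool enabled) {
    int n = 0;
    if (enabled) BUMP_TWICE_UNSAFE(n);
    return n;
}

// With the safe form the whole body is one statement, so it can be
// followed by `else` without surprises.
inline int bump_safe(bool enabled) {
    int n = 0;
    if (enabled) BUMP_TWICE_SAFE(n);
    else n = -1;
    return n;
}
```

Note that the wrapped macro must not end with a line-continuation backslash after `while (0)`, otherwise the next source line is pulled into the macro body.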





[GitHub] [incubator-doris] imay commented on a change in pull request #2319: Add Bitmap index reader

2019-12-02 Thread GitBox
imay commented on a change in pull request #2319: Add Bitmap index reader
URL: https://github.com/apache/incubator-doris/pull/2319#discussion_r352625665
 
 

 ##
 File path: be/src/olap/rowset/segment_v2/bitmap_index_reader.cpp
 ##
 @@ -17,10 +17,65 @@
 
 #include "olap/rowset/segment_v2/bitmap_index_reader.h"
 
+#include "olap/types.h"
+#include "runtime/mem_tracker.h"
+#include "runtime/mem_pool.h"
+
 namespace doris {
 namespace segment_v2 {
 
+Status BitmapIndexReader::load() {
+const IndexedColumnMetaPB dict_meta = _bitmap_index_meta.dict_column();
+const IndexedColumnMetaPB bitmap_meta = _bitmap_index_meta.bitmap_column();
+_has_null = _bitmap_index_meta.has_null();
+
+_dict_column_reader.reset(new IndexedColumnReader(_file, dict_meta));
+_bitmap_column_reader.reset(new IndexedColumnReader(_file, bitmap_meta));
+RETURN_IF_ERROR(_dict_column_reader->load());
+RETURN_IF_ERROR(_bitmap_column_reader->load());
+return Status::OK();
+}
+
+Status BitmapIndexReader::new_iterator(BitmapIndexIterator** iterator) {
+*iterator = new BitmapIndexIterator(this);
+return Status::OK();
+}
+
+Status BitmapIndexIterator::seek_dictionary(const void* value, bool* 
exact_match) {
+RETURN_IF_ERROR(_dict_column_iter.seek_at_or_after(value, exact_match));
+_current_rowid = _dict_column_iter.get_current_ordinal();
+return Status::OK();
+}
+
+Status BitmapIndexIterator::read_bitmap(rowid_t ordinal, Roaring* result) {
+DCHECK(0 <= ordinal && ordinal < _reader->bitmap_nums());
+
+Slice value;
+uint8_t nullmap;
+MemTracker mem_tracker;
+MemPool mem_pool(&mem_tracker);
+size_t num_to_read = 1;
+ColumnBlock block(get_type_info(OLAP_FIELD_TYPE_VARCHAR), (uint8_t*)&value, &nullmap, num_to_read, &mem_pool);
+ColumnBlockView column_block_view(&block);
+
+RETURN_IF_ERROR(_bitmap_column_iter.seek_to_ordinal(ordinal));
+size_t num_read = num_to_read;
+RETURN_IF_ERROR(_bitmap_column_iter.next_batch(&num_read, &column_block_view));
+DCHECK(num_to_read == num_read);
+*result = Roaring::read(value.data, false);
+return Status::OK();
+}
+
+Status BitmapIndexIterator::read_union_bitmap(rowid_t from, rowid_t to, 
Roaring* result) {
+DCHECK(0 <= from && from <= to && to <= _reader->bitmap_nums());
 
+for (rowid_t pos = from; pos < to; pos++) {
+Roaring bitmap;
+RETURN_IF_ERROR(read_bitmap(pos, &bitmap));
 
 Review comment:
   ok





[GitHub] [incubator-doris] imay commented on a change in pull request #2319: Add Bitmap index reader

2019-11-29 Thread GitBox
imay commented on a change in pull request #2319: Add Bitmap index reader
URL: https://github.com/apache/incubator-doris/pull/2319#discussion_r352057588
 
 

 ##
 File path: be/src/olap/rowset/segment_v2/indexed_column_reader.cpp
 ##
 @@ -17,55 +17,254 @@
 
 #include "olap/rowset/segment_v2/indexed_column_reader.h"
 
+#include "env/env.h" // for RandomAccessFile
+#include "gutil/strings/substitute.h" // for Substitute
+#include "olap/key_coder.h"
+#include "olap/rowset/segment_v2/encoding_info.h" // for EncodingInfo
+#include "olap/rowset/segment_v2/index_page.h" // for IndexPageReader
+#include "olap/rowset/segment_v2/options.h" // for PageDecoderOptions
+#include "olap/rowset/segment_v2/page_compression.h"
+#include "olap/rowset/segment_v2/page_decoder.h" // for PagePointer
+#include "util/crc32c.h"
+#include "util/rle_encoding.h" // for RleDecoder
+
 namespace doris {
 namespace segment_v2 {
 
-IndexedColumnReader::IndexedColumnReader(RandomAccessFile* file) {
-// TODO
-}
-Status IndexedColumnReader::init() {
-return Status(); // TODO
-}
-Status 
IndexedColumnReader::new_iterator(std::unique_ptr* iter) 
{
-return Status(); // TODO
-}
-Status IndexedColumnReader::read_page(const PagePointer& pp, PageHandle* ret) {
-return Status(); // TODO
-}
-const IndexedColumnMetaPB& IndexedColumnReader::meta() const {
-static IndexedColumnMetaPB temp;
-return temp; // TODO
-}
-bool IndexedColumnReader::has_ordinal_index() const {
-return false; // TODO
+using strings::Substitute;
+
+Status IndexedColumnReader::load() {
+_type_info = get_type_info((FieldType)_meta.data_type());
+if (_type_info == nullptr) {
+return Status::NotSupported(Substitute("unsupported typeinfo, 
type=$0", _meta.data_type()));
+}
+RETURN_IF_ERROR(EncodingInfo::get(_type_info, _meta.encoding(), 
&_encoding_info));
+RETURN_IF_ERROR(get_block_compression_codec(_meta.compression(), 
&_compress_codec));
+_validx_key_coder = get_key_coder(_type_info->type());
+
+// read and parse ordinal index page when exists
+if (_meta.has_ordinal_index_meta()) {
+if (_meta.ordinal_index_meta().is_root_data_page()) {
+_sole_data_page = 
PagePointer(_meta.ordinal_index_meta().root_page());
+} else {
+RETURN_IF_ERROR(read_page(_meta.ordinal_index_meta().root_page(), 
&_ordinal_index_page_handle));
+
RETURN_IF_ERROR(_ordinal_index_reader.parse(_ordinal_index_page_handle.data()));
+_has_index_page = true;
+}
+}
+
+// read and parse value index page when exists
+if (_meta.has_value_index_meta()) {
+if (_meta.value_index_meta().is_root_data_page()) {
+_sole_data_page = 
PagePointer(_meta.value_index_meta().root_page());
+} else {
+RETURN_IF_ERROR(read_page(_meta.value_index_meta().root_page(), 
&_value_index_page_handle));
+
RETURN_IF_ERROR(_value_index_reader.parse(_value_index_page_handle.data()));
+_has_index_page = true;
+}
+}
+_num_values = _meta.num_values();
+return Status::OK();
 }
-bool IndexedColumnReader::has_value_index() const {
-return false; // TODO
+
+Status IndexedColumnReader::read_page(const PagePointer& pp, PageHandle* 
handle) const {
+auto cache = StoragePageCache::instance();
+PageCacheHandle cache_handle;
+StoragePageCache::CacheKey cache_key(_file->file_name(), pp.offset);
+if (cache->lookup(cache_key, &cache_handle)) {
+// we find page in cache, use it
+*handle = PageHandle(std::move(cache_handle));
+return Status::OK();
+}
+// Now we read this from file.
+size_t page_size = pp.size;
+if (page_size < sizeof(uint32_t)) {
+return Status::Corruption(Substitute("Bad page, page size is too 
small, size=$0", page_size));
+}
+
+// Now we use this buffer to store page from storage, if this page is 
compressed
+// this buffer will assigned uncompressed page, and origin content will be 
freed.
+std::unique_ptr<uint8_t[]> page(new uint8_t[page_size]);
+Slice page_slice(page.get(), page_size);
+RETURN_IF_ERROR(_file->read_at(pp.offset, page_slice));
+
+size_t data_size = page_size - 4;
+if (_verify_checksum) {
+uint32_t expect = decode_fixed32_le((uint8_t*)page_slice.data + 
page_slice.size - 4);
+uint32_t actual = crc32c::Value(page_slice.data, page_slice.size - 4);
+if (expect != actual) {
+return Status::Corruption(
+Substitute("Page checksum mismatch, actual=$0 vs expect=$1", 
actual, expect));
+}
+}
+
+// remove page's suffix
+page_slice.size = data_size;
+if (_compress_codec != nullptr) {
+PageDecompressor decompressor(page_slice, _compress_codec);
+
+Slice uncompressed_page;
+RETURN_IF_ERROR(decompressor.decompress_to(&uncompressed_page));
+
+// If decompressor create new heap memory for uncompressed data,
+// assign 

[GitHub] [incubator-doris] imay commented on a change in pull request #2319: Add Bitmap index reader

2019-11-29 Thread GitBox
imay commented on a change in pull request #2319: Add Bitmap index reader
URL: https://github.com/apache/incubator-doris/pull/2319#discussion_r352067474
 
 

 ##
 File path: be/src/olap/rowset/segment_v2/bitmap_index_reader.cpp
 ##
 @@ -17,10 +17,65 @@
 
 #include "olap/rowset/segment_v2/bitmap_index_reader.h"
 
+#include "olap/types.h"
+#include "runtime/mem_tracker.h"
+#include "runtime/mem_pool.h"
+
 namespace doris {
 namespace segment_v2 {
 
+Status BitmapIndexReader::load() {
+const IndexedColumnMetaPB dict_meta = _bitmap_index_meta.dict_column();
+const IndexedColumnMetaPB bitmap_meta = _bitmap_index_meta.bitmap_column();
+_has_null = _bitmap_index_meta.has_null();
+
+_dict_column_reader.reset(new IndexedColumnReader(_file, dict_meta));
+_bitmap_column_reader.reset(new IndexedColumnReader(_file, bitmap_meta));
+RETURN_IF_ERROR(_dict_column_reader->load());
+RETURN_IF_ERROR(_bitmap_column_reader->load());
+return Status::OK();
+}
+
+Status BitmapIndexReader::new_iterator(BitmapIndexIterator** iterator) {
+*iterator = new BitmapIndexIterator(this);
+return Status::OK();
+}
+
+Status BitmapIndexIterator::seek_dictionary(const void* value, bool* 
exact_match) {
+RETURN_IF_ERROR(_dict_column_iter.seek_at_or_after(value, exact_match));
+_current_rowid = _dict_column_iter.get_current_ordinal();
+return Status::OK();
+}
+
+Status BitmapIndexIterator::read_bitmap(rowid_t ordinal, Roaring* result) {
+DCHECK(0 <= ordinal && ordinal < _reader->bitmap_nums());
+
+Slice value;
+uint8_t nullmap;
+MemTracker mem_tracker;
+MemPool mem_pool(&mem_tracker);
 
 Review comment:
   Make it a member to avoid allocating and deallocating it on each call.





[GitHub] [incubator-doris] imay commented on a change in pull request #2319: Add Bitmap index reader

2019-11-29 Thread GitBox
imay commented on a change in pull request #2319: Add Bitmap index reader
URL: https://github.com/apache/incubator-doris/pull/2319#discussion_r352065894
 
 

 ##
 File path: be/src/olap/rowset/segment_v2/bitmap_index_reader.cpp
 ##
 @@ -17,10 +17,65 @@
 
 #include "olap/rowset/segment_v2/bitmap_index_reader.h"
 
+#include "olap/types.h"
+#include "runtime/mem_tracker.h"
+#include "runtime/mem_pool.h"
+
 namespace doris {
 namespace segment_v2 {
 
+Status BitmapIndexReader::load() {
+const IndexedColumnMetaPB dict_meta = _bitmap_index_meta.dict_column();
 
 Review comment:
   ```suggestion
   const IndexedColumnMetaPB& dict_meta = _bitmap_index_meta.dict_column();
   ```





[GitHub] [incubator-doris] imay commented on a change in pull request #2319: Add Bitmap index reader

2019-11-29 Thread GitBox
imay commented on a change in pull request #2319: Add Bitmap index reader
URL: https://github.com/apache/incubator-doris/pull/2319#discussion_r352063750
 
 

 ##
 File path: be/src/olap/rowset/segment_v2/index_page.cpp
 ##
 @@ -67,57 +68,63 @@ Status IndexPageBuilder::get_first_key(Slice* key) const {
 
 ///
 
-IndexPageReader::IndexPageReader() : _parsed(false) {
-}
-
 Status IndexPageReader::parse(const Slice& data) {
-return Status(); // FIXME
-}
-
-size_t IndexPageReader::count() const {
-CHECK(_parsed) << "not parsed";
-return _footer.num_entries();
-}
+size_t buffer_len = data.size;
+const uint8_t* buffer = (uint8_t*)data.data;
+size_t footer_size = decode_fixed32_le(buffer + buffer_len - 4);
+std::string footer_buf(data.data + buffer_len - 4 - footer_size, 
footer_size);
+_footer.ParseFromString(footer_buf);
+size_t num_entries = _footer.num_entries();
+size_t entry_offset_start = buffer_len - 4 - footer_size - num_entries * 4;
+for(size_t i = 0; i < num_entries; i++) {
+size_t entry_offset = decode_fixed32_le(buffer + entry_offset_start);
+_entry_offsets.push_back(entry_offset);
 
 Review comment:
   Does entry_offsets really need to be stored in the footer? I think it can be 
constructed when parsing entries.
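
A sketch of the idea of rebuilding offsets during parsing instead of persisting them. The entry layout used here (a 4-byte little-endian length followed by the payload) is an assumption made for illustration and is not the actual index page format; `append_entry` and `build_entry_offsets` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Decodes a 32-bit little-endian integer.
inline uint32_t decode_u32_le(const uint8_t* p) {
    return static_cast<uint32_t>(p[0]) | (static_cast<uint32_t>(p[1]) << 8) |
           (static_cast<uint32_t>(p[2]) << 16) | (static_cast<uint32_t>(p[3]) << 24);
}

// Appends one length-prefixed entry (assumed layout) to a page buffer.
inline void append_entry(std::string* page, const std::string& payload) {
    uint32_t len = static_cast<uint32_t>(payload.size());
    for (int i = 0; i < 4; ++i) {
        page->push_back(static_cast<char>((len >> (8 * i)) & 0xff));
    }
    page->append(payload);
}

// Walks the entries once and records where each one starts.
// Returns false if the buffer is truncated or holds extra bytes.
inline bool build_entry_offsets(const std::string& page, std::size_t num_entries,
                                std::vector<uint32_t>* offsets) {
    const uint8_t* base = reinterpret_cast<const uint8_t*>(page.data());
    std::size_t pos = 0;
    offsets->clear();
    for (std::size_t i = 0; i < num_entries; ++i) {
        if (page.size() - pos < 4) return false;  // no room for the length
        uint32_t len = decode_u32_le(base + pos);
        if (page.size() - pos - 4 < len) return false;  // payload truncated
        offsets->push_back(static_cast<uint32_t>(pos));
        pos += 4 + len;
    }
    return pos == page.size();
}
```

The trade-off is one linear pass at parse time in exchange for a smaller page, since the offset array no longer has to be written.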





[GitHub] [incubator-doris] imay commented on a change in pull request #2319: Add Bitmap index reader

2019-11-29 Thread GitBox
imay commented on a change in pull request #2319: Add Bitmap index reader
URL: https://github.com/apache/incubator-doris/pull/2319#discussion_r351996614
 
 

 ##
 File path: be/src/olap/in_list_predicate.cpp
 ##
 @@ -109,6 +111,44 @@ IN_LIST_PRED_EVALUATE(NotInListPredicate, ==)
 IN_LIST_PRED_COLUMN_BLOCK_EVALUATE(InListPredicate, !=)
 IN_LIST_PRED_COLUMN_BLOCK_EVALUATE(NotInListPredicate, ==)
 
+#define IN_LIST_PRED_BITMAP_EVALUATE(CLASS, OP) \
+template \
+Status CLASS::evaluate(const Schema& schema, const 
vector& iterators, uint32_t num_rows, Roaring* bitmap) 
const { \
+BitmapIndexIterator *iterator = iterators[_column_id]; \
+if (iterator == nullptr) { \
+return Status::OK(); \
+} \
+if (iterator->has_null_bitmap()) { \
+Roaring null_bitmap; \
+RETURN_IF_ERROR(iterator->read_null_bitmap(&null_bitmap)); \
+*bitmap -= null_bitmap; \
+} \
+std::string op = getop(OP); \
+Roaring roaring; \
+for (auto value:_values) { \
+bool exact_match; \
+Status s = iterator->seek_dictionary(&value, &exact_match); \
+rowid_t seeked_ordinal = iterator->current_ordinal(); \
+if (!s.is_not_found()) { \
+if (!s.ok()) { return s; } \
+if (exact_match) { \
+Roaring bitmap; \
+RETURN_IF_ERROR(iterator->read_bitmap(seeked_ordinal, 
)); \
+roaring |= bitmap; \
+} \
+} \
+} \
+if (op.compare("==") == 0) { \
+*bitmap &= roaring; \
+} else { \
+*bitmap -= roaring; \
+} \
 
 Review comment:
   If OP is given as `&=` or `-=`, this can be rewritten as `*bitmap OP roaring;`
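
A minimal sketch of passing the compound operator itself as the macro argument. `TinyBitmap` is a hypothetical 64-row stand-in that offers the same `&=` and `-=` operators the macro relies on; `DEFINE_APPLY` and the generated function names are illustrative, not from the PR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 64-row bitmap standing in for Roaring.
struct TinyBitmap {
    uint64_t bits = 0;
    // Intersection: keep rows present in both.
    TinyBitmap& operator&=(const TinyBitmap& o) {
        bits &= o.bits;
        return *this;
    }
    // Difference: drop rows present in `o`.
    TinyBitmap& operator-=(const TinyBitmap& o) {
        bits &= ~o.bits;
        return *this;
    }
};

// The operator token is substituted directly, so the generated function
// needs no runtime branch on the operator.
#define DEFINE_APPLY(NAME, OP)                                        \
    inline void NAME(TinyBitmap* bitmap, const TinyBitmap& matched) { \
        *bitmap OP matched;                                           \
    }

DEFINE_APPLY(apply_in_list, &=)
DEFINE_APPLY(apply_not_in_list, -=)
```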





[GitHub] [incubator-doris] imay commented on a change in pull request #2319: Add Bitmap index reader

2019-11-29 Thread GitBox
imay commented on a change in pull request #2319: Add Bitmap index reader
URL: https://github.com/apache/incubator-doris/pull/2319#discussion_r352066968
 
 

 ##
 File path: be/src/olap/rowset/segment_v2/bitmap_index_reader.cpp
 ##
 @@ -17,10 +17,65 @@
 
 #include "olap/rowset/segment_v2/bitmap_index_reader.h"
 
+#include "olap/types.h"
+#include "runtime/mem_tracker.h"
+#include "runtime/mem_pool.h"
+
 namespace doris {
 namespace segment_v2 {
 
+Status BitmapIndexReader::load() {
+const IndexedColumnMetaPB dict_meta = _bitmap_index_meta.dict_column();
+const IndexedColumnMetaPB bitmap_meta = _bitmap_index_meta.bitmap_column();
+_has_null = _bitmap_index_meta.has_null();
+
+_dict_column_reader.reset(new IndexedColumnReader(_file, dict_meta));
+_bitmap_column_reader.reset(new IndexedColumnReader(_file, bitmap_meta));
+RETURN_IF_ERROR(_dict_column_reader->load());
+RETURN_IF_ERROR(_bitmap_column_reader->load());
+return Status::OK();
+}
+
+Status BitmapIndexReader::new_iterator(BitmapIndexIterator** iterator) {
+*iterator = new BitmapIndexIterator(this);
+return Status::OK();
+}
+
+Status BitmapIndexIterator::seek_dictionary(const void* value, bool* 
exact_match) {
+RETURN_IF_ERROR(_dict_column_iter.seek_at_or_after(value, exact_match));
+_current_rowid = _dict_column_iter.get_current_ordinal();
+return Status::OK();
+}
+
+Status BitmapIndexIterator::read_bitmap(rowid_t ordinal, Roaring* result) {
+DCHECK(0 <= ordinal && ordinal < _reader->bitmap_nums());
+
+Slice value;
+uint8_t nullmap;
+MemTracker mem_tracker;
+MemPool mem_pool(&mem_tracker);
+size_t num_to_read = 1;
+ColumnBlock block(get_type_info(OLAP_FIELD_TYPE_VARCHAR), (uint8_t*)&value, &nullmap, num_to_read, &mem_pool);
 
 Review comment:
   You can call get_type_info() once in the reader and pass the result to the 
iterator; then we avoid calling get_type_info each time.





[GitHub] [incubator-doris] imay commented on a change in pull request #2319: Add Bitmap index reader

2019-11-29 Thread GitBox
imay commented on a change in pull request #2319: Add Bitmap index reader
URL: https://github.com/apache/incubator-doris/pull/2319#discussion_r351993764
 
 

 ##
 File path: be/src/olap/column_predicate.h
 ##
 @@ -18,12 +18,18 @@
 #ifndef DORIS_BE_SRC_OLAP_COLUMN_PREDICATE_H
 #define DORIS_BE_SRC_OLAP_COLUMN_PREDICATE_H
 
+#include 
 
 Review comment:
   Does it work if this include is replaced with a forward declaration of Roaring? 
We prefer including as few headers as possible in .h files in order to reduce 
compilation time.
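
A single-file sketch of the forward-declaration pattern. `Bitmap` and `Predicate` are hypothetical names; the comments mark which part would live in the header and which in the .cpp file. This only works when the header uses the type through pointers or references, and whether it applies to `Roaring` depends on how that library declares the class.

```cpp
#include <cassert>

// --- what would live in the header (hypothetical predicate.h) ---
class Bitmap;  // forward declaration: enough for pointers and references

struct Predicate {
    int threshold;
    // Only a pointer to Bitmap appears here, so the full definition
    // is not needed by anything that includes the header.
    void evaluate(Bitmap* out) const;
};

// --- what would live in the .cpp, which includes the real definition ---
class Bitmap {
public:
    void add(int v) {
        _last = v;
        ++_count;
    }
    int count() const { return _count; }
    int last() const { return _last; }

private:
    int _count = 0;
    int _last = 0;
};

void Predicate::evaluate(Bitmap* out) const { out->add(threshold); }
```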





[GitHub] [incubator-doris] imay commented on a change in pull request #2319: Add Bitmap index reader

2019-11-29 Thread GitBox
imay commented on a change in pull request #2319: Add Bitmap index reader
URL: https://github.com/apache/incubator-doris/pull/2319#discussion_r351995667
 
 

 ##
 File path: be/src/olap/comparison_predicate.cpp
 ##
 @@ -138,6 +142,75 @@ COMPARISON_PRED_COLUMN_BLOCK_EVALUATE(LessEqualPredicate, 
<=)
 COMPARISON_PRED_COLUMN_BLOCK_EVALUATE(GreaterPredicate, >)
 COMPARISON_PRED_COLUMN_BLOCK_EVALUATE(GreaterEqualPredicate, >=)
 
+#define COMPARISON_PRED_BITMAP_EVALUATE(CLASS, OP) \
+template \
+Status CLASS::evaluate(const Schema& schema, const 
vector& iterators, uint32_t num_rows, Roaring* bitmap) 
const { \
+BitmapIndexIterator *iterator = iterators[_column_id]; \
+if (iterator == nullptr) { \
+return Status::OK(); \
+} \
+std::string op = getop(OP); \
+rowid_t ordinal_limit = iterator->bitmap_nums(); \
+if (iterator->has_null_bitmap()) { \
+ordinal_limit--; \
+Roaring null_bitmap; \
+RETURN_IF_ERROR(iterator->read_null_bitmap(&null_bitmap)); \
+*bitmap -= null_bitmap; \
+} \
+Roaring roaring; \
+bool exact_match; \
+Status s = iterator->seek_dictionary(&_value, &exact_match); \
+rowid_t seeked_ordinal = iterator->current_ordinal(); \
+if (op.compare("==") == 0) { \
 
 Review comment:
   Why is this dispatched through a string compare? I'm not sure this compare can be 
eliminated at compile time.
   I think it would be clearer and more readable to define a macro for each OP.
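
   One possible shape for this, as a sketch only: a macro body per predicate class, selected by token pasting, so nothing is compared at run time. All names here (`BITMAP_RANGE_*`, `DEFINE_RANGE_FN`, `OrdinalRange`) are hypothetical. The range arithmetic assumes the seek lands on the first dictionary entry at or after the value and reports whether it matched exactly; the real code would union the bitmaps in the returned ordinal range.

```cpp
#include <cstdint>
#include <utility>

using rowid_t = uint32_t;
// Half-open range [first, second) of dictionary ordinals whose bitmaps match.
using OrdinalRange = std::pair<rowid_t, rowid_t>;

// One body per predicate class. "seeked" is the ordinal of the first
// dictionary entry >= value, "exact" says whether that entry equals the
// value, and "limit" is the number of non-null dictionary entries.
#define BITMAP_RANGE_EqualPredicate(seeked, exact, limit) \
    ((exact) ? OrdinalRange((seeked), (seeked) + 1) : OrdinalRange(0, 0))
#define BITMAP_RANGE_LessPredicate(seeked, exact, limit) \
    OrdinalRange(0, (seeked))
#define BITMAP_RANGE_LessEqualPredicate(seeked, exact, limit) \
    OrdinalRange(0, (seeked) + ((exact) ? 1 : 0))
#define BITMAP_RANGE_GreaterPredicate(seeked, exact, limit) \
    OrdinalRange((seeked) + ((exact) ? 1 : 0), (limit))
#define BITMAP_RANGE_GreaterEqualPredicate(seeked, exact, limit) \
    OrdinalRange((seeked), (limit))

// Token pasting picks the body during preprocessing, not at run time.
#define BITMAP_RANGE(CLASS, seeked, exact, limit) \
    BITMAP_RANGE_##CLASS(seeked, exact, limit)

// Generates one small function per predicate class, e.g. LessPredicate_range.
#define DEFINE_RANGE_FN(CLASS)                                                     \
    inline OrdinalRange CLASS##_range(rowid_t seeked, bool exact, rowid_t limit) { \
        (void)exact;                                                               \
        (void)limit;                                                               \
        return BITMAP_RANGE(CLASS, seeked, exact, limit);                          \
    }

DEFINE_RANGE_FN(EqualPredicate)
DEFINE_RANGE_FN(LessPredicate)
DEFINE_RANGE_FN(LessEqualPredicate)
DEFINE_RANGE_FN(GreaterPredicate)
DEFINE_RANGE_FN(GreaterEqualPredicate)
```

   With this layout each operator's logic sits in its own short macro, and adding an operator means adding one body rather than another branch in a shared `if` chain.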





[GitHub] [incubator-doris] imay commented on a change in pull request #2319: Add Bitmap index reader

2019-11-29 Thread GitBox
imay commented on a change in pull request #2319: Add Bitmap index reader
URL: https://github.com/apache/incubator-doris/pull/2319#discussion_r351574128
 
 

 ##
 File path: gensrc/proto/segment_v2.proto
 ##
 @@ -220,3 +220,6 @@ message BitmapIndexColumnPB {
 optional BitmapType bitmap_type = 6 [default=ROARING_BITMAP];
 }
 
+message BitmapIndexFileFooterPB {
 
 Review comment:
   Add some comments for this PB and its content.





[GitHub] [incubator-doris] imay commented on a change in pull request #2319: Add Bitmap index reader

2019-11-29 Thread GitBox
imay commented on a change in pull request #2319: Add Bitmap index reader
URL: https://github.com/apache/incubator-doris/pull/2319#discussion_r351994135
 
 

 ##
 File path: be/src/olap/column_predicate.h
 ##
 @@ -18,12 +18,18 @@
 #ifndef DORIS_BE_SRC_OLAP_COLUMN_PREDICATE_H
 #define DORIS_BE_SRC_OLAP_COLUMN_PREDICATE_H
 
+#include 
+
 #include "olap/column_block.h"
 #include "olap/selection_vector.h"
+#include "olap/rowset/segment_v2/bitmap_index_reader.h"
 
 Review comment:
   Forward declare





[GitHub] [incubator-doris] imay commented on a change in pull request #2319: Add Bitmap index reader

2019-11-29 Thread GitBox
imay commented on a change in pull request #2319: Add Bitmap index reader
URL: https://github.com/apache/incubator-doris/pull/2319#discussion_r352065945
 
 

 ##
 File path: be/src/olap/rowset/segment_v2/bitmap_index_reader.cpp
 ##
 @@ -17,10 +17,65 @@
 
 #include "olap/rowset/segment_v2/bitmap_index_reader.h"
 
+#include "olap/types.h"
+#include "runtime/mem_tracker.h"
+#include "runtime/mem_pool.h"
+
 namespace doris {
 namespace segment_v2 {
 
+Status BitmapIndexReader::load() {
+const IndexedColumnMetaPB dict_meta = _bitmap_index_meta.dict_column();
+const IndexedColumnMetaPB bitmap_meta = _bitmap_index_meta.bitmap_column();
 
 Review comment:
   ```suggestion
   const IndexedColumnMetaPB& bitmap_meta = _bitmap_index_meta.bitmap_column();
   ```
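
   To illustrate why the reference matters: protobuf accessors for message fields return a const reference, so binding the result to a by-value local makes a copy of the whole message. The sketch below uses hypothetical stand-in types (`MetaSketch`, `IndexMetaSketch`) with a copy counter instead of the real generated classes.

```cpp
// Stand-in for a generated message type; counts copy constructions.
struct MetaSketch {
    static int copies;
    MetaSketch() = default;
    MetaSketch(const MetaSketch&) { ++copies; }
};
int MetaSketch::copies = 0;

// Stand-in for the parent message; the accessor returns a const reference,
// mirroring the shape of a generated accessor for a message field.
struct IndexMetaSketch {
    MetaSketch column;
    const MetaSketch& bitmap_column() const { return column; }
};

// Returns how many copies were made when the local is declared by value.
inline int copies_by_value(const IndexMetaSketch& m) {
    int before = MetaSketch::copies;
    const MetaSketch meta = m.bitmap_column();  // copy-constructs the message
    (void)meta;
    return MetaSketch::copies - before;
}

// Returns how many copies were made when the local is a const reference.
inline int copies_by_reference(const IndexMetaSketch& m) {
    int before = MetaSketch::copies;
    const MetaSketch& meta = m.bitmap_column();  // binds to the existing object
    (void)meta;
    return MetaSketch::copies - before;
}
```

   The reference version stays valid only while the parent message is alive, which holds here because the parent is a member of the reader.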





[GitHub] [incubator-doris] imay commented on a change in pull request #2319: Add Bitmap index reader

2019-11-29 Thread GitBox
imay commented on a change in pull request #2319: Add Bitmap index reader
URL: https://github.com/apache/incubator-doris/pull/2319#discussion_r352058362
 
 

 ##
 File path: be/src/olap/rowset/segment_v2/indexed_column_reader.cpp
 ##
 @@ -17,55 +17,254 @@
 
 #include "olap/rowset/segment_v2/indexed_column_reader.h"
 
+#include "env/env.h" // for RandomAccessFile
+#include "gutil/strings/substitute.h" // for Substitute
+#include "olap/key_coder.h"
+#include "olap/rowset/segment_v2/encoding_info.h" // for EncodingInfo
+#include "olap/rowset/segment_v2/index_page.h" // for IndexPageReader
+#include "olap/rowset/segment_v2/options.h" // for PageDecoderOptions
+#include "olap/rowset/segment_v2/page_compression.h"
+#include "olap/rowset/segment_v2/page_decoder.h" // for PagePointer
+#include "util/crc32c.h"
+#include "util/rle_encoding.h" // for RleDecoder
+
 namespace doris {
 namespace segment_v2 {
 
-IndexedColumnReader::IndexedColumnReader(RandomAccessFile* file) {
-// TODO
-}
-Status IndexedColumnReader::init() {
-return Status(); // TODO
-}
-Status 
IndexedColumnReader::new_iterator(std::unique_ptr* iter) 
{
-return Status(); // TODO
-}
-Status IndexedColumnReader::read_page(const PagePointer& pp, PageHandle* ret) {
-return Status(); // TODO
-}
-const IndexedColumnMetaPB& IndexedColumnReader::meta() const {
-static IndexedColumnMetaPB temp;
-return temp; // TODO
-}
-bool IndexedColumnReader::has_ordinal_index() const {
-return false; // TODO
+using strings::Substitute;
+
+Status IndexedColumnReader::load() {
+_type_info = get_type_info((FieldType)_meta.data_type());
+if (_type_info == nullptr) {
+return Status::NotSupported(Substitute("unsupported typeinfo, 
type=$0", _meta.data_type()));
+}
+RETURN_IF_ERROR(EncodingInfo::get(_type_info, _meta.encoding(), 
&_encoding_info));
+RETURN_IF_ERROR(get_block_compression_codec(_meta.compression(), 
&_compress_codec));
+_validx_key_coder = get_key_coder(_type_info->type());
+
+// read and parse ordinal index page when exists
+if (_meta.has_ordinal_index_meta()) {
+if (_meta.ordinal_index_meta().is_root_data_page()) {
+_sole_data_page = 
PagePointer(_meta.ordinal_index_meta().root_page());
+} else {
+RETURN_IF_ERROR(read_page(_meta.ordinal_index_meta().root_page(), 
&_ordinal_index_page_handle));
+
RETURN_IF_ERROR(_ordinal_index_reader.parse(_ordinal_index_page_handle.data()));
+_has_index_page = true;
+}
+}
+
+// read and parse value index page when exists
+if (_meta.has_value_index_meta()) {
+if (_meta.value_index_meta().is_root_data_page()) {
+_sole_data_page = 
PagePointer(_meta.value_index_meta().root_page());
+} else {
+RETURN_IF_ERROR(read_page(_meta.value_index_meta().root_page(), 
&_value_index_page_handle));
+
RETURN_IF_ERROR(_value_index_reader.parse(_value_index_page_handle.data()));
+_has_index_page = true;
+}
+}
+_num_values = _meta.num_values();
+return Status::OK();
 }
-bool IndexedColumnReader::has_value_index() const {
-return false; // TODO
+
+Status IndexedColumnReader::read_page(const PagePointer& pp, PageHandle* 
handle) const {
+auto cache = StoragePageCache::instance();
+PageCacheHandle cache_handle;
+StoragePageCache::CacheKey cache_key(_file->file_name(), pp.offset);
+if (cache->lookup(cache_key, &cache_handle)) {
+// we find page in cache, use it
+*handle = PageHandle(std::move(cache_handle));
+return Status::OK();
+}
+// Now we read this from file.
+size_t page_size = pp.size;
+if (page_size < sizeof(uint32_t)) {
+return Status::Corruption(Substitute("Bad page, page size is too 
small, size=$0", page_size));
+}
+
+// Now we use this buffer to store page from storage, if this page is 
compressed
+// this buffer will assigned uncompressed page, and origin content will be 
freed.
+std::unique_ptr<uint8_t[]> page(new uint8_t[page_size]);
+Slice page_slice(page.get(), page_size);
+RETURN_IF_ERROR(_file->read_at(pp.offset, page_slice));
+
+size_t data_size = page_size - 4;
+if (_verify_checksum) {
+uint32_t expect = decode_fixed32_le((uint8_t*)page_slice.data + 
page_slice.size - 4);
+uint32_t actual = crc32c::Value(page_slice.data, page_slice.size - 4);
+if (expect != actual) {
+return Status::Corruption(
+Substitute("Page checksum mismatch, actual=$0 vs expect=$1", 
actual, expect));
+}
+}
+
+// remove page's suffix
+page_slice.size = data_size;
+if (_compress_codec != nullptr) {
+PageDecompressor decompressor(page_slice, _compress_codec);
+
+Slice uncompressed_page;
+RETURN_IF_ERROR(decompressor.decompress_to(&uncompressed_page));
+
+// If decompressor create new heap memory for uncompressed data,
+// assign 

[GitHub] [incubator-doris] imay commented on a change in pull request #2319: Add Bitmap index reader

2019-11-29 Thread GitBox
imay commented on a change in pull request #2319: Add Bitmap index reader
URL: https://github.com/apache/incubator-doris/pull/2319#discussion_r351996356
 
 

 ##
 File path: be/src/olap/field.h
 ##
 @@ -226,6 +226,10 @@ class Field {
 void encode_ascending(const void* value, std::string* buf) const {
 _key_coder->encode_ascending(value, _index_size, buf);
 }
+
+void full_encode_ascending(const void* value, std::string* buf) const {
 
 Review comment:
   Better to add a comment that makes clear how this differs from `encode_ascending`.





[GitHub] [incubator-doris] imay commented on a change in pull request #2319: Add Bitmap index reader

2019-11-29 Thread GitBox
imay commented on a change in pull request #2319: Add Bitmap index reader
URL: https://github.com/apache/incubator-doris/pull/2319#discussion_r351992549
 
 

 ##
 File path: be/src/olap/rowset/segment_v2/row_ranges.h
 ##
 @@ -197,6 +199,14 @@ class RowRanges {
 *result = std::move(tmp_range);
 }
 
+static Roaring ranges_to_roaring(const RowRanges ) {
 
 Review comment:
   ```suggestion
   static Roaring ranges_to_roaring(const RowRanges& ranges) {
   ```





[GitHub] [incubator-doris] imay commented on a change in pull request #2319: Add Bitmap index reader

2019-11-29 Thread GitBox
imay commented on a change in pull request #2319: Add Bitmap index reader
URL: https://github.com/apache/incubator-doris/pull/2319#discussion_r351991539
 
 

 ##
 File path: be/src/olap/rowset/segment_v2/segment_writer.cpp
 ##
 @@ -63,6 +63,9 @@ Status SegmentWriter::init(uint32_t write_mbytes_per_sec) {
 opts.need_zone_map = true;
 }
 // TODO set opts.need_bitmap_index based on table properties
+if (!column.is_key()) {
 
 Review comment:
   I'm not sure about this condition.
   In my opinion, a bitmap index can be created on both key and value columns in the duplicate model.





[GitHub] [incubator-doris] imay commented on a change in pull request #2319: Add Bitmap index reader

2019-11-29 Thread GitBox
imay commented on a change in pull request #2319: Add Bitmap index reader
URL: https://github.com/apache/incubator-doris/pull/2319#discussion_r352067869
 
 

 ##
 File path: be/src/olap/rowset/segment_v2/bitmap_index_reader.cpp
 ##
 @@ -17,10 +17,65 @@
 
 #include "olap/rowset/segment_v2/bitmap_index_reader.h"
 
+#include "olap/types.h"
+#include "runtime/mem_tracker.h"
+#include "runtime/mem_pool.h"
+
 namespace doris {
 namespace segment_v2 {
 
+Status BitmapIndexReader::load() {
+const IndexedColumnMetaPB dict_meta = _bitmap_index_meta.dict_column();
+const IndexedColumnMetaPB bitmap_meta = _bitmap_index_meta.bitmap_column();
+_has_null = _bitmap_index_meta.has_null();
+
+_dict_column_reader.reset(new IndexedColumnReader(_file, dict_meta));
+_bitmap_column_reader.reset(new IndexedColumnReader(_file, bitmap_meta));
+RETURN_IF_ERROR(_dict_column_reader->load());
+RETURN_IF_ERROR(_bitmap_column_reader->load());
+return Status::OK();
+}
+
+Status BitmapIndexReader::new_iterator(BitmapIndexIterator** iterator) {
+*iterator = new BitmapIndexIterator(this);
+return Status::OK();
+}
+
+Status BitmapIndexIterator::seek_dictionary(const void* value, bool* 
exact_match) {
+RETURN_IF_ERROR(_dict_column_iter.seek_at_or_after(value, exact_match));
+_current_rowid = _dict_column_iter.get_current_ordinal();
+return Status::OK();
+}
+
+Status BitmapIndexIterator::read_bitmap(rowid_t ordinal, Roaring* result) {
+DCHECK(0 <= ordinal && ordinal < _reader->bitmap_nums());
+
+Slice value;
+uint8_t nullmap;
+MemTracker mem_tracker;
+MemPool mem_pool(&mem_tracker);
+size_t num_to_read = 1;
+ColumnBlock block(get_type_info(OLAP_FIELD_TYPE_VARCHAR), (uint8_t*)&value, 
&nullmap, num_to_read, &mem_pool);
+ColumnBlockView column_block_view(&block);
+
+RETURN_IF_ERROR(_bitmap_column_iter.seek_to_ordinal(ordinal));
+size_t num_read = num_to_read;
+RETURN_IF_ERROR(_bitmap_column_iter.next_batch(&num_read, 
&column_block_view));
+DCHECK(num_to_read == num_read);
+*result = Roaring::read(value.data, false);
+return Status::OK();
+}
+
+Status BitmapIndexIterator::read_union_bitmap(rowid_t from, rowid_t to, 
Roaring* result) {
+DCHECK(0 <= from && from <= to && to <= _reader->bitmap_nums());
 
+for (rowid_t pos = from; pos < to; pos++) {
+Roaring bitmap;
+RETURN_IF_ERROR(read_bitmap(pos, &bitmap));
 
 Review comment:
   It seems that we can read all the bitmaps in one read.
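
   A rough sketch of what a single batched read could look like: seek once to `from`, ask for `to - from` values in one `next_batch` call, then union the results. `ColumnIterSketch` and `BitmapSketch` are hypothetical stand-ins (a `std::set` plays the role of the bitmap), so this only shows the control flow, not the real iterator API.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

using rowid_t = uint32_t;
using BitmapSketch = std::set<uint32_t>;  // stand-in for a Roaring bitmap

// Hypothetical column iterator that hands out stored bitmaps in batches.
class ColumnIterSketch {
public:
    explicit ColumnIterSketch(std::vector<BitmapSketch> values)
            : _values(std::move(values)) {}

    // Positions the iterator; counted so callers can see how often it seeks.
    void seek_to_ordinal(rowid_t ordinal) {
        _pos = std::min<size_t>(ordinal, _values.size());
        ++seeks;
    }

    // Reads up to *n values from the current position into dst and sets *n
    // to the number of values actually read.
    void next_batch(size_t* n, std::vector<BitmapSketch>* dst) {
        *n = std::min(*n, _values.size() - _pos);
        dst->assign(_values.begin() + _pos, _values.begin() + _pos + *n);
        _pos += *n;
    }

    int seeks = 0;

private:
    std::vector<BitmapSketch> _values;
    size_t _pos = 0;
};

// Unions the bitmaps at ordinals [from, to) with one seek and one batch read.
inline BitmapSketch read_union(ColumnIterSketch* iter, rowid_t from, rowid_t to) {
    BitmapSketch result;
    std::vector<BitmapSketch> batch;
    size_t n = to - from;
    iter->seek_to_ordinal(from);
    iter->next_batch(&n, &batch);
    for (const BitmapSketch& b : batch) {
        result.insert(b.begin(), b.end());
    }
    return result;
}
```

   Compared with calling `read_bitmap` once per ordinal, this pays for one seek instead of `to - from` seeks; whether the real iterator can return that many values in one call is an assumption here.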

