[GitHub] [incubator-doris] wangbo commented on a change in pull request #1944: Segment v2 use string's real length(#1943)
wangbo commented on a change in pull request #1944: Segment v2 use string's real length(#1943) URL: https://github.com/apache/incubator-doris/pull/1944#discussion_r333807895 ## File path: be/src/olap/field.h ## @@ -55,9 +55,11 @@ class Field { inline size_t field_size() const { return size() + 1; } inline size_t index_size() const { return _index_size; } -inline void set_to_max(char* buf) const { return _type_info->set_to_max(buf); } +virtual inline void set_to_max(char* buf) const { return _type_info->set_to_max(buf); } +//todo delete this method when segment v2 release +inline void set_to_max_v1(char* buf) const { return _type_info->set_to_max(buf); } Review comment: implement it the same with set_to_max means: when calling varchar field.set_to_max,using field._length to memset slice's data. but for segment v1, field._length is regarded as slice.size+sizeof(StringLengthType) So when calling segment v1 varchar field.set_max,the length of memset is beyond slice's actually size,this may cause subsequently tcmalloc core dumped.see error stack below: >#0 SLL_Next (t=0x) at src/linked_list.h:45 >#1 SLL_TryPop (rv=, list=0x436dce0) at src/linked_list.h:69 >#2 TryPop (rv=, this=0x436dce0) at src/thread_cache.h:220 >#3 Allocate (oom_handler=0x25e5700 , cl=5, size=64, this=) at src/thread_cache.h:379 >#4 malloc_fast_path (size=64) at src/tcmalloc.cc:1855 >#5 tc_new (size=64) at src/tcmalloc.cc:1976 >#6 0x00c21672 in doris::WrapperField::WrapperField (this=0x4e07d40, rep=, variable_len=, >is_string_type=) at /home/wangbo/incubator-doris/be/src/olap/wrapper_field.cpp:84 >#7 0x00c21b9e in doris::WrapperField::create (column=..., len=, len@entry=0) >at /home/wangbo/incubator-doris/be/src/olap/wrapper_field.cpp:51 >#8 0x00cb88e1 in doris::ColumnDataWriter::init (this=0x4bc4780) at /home/wangbo/incubator-doris/be/src/olap/rowset/column_data_writer.cpp:72 >#9 0x00ca273c in doris::AlphaRowsetWriter::_init (this=this@entry=0x4bf83c0) >at /home/wangbo/incubator-doris/be/src/olap/rowset/alpha_rowset_writer.cpp:275 >#10 0x00ca3694 in doris::AlphaRowsetWriter::init (this=0x4bf83c0, rowset_writer_context=...) >at /home/wangbo/incubator-doris/be/src/olap/rowset/alpha_rowset_writer.cpp:73 >#11 0x00c91b03 in doris::RowsetFactory::create_rowset_writer (context=..., output=output@entry=0x7fffcdf75318) >at /home/wangbo/incubator-doris/be/src/olap/rowset/rowset_factory.cpp:50 >#12 0x00bc9b3f in doris::TabletManager::_create_inital_rowset (this=this@entry=0x4d5a180, tablet=..., request=...) >at /home/wangbo/incubator-doris/be/src/olap/tablet_manager.cpp:1250 >#13 0x00bcc919 in doris::TabletManager::_internal_create_tablet (this=this@entry=0x4d5a180, alter_type=alter_type@entry=doris::SCHEMA_CHANGE, >request=..., is_schema_change_tablet=is_schema_change_tablet@entry=false, ref_tablet=..., data_dirs=...) >at /home/wangbo/incubator-doris/be/src/olap/tablet_manager.cpp:371 >#14 0x00bcdd72 in doris::TabletManager::create_tablet (this=this@entry=0x4d5a180, request=..., stores=...) >at /home/wangbo/incubator-doris/be/src/olap/tablet_manager.cpp:318 >#15 0x00b82642 in doris::StorageEngine::create_tablet (this=, request=...) >at /home/wangbo/incubator-doris/be/src/olap/storage_engine.cpp:760 >#16 0x00b4fac3 in doris::TestDeleteConditionHandler::SetUp (this=0x4e08460) >at /home/wangbo/incubator-doris/be/test/olap/delete_handler_test.cpp:173 >#17 0x02530f37 in void testing::internal::HandleSehExceptionsInMethodIfSupported(testing::Test*, void (testing::Test::*)(), char const*) () >#18 0x0252ba9c in void testing::internal::HandleExceptionsInMethodIfSupported(testing::Test*, void (testing::Test::*)(), char const*) () >#19 0x02510df7 in testing::Test::Run() () >#20 0x025116ce in testing::TestInfo::Run() () >#21 0x02511d25 in testing::TestCase::Run() () >#22 0x02518899 in testing::internal::UnitTestImpl::RunAllTests() () >#23 0x02531eb3 in bool testing::internal::HandleSehExceptionsInMethodIfSupported(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) () >#24 0x0252c7a2 in bool testing::internal::HandleExceptionsInMethodIfSupported(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) () >#25 0x02517570 in testing::UnitTest::Run() () >#26 0x009fb316 in RUN_ALL_TESTS () at /home/wangbo/incubator-doris/thirdparty/installed/include/gtest/gtest.h:2233 >#27 main (argc=, argv=0x7fffcdf75f88) at /home/wangbo/incubator-doris/be/test/olap/delete_handler_test.cpp:951 it seems that memset memory to 0xff may effect tcmalloc
[GitHub] [incubator-doris] chaoyli commented on a change in pull request #1950: Refactor txn manager methods
chaoyli commented on a change in pull request #1950: Refactor txn manager methods URL: https://github.com/apache/incubator-doris/pull/1950#discussion_r333824806 ## File path: be/src/olap/txn_manager.cpp ## @@ -122,7 +149,7 @@ OLAPStatus TxnManager::prepare_txn( OLAPStatus TxnManager::commit_txn( OlapMeta* meta, TPartitionId partition_id, TTransactionId transaction_id, TTabletId tablet_id, SchemaHash schema_hash, TabletUid tablet_uid, -const PUniqueId& load_id, RowsetSharedPtr rowset_ptr, bool is_recovery) { +const PUniqueId& load_id, RowsetSharedPtr& rowset_ptr, bool is_recovery) { Review comment: const RowsetSharedPtr& This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] chaoyli commented on a change in pull request #1950: Refactor txn manager methods
chaoyli commented on a change in pull request #1950: Refactor txn manager methods URL: https://github.com/apache/incubator-doris/pull/1950#discussion_r333825032 ## File path: be/src/olap/txn_manager.cpp ## @@ -76,6 +76,33 @@ TxnManager::TxnManager() { } } +OLAPStatus TxnManager::prepare_txn(TPartitionId partition_id, const TabletSharedPtr& tablet, TTransactionId transaction_id, Review comment: 1. May two prepare_txn() merged into one mothod? 2. partition_id can also get from tablet. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] chaoyli commented on a change in pull request #1950: Refactor txn manager methods
chaoyli commented on a change in pull request #1950: Refactor txn manager methods URL: https://github.com/apache/incubator-doris/pull/1950#discussion_r333824839 ## File path: be/src/olap/txn_manager.cpp ## @@ -211,7 +238,7 @@ OLAPStatus TxnManager::commit_txn( // remove a txn from txn manager OLAPStatus TxnManager::publish_txn(OlapMeta* meta, TPartitionId partition_id, TTransactionId transaction_id, TTabletId tablet_id, SchemaHash schema_hash, TabletUid tablet_uid, - Version& version, VersionHash& version_hash) { + const Version& version, VersionHash version_hash) { pair key(partition_id, transaction_id); Review comment: const VersionHash& This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] chaoyli commented on a change in pull request #1950: Refactor txn manager methods
chaoyli commented on a change in pull request #1950: Refactor txn manager methods URL: https://github.com/apache/incubator-doris/pull/1950#discussion_r333824873 ## File path: be/src/olap/txn_manager.h ## @@ -80,14 +97,14 @@ class TxnManager { OLAPStatus commit_txn(OlapMeta* meta, TPartitionId partition_id, TTransactionId transaction_id, TTabletId tablet_id, SchemaHash schema_hash, TabletUid tablet_uid, - const PUniqueId& load_id, RowsetSharedPtr rowset_ptr, + const PUniqueId& load_id, RowsetSharedPtr& rowset_ptr, bool is_recovery); // remove a txn from txn manager // not persist rowset meta because OLAPStatus publish_txn(OlapMeta* meta, TPartitionId partition_id, TTransactionId transaction_id, TTabletId tablet_id, SchemaHash schema_hash, TabletUid tablet_uid, - Version& version, VersionHash& version_hash); + const Version& version, VersionHash version_hash); Review comment: const VersionHash& This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] yangzhg commented on issue #1952: add create_time to information_schema.tables (issue #1909)
yangzhg commented on issue #1952: add create_time to information_schema.tables (issue #1909) URL: https://github.com/apache/incubator-doris/pull/1952#issuecomment-540892067 > @yangzhg > Hi, can you split this PR into two PRs? one for adding create_time and one for compile error? removed compile error code This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] imay commented on issue #1952: add create_time to information_schema.tables (issue #1909)
imay commented on issue #1952: add create_time to information_schema.tables (issue #1909) URL: https://github.com/apache/incubator-doris/pull/1952#issuecomment-540884385 @yangzhg Hi, can you split this PR into two PRs? one for adding create_time and one for compile error? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] yangzhg opened a new pull request #1952: add create_time to information_schema.tables (issue #1909)
yangzhg opened a new pull request #1952: add create_time to information_schema.tables (issue #1909) URL: https://github.com/apache/incubator-doris/pull/1952 Add create_time to information_schema.tables (issue #1909) - be/src/exec/schema_scanner/schema_tables_scanner.cpp - fe/src/main/java/org/apache/doris/catalog/Table.java - fe/src/main/java/org/apache/doris/common/FeConstants.java - fe/src/main/java/org/apache/doris/common/FeMetaVersion.java - fe/src/main/java/org/apache/doris/service/FrontendServiceImpl.java - fe/src/test/java/org/apache/doris/catalog/TableTest.java - gensrc/thrift/FrontendService.thrift Fix some warning when compile type is debug using -Werror flag(because of use deprecated funcrtions)files as follow - be/src/olap/data_dir.cpp - be/src/util/disk_info.cpp - be/src/agent/cgroups_mgr.cpp This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] wangbo commented on a change in pull request #1944: Segment v2 use string's real length(#1943)
wangbo commented on a change in pull request #1944: Segment v2 use string's real length(#1943) URL: https://github.com/apache/incubator-doris/pull/1944#discussion_r333808141 ## File path: be/src/olap/rowset/segment_v2/column_writer.cpp ## @@ -72,13 +72,13 @@ class NullBitmapBuilder { }; ColumnWriter::ColumnWriter(const ColumnWriterOptions& opts, - const TypeInfo* typeinfo, + Field* field, Review comment: 👌 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] wangbo commented on a change in pull request #1944: Segment v2 use string's real length(#1943)
wangbo commented on a change in pull request #1944: Segment v2 use string's real length(#1943) URL: https://github.com/apache/incubator-doris/pull/1944#discussion_r333808109 ## File path: be/src/olap/rowset/segment_v2/segment_writer.cpp ## @@ -55,17 +55,17 @@ Status SegmentWriter::init(uint32_t write_mbytes_per_sec) { bool is_nullable = column.is_nullable(); column_meta->set_is_nullable(is_nullable); -// TODO(zc): we can add type_info into TabletColumn? -const TypeInfo* type_info = get_type_info(column.type()); -DCHECK(type_info != nullptr); - ColumnWriterOptions opts; opts.compression_type = segment_v2::CompressionTypePB::LZ4F; // now we create zone map for key columns if (column.is_key()) { opts.need_zone_map = true; } -std::unique_ptr writer(new ColumnWriter(opts, type_info, is_nullable, _output_file.get())); + +Field* field = FieldFactory::create(column); +DCHECK(field != nullptr); + +std::unique_ptr writer(new ColumnWriter(opts, field, is_nullable, _output_file.get())); Review comment: 👌 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] wangbo commented on a change in pull request #1944: Segment v2 use string's real length(#1943)
wangbo commented on a change in pull request #1944: Segment v2 use string's real length(#1943) URL: https://github.com/apache/incubator-doris/pull/1944#discussion_r333808062 ## File path: be/src/olap/rowset/segment_v2/segment_writer.cpp ## @@ -55,17 +55,17 @@ Status SegmentWriter::init(uint32_t write_mbytes_per_sec) { bool is_nullable = column.is_nullable(); column_meta->set_is_nullable(is_nullable); -// TODO(zc): we can add type_info into TabletColumn? -const TypeInfo* type_info = get_type_info(column.type()); -DCHECK(type_info != nullptr); - ColumnWriterOptions opts; opts.compression_type = segment_v2::CompressionTypePB::LZ4F; // now we create zone map for key columns if (column.is_key()) { opts.need_zone_map = true; } -std::unique_ptr writer(new ColumnWriter(opts, type_info, is_nullable, _output_file.get())); + +Field* field = FieldFactory::create(column); Review comment: 👌 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] wangbo commented on a change in pull request #1944: Segment v2 use string's real length(#1943)
wangbo commented on a change in pull request #1944: Segment v2 use string's real length(#1943) URL: https://github.com/apache/incubator-doris/pull/1944#discussion_r333807936 ## File path: be/src/olap/field.h ## @@ -55,9 +55,11 @@ class Field { inline size_t field_size() const { return size() + 1; } inline size_t index_size() const { return _index_size; } -inline void set_to_max(char* buf) const { return _type_info->set_to_max(buf); } +virtual inline void set_to_max(char* buf) const { return _type_info->set_to_max(buf); } +//todo delete this method when segment v2 release Review comment: 👌 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] wangbo commented on a change in pull request #1944: Segment v2 use string's real length(#1943)
wangbo commented on a change in pull request #1944: Segment v2 use string's real length(#1943) URL: https://github.com/apache/incubator-doris/pull/1944#discussion_r333807981 ## File path: be/src/olap/field.h ## @@ -400,6 +411,16 @@ class VarcharField: public Field { VarcharField* clone() const override { return new VarcharField(*this); } + +char* allocate_value_from_arena(Arena* arena) const { Review comment: 👌 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] wangbo commented on a change in pull request #1944: Segment v2 use string's real length(#1943)
wangbo commented on a change in pull request #1944: Segment v2 use string's real length(#1943) URL: https://github.com/apache/incubator-doris/pull/1944#discussion_r333807995 ## File path: be/src/olap/field.h ## @@ -400,6 +411,16 @@ class VarcharField: public Field { VarcharField* clone() const override { return new VarcharField(*this); } + +char* allocate_value_from_arena(Arena* arena) const { +return Field::allocate_string_value_from_arena(arena); +} + +void set_to_max(char* ch) { Review comment: 👌 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] wangbo commented on a change in pull request #1944: Segment v2 use string's real length(#1943)
wangbo commented on a change in pull request #1944: Segment v2 use string's real length(#1943) URL: https://github.com/apache/incubator-doris/pull/1944#discussion_r333807895 ## File path: be/src/olap/field.h ## @@ -55,9 +55,11 @@ class Field { inline size_t field_size() const { return size() + 1; } inline size_t index_size() const { return _index_size; } -inline void set_to_max(char* buf) const { return _type_info->set_to_max(buf); } +virtual inline void set_to_max(char* buf) const { return _type_info->set_to_max(buf); } +//todo delete this method when segment v2 release +inline void set_to_max_v1(char* buf) const { return _type_info->set_to_max(buf); } Review comment: implement it the same with set_to_max means: when calling varchar field.set_to_max,using field._length to memset slice's data. but for segment v1, field._length is regarded as slice.size+sizeof(StringLengthType) So when calling segment v1 varchar field.set_max,the length of memset is beyond slice's actually size,this may cause subsequently tcmalloc core dumped.see error stack below: >#0 SLL_Next (t=0x) at src/linked_list.h:45 >#1 SLL_TryPop (rv=, list=0x436dce0) at src/linked_list.h:69 >#2 TryPop (rv=, this=0x436dce0) at src/thread_cache.h:220 >#3 Allocate (oom_handler=0x25e5700 , cl=5, size=64, this=) at src/thread_cache.h:379 >#4 malloc_fast_path (size=64) at src/tcmalloc.cc:1855 >#5 tc_new (size=64) at src/tcmalloc.cc:1976 >#6 0x00c21672 in doris::WrapperField::WrapperField (this=0x4e07d40, rep=, variable_len=, >is_string_type=) at /home/wangbo/incubator-doris/be/src/olap/wrapper_field.cpp:84 >#7 0x00c21b9e in doris::WrapperField::create (column=..., len=, len@entry=0) >at /home/wangbo/incubator-doris/be/src/olap/wrapper_field.cpp:51 >#8 0x00cb88e1 in doris::ColumnDataWriter::init (this=0x4bc4780) at /home/wangbo/incubator-doris/be/src/olap/rowset/column_data_writer.cpp:72 >#9 0x00ca273c in doris::AlphaRowsetWriter::_init (this=this@entry=0x4bf83c0) >at /home/wangbo/incubator-doris/be/src/olap/rowset/alpha_rowset_writer.cpp:275 >#10 0x00ca3694 in doris::AlphaRowsetWriter::init (this=0x4bf83c0, rowset_writer_context=...) >at /home/wangbo/incubator-doris/be/src/olap/rowset/alpha_rowset_writer.cpp:73 >#11 0x00c91b03 in doris::RowsetFactory::create_rowset_writer (context=..., output=output@entry=0x7fffcdf75318) >at /home/wangbo/incubator-doris/be/src/olap/rowset/rowset_factory.cpp:50 >#12 0x00bc9b3f in doris::TabletManager::_create_inital_rowset (this=this@entry=0x4d5a180, tablet=..., request=...) >at /home/wangbo/incubator-doris/be/src/olap/tablet_manager.cpp:1250 >#13 0x00bcc919 in doris::TabletManager::_internal_create_tablet (this=this@entry=0x4d5a180, alter_type=alter_type@entry=doris::SCHEMA_CHANGE, >request=..., is_schema_change_tablet=is_schema_change_tablet@entry=false, ref_tablet=..., data_dirs=...) >at /home/wangbo/incubator-doris/be/src/olap/tablet_manager.cpp:371 >#14 0x00bcdd72 in doris::TabletManager::create_tablet (this=this@entry=0x4d5a180, request=..., stores=...) >at /home/wangbo/incubator-doris/be/src/olap/tablet_manager.cpp:318 >#15 0x00b82642 in doris::StorageEngine::create_tablet (this=, request=...) >at /home/wangbo/incubator-doris/be/src/olap/storage_engine.cpp:760 >#16 0x00b4fac3 in doris::TestDeleteConditionHandler::SetUp (this=0x4e08460) >at /home/wangbo/incubator-doris/be/test/olap/delete_handler_test.cpp:173 >#17 0x02530f37 in void testing::internal::HandleSehExceptionsInMethodIfSupported(testing::Test*, void (testing::Test::*)(), char const*) () >#18 0x0252ba9c in void testing::internal::HandleExceptionsInMethodIfSupported(testing::Test*, void (testing::Test::*)(), char const*) () >#19 0x02510df7 in testing::Test::Run() () >#20 0x025116ce in testing::TestInfo::Run() () >#21 0x02511d25 in testing::TestCase::Run() () >#22 0x02518899 in testing::internal::UnitTestImpl::RunAllTests() () >#23 0x02531eb3 in bool testing::internal::HandleSehExceptionsInMethodIfSupported(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) () >#24 0x0252c7a2 in bool testing::internal::HandleExceptionsInMethodIfSupported(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) () >#25 0x02517570 in testing::UnitTest::Run() () >#26 0x009fb316 in RUN_ALL_TESTS () at /home/wangbo/incubator-doris/thirdparty/installed/include/gtest/gtest.h:2233 >#27 main (argc=, argv=0x7fffcdf75f88) at /home/wangbo/incubator-doris/be/test/olap/delete_handler_test.cpp:951 it seems that memset memory to 0xff may effect tcmalloc
[GitHub] [incubator-doris] morningman opened a new issue #1951: Limit the memory usage of Loading process
morningman opened a new issue #1951: Limit the memory usage of Loading process URL: https://github.com/apache/incubator-doris/issues/1951 ## Motivation Current load framework using memtable to receive incoming loading data and flush to disk when reaching limit (default is 100MB). Each tablet corresponds to a memtable, so that if there are many tablets on a Backend, and the loading data is distributed evenly on each of these tablets, the total memory consumption can be very large because all memtables will no be flushed until they reaching the 100MB. For example, if there are 100 tablets on a Backend, the peak memory consumption can be 10GB(100 * 100MB), and this may cause process killed by system OOM. ## How to resolve There will be 2 levels of memory limit for loading process. One is loading process-level, and one is for Backend-level. 1. loading process-level limit User can set a memory limit for each loading process. And this limit will limit the memory consumption of a loading process on one Backend. From the aspect of code implements, we use a MemTracker to track the memory consumption of a loading process on a Backend. If memory reaches the limit. It will find the largest memtable can flush it to disk to reduce the memory consumption. 2. Backend-level limit Each Backend should has a total loading memory limit. This will limit total memory consumption of all loading processes on this Backend. And if this limit is reached, Load Mgr will pick a loading process with largest memory consumption and try to flush its memtable. ## Configuration For loading process-level limit, the default is 2GB, and user can it by session variable or load property. For Backend-level limit, the default is 60% of physical memory and max to 20GB. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] yiguolei opened a new pull request #1950: Refactor txn manager methods
yiguolei opened a new pull request #1950: Refactor txn manager methods URL: https://github.com/apache/incubator-doris/pull/1950 Most methods in TxnManager depend on tablet id , schema hash, tablet uid. All of them could be extracted from tablet object. So that using tablet as parameter instead of tablet_id, schema has, tablet uid. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] imay commented on a change in pull request #1944: Segment v2 use string's real length(#1943)
imay commented on a change in pull request #1944: Segment v2 use string's real length(#1943) URL: https://github.com/apache/incubator-doris/pull/1944#discussion_r333596185 ## File path: be/src/olap/rowset/segment_v2/segment_writer.cpp ## @@ -55,17 +55,17 @@ Status SegmentWriter::init(uint32_t write_mbytes_per_sec) { bool is_nullable = column.is_nullable(); column_meta->set_is_nullable(is_nullable); -// TODO(zc): we can add type_info into TabletColumn? -const TypeInfo* type_info = get_type_info(column.type()); -DCHECK(type_info != nullptr); - ColumnWriterOptions opts; opts.compression_type = segment_v2::CompressionTypePB::LZ4F; // now we create zone map for key columns if (column.is_key()) { opts.need_zone_map = true; } -std::unique_ptr writer(new ColumnWriter(opts, type_info, is_nullable, _output_file.get())); + +Field* field = FieldFactory::create(column); +DCHECK(field != nullptr); + +std::unique_ptr writer(new ColumnWriter(opts, field, is_nullable, _output_file.get())); Review comment: ```suggestion std::unique_ptr writer(new ColumnWriter(opts, std::move(field), is_nullable, _output_file.get())); ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] imay commented on a change in pull request #1944: Segment v2 use string's real length(#1943)
imay commented on a change in pull request #1944: Segment v2 use string's real length(#1943) URL: https://github.com/apache/incubator-doris/pull/1944#discussion_r333596710 ## File path: be/src/olap/rowset/segment_v2/column_writer.cpp ## @@ -72,13 +72,13 @@ class NullBitmapBuilder { }; ColumnWriter::ColumnWriter(const ColumnWriterOptions& opts, - const TypeInfo* typeinfo, + Field* field, Review comment: ```suggestion std::unique_ptr&& field, ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] imay commented on a change in pull request #1944: Segment v2 use string's real length(#1943)
imay commented on a change in pull request #1944: Segment v2 use string's real length(#1943) URL: https://github.com/apache/incubator-doris/pull/1944#discussion_r333588088 ## File path: be/src/olap/field.h ## @@ -400,6 +411,16 @@ class VarcharField: public Field { VarcharField* clone() const override { return new VarcharField(*this); } + +char* allocate_value_from_arena(Arena* arena) const { Review comment: ```suggestion char* allocate_value_from_arena(Arena* arena) const override { ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] imay commented on a change in pull request #1944: Segment v2 use string's real length(#1943)
imay commented on a change in pull request #1944: Segment v2 use string's real length(#1943) URL: https://github.com/apache/incubator-doris/pull/1944#discussion_r333588174 ## File path: be/src/olap/field.h ## @@ -400,6 +411,16 @@ class VarcharField: public Field { VarcharField* clone() const override { return new VarcharField(*this); } + +char* allocate_value_from_arena(Arena* arena) const { +return Field::allocate_string_value_from_arena(arena); +} + +void set_to_max(char* ch) { Review comment: ```suggestion void set_to_max(char* ch) override { ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] imay commented on a change in pull request #1944: Segment v2 use string's real length(#1943)
imay commented on a change in pull request #1944: Segment v2 use string's real length(#1943) URL: https://github.com/apache/incubator-doris/pull/1944#discussion_r333593330 ## File path: be/src/olap/field.h ## @@ -55,9 +55,11 @@ class Field { inline size_t field_size() const { return size() + 1; } inline size_t index_size() const { return _index_size; } -inline void set_to_max(char* buf) const { return _type_info->set_to_max(buf); } +virtual inline void set_to_max(char* buf) const { return _type_info->set_to_max(buf); } +//todo delete this method when segment v2 release +inline void set_to_max_v1(char* buf) const { return _type_info->set_to_max(buf); } Review comment: I think there is no need to keep this function, implement it the same with set_to_max is OK. Because the value set is greater than what is set in TypeInfo. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] imay commented on a change in pull request #1944: Segment v2 use string's real length(#1943)
imay commented on a change in pull request #1944: Segment v2 use string's real length(#1943) URL: https://github.com/apache/incubator-doris/pull/1944#discussion_r333580905 ## File path: be/src/olap/field.h ## @@ -55,9 +55,11 @@ class Field { inline size_t field_size() const { return size() + 1; } inline size_t index_size() const { return _index_size; } -inline void set_to_max(char* buf) const { return _type_info->set_to_max(buf); } +virtual inline void set_to_max(char* buf) const { return _type_info->set_to_max(buf); } +//todo delete this method when segment v2 release Review comment: ```suggestion // TODO(wangbo) delete this method when segment v2 release ``` Should comment why this function stay here to let others know about this. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] imay commented on a change in pull request #1944: Segment v2 use string's real length(#1943)
imay commented on a change in pull request #1944: Segment v2 use string's real length(#1943) URL: https://github.com/apache/incubator-doris/pull/1944#discussion_r333594065 ## File path: be/src/olap/rowset/segment_v2/segment_writer.cpp ## @@ -55,17 +55,17 @@ Status SegmentWriter::init(uint32_t write_mbytes_per_sec) { bool is_nullable = column.is_nullable(); column_meta->set_is_nullable(is_nullable); -// TODO(zc): we can add type_info into TabletColumn? -const TypeInfo* type_info = get_type_info(column.type()); -DCHECK(type_info != nullptr); - ColumnWriterOptions opts; opts.compression_type = segment_v2::CompressionTypePB::LZ4F; // now we create zone map for key columns if (column.is_key()) { opts.need_zone_map = true; } -std::unique_ptr writer(new ColumnWriter(opts, type_info, is_nullable, _output_file.get())); + +Field* field = FieldFactory::create(column); Review comment: It's not a good way to write code like this, it made me feel that the field had been leaked. I think you can use a std::unique_ptr and std::move to finish it. ```suggestion std::unique_ptr field(FieldFactory::create(column)); ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] imay commented on a change in pull request #1931: Support segment zone map
imay commented on a change in pull request #1931: Support segment zone map URL: https://github.com/apache/incubator-doris/pull/1931#discussion_r333497324 ## File path: be/src/olap/rowset/segment_v2/column_zone_map.cpp ## @@ -73,15 +91,22 @@ Status ColumnZoneMapBuilder::flush() { RETURN_IF_ERROR(_page_builder->add((const uint8_t *)&data, &num)); // reset the variables // we should allocate max varchar length and set to max for min value -_reset_zone_map(); +_reset_page_zone_map(); return Status::OK(); } -void ColumnZoneMapBuilder::_reset_zone_map() { -_field->set_to_max(_zone_map.min_value); -_field->set_to_min(_zone_map.max_value); -_zone_map.has_null = false; -_zone_map.has_not_null = false; +void ColumnZoneMapBuilder::_reset_zone_map(ZoneMap& zone_map) { +_field->set_to_max(zone_map.min_value); +_field->set_to_min(zone_map.max_value); +zone_map.has_null = false; +zone_map.has_not_null = false; +} + +void ColumnZoneMapBuilder::_fill_zone_map_to_pb(ZoneMap* const from, ZoneMapPB* const to) { Review comment: ```suggestion void ColumnZoneMapBuilder::_fill_zone_map_to_pb(const ZoneMap& from, ZoneMapPB* const to) { ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] imay commented on a change in pull request #1931: Support segment zone map
imay commented on a change in pull request #1931: Support segment zone map URL: https://github.com/apache/incubator-doris/pull/1931#discussion_r333496301 ## File path: be/src/olap/rowset/segment_v2/column_zone_map.cpp ## @@ -73,15 +91,22 @@ Status ColumnZoneMapBuilder::flush() { RETURN_IF_ERROR(_page_builder->add((const uint8_t *)&data, &num)); // reset the variables // we should allocate max varchar length and set to max for min value -_reset_zone_map(); +_reset_page_zone_map(); return Status::OK(); } -void ColumnZoneMapBuilder::_reset_zone_map() { -_field->set_to_max(_zone_map.min_value); -_field->set_to_min(_zone_map.max_value); -_zone_map.has_null = false; -_zone_map.has_not_null = false; +void ColumnZoneMapBuilder::_reset_zone_map(ZoneMap& zone_map) { Review comment: prefer pointer if param will be changed in function This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] imay commented on a change in pull request #1931: Support segment zone map
imay commented on a change in pull request #1931: Support segment zone map URL: https://github.com/apache/incubator-doris/pull/1931#discussion_r333570926 ## File path: be/src/olap/rowset/segment_v2/segment.cpp ## @@ -50,17 +52,69 @@ Status Segment::open() { RETURN_IF_ERROR(Env::Default()->new_random_access_file(_fname, &_input_file)); // parse footer to get meta RETURN_IF_ERROR(_parse_footer()); -// parse short key index -RETURN_IF_ERROR(_parse_index()); -// initial all column reader -RETURN_IF_ERROR(_initial_column_readers()); + return Status::OK(); } -std::unique_ptr Segment::new_iterator(const Schema& schema, const StorageReadOptions& read_options) { -auto it = std::unique_ptr(new SegmentIterator(this->shared_from_this(), schema)); -it->init(read_options); -return it; +Status Segment::new_iterator( +const Schema& schema, +const StorageReadOptions& read_options, +std::unique_ptr* iter) { + +if (read_options.conditions != nullptr) { +for(auto& column_condition : read_options.conditions->columns()) { +int32_t column_id = column_condition.first; +auto entry = _column_id_to_footer_ordinal.find(column_id); +if (entry == _column_id_to_footer_ordinal.end()) { +continue; +} +ColumnMetaPB c_meta = _footer.columns(entry->second); Review comment: ```suggestion auto& c_meta = _footer.columns(entry->second); ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] imay commented on a change in pull request #1931: Support segment zone map
imay commented on a change in pull request #1931: Support segment zone map URL: https://github.com/apache/incubator-doris/pull/1931#discussion_r333492306 ## File path: be/src/olap/rowset/segment_v2/segment.cpp ## @@ -50,17 +52,69 @@ Status Segment::open() { RETURN_IF_ERROR(Env::Default()->new_random_access_file(_fname, &_input_file)); // parse footer to get meta RETURN_IF_ERROR(_parse_footer()); -// parse short key index -RETURN_IF_ERROR(_parse_index()); -// initial all column reader -RETURN_IF_ERROR(_initial_column_readers()); + return Status::OK(); } -std::unique_ptr Segment::new_iterator(const Schema& schema, const StorageReadOptions& read_options) { -auto it = std::unique_ptr(new SegmentIterator(this->shared_from_this(), schema)); -it->init(read_options); -return it; +Status Segment::new_iterator( +const Schema& schema, +const StorageReadOptions& read_options, +std::unique_ptr* iter) { + +if (read_options.conditions != nullptr) { +for(auto& column_condition : read_options.conditions->columns()) { Review comment: ```suggestion for (auto& column_condition : read_options.conditions->columns()) { ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] imay closed issue #1892: Enhance the speed of avg function
imay closed issue #1892: Enhance the speed of avg function URL: https://github.com/apache/incubator-doris/issues/1892 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[incubator-doris] branch master updated: Enhance the speed of avg function (#1889)
This is an automated email from the ASF dual-hosted git repository. zhaoc pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-doris.git The following commit(s) were added to refs/heads/master by this push: new e267d03 Enhance the speed of avg function (#1889) e267d03 is described below commit e267d031bb1402cf7fa32b8196553fd9b1266cb3 Author: EmmyMiao87 <522274...@qq.com> AuthorDate: Thu Oct 10 22:43:46 2019 +0800 Enhance the speed of avg function (#1889) This commit enable the avg operator in fe instead of converting the avg function into sum/count. Also, this commit fix the bug of deciamlv2 avg which cause the core in be. The int128 could not be assinged directly. The speed of avg function is similar to sum function after enhancement. --- be/src/exprs/aggregate_functions.cpp | 129 +++-- be/src/exprs/aggregate_functions.h | 2 + .../apache/doris/analysis/FunctionCallExpr.java| 5 +- .../java/org/apache/doris/analysis/SelectStmt.java | 43 +-- .../java/org/apache/doris/catalog/FunctionSet.java | 2 +- 5 files changed, 75 insertions(+), 106 deletions(-) diff --git a/be/src/exprs/aggregate_functions.cpp b/be/src/exprs/aggregate_functions.cpp index 0a513f3..75de6da 100644 --- a/be/src/exprs/aggregate_functions.cpp +++ b/be/src/exprs/aggregate_functions.cpp @@ -177,61 +177,6 @@ void AggregateFunctions::count_remove( } } -struct AvgState { -double sum; -int64_t count; -}; - -struct DecimalAvgState { -DecimalVal sum; -int64_t count; -}; - -struct DecimalV2AvgState { -DecimalV2Val sum; -int64_t count; -}; - -void AggregateFunctions::avg_init(FunctionContext* ctx, StringVal* dst) { -dst->is_null = false; -dst->len = sizeof(AvgState); -dst->ptr = ctx->allocate(dst->len); -memset(dst->ptr, 0, sizeof(AvgState)); -} - -void AggregateFunctions::decimal_avg_init(FunctionContext* ctx, StringVal* dst) { -dst->is_null = false; -dst->len = sizeof(DecimalAvgState); -dst->ptr = ctx->allocate(dst->len); -// memset(dst->ptr, 0, sizeof(DecimalAvgState)); -DecimalAvgState* avg = reinterpret_cast(dst->ptr); -avg->count = 0; -avg->sum.set_to_zero(); -} - -void AggregateFunctions::decimalv2_avg_init(FunctionContext* ctx, StringVal* dst) { -dst->is_null = false; -dst->len = sizeof(DecimalV2AvgState); -dst->ptr = ctx->allocate(dst->len); -// memset(dst->ptr, 0, sizeof(DecimalAvgState)); -DecimalV2AvgState* avg = reinterpret_cast(dst->ptr); -avg->count = 0; -avg->sum.set_to_zero(); -} - - -template -void AggregateFunctions::avg_update(FunctionContext* ctx, const T& src, StringVal* dst) { -if (src.is_null) { -return; -} -DCHECK(dst->ptr != NULL); -DCHECK_EQ(sizeof(AvgState), dst->len); -AvgState* avg = reinterpret_cast(dst->ptr); -avg->sum += src.val; -++avg->count; -} - struct PercentileApproxState { public: PercentileApproxState() : digest(new TDigest()){ @@ -305,6 +250,59 @@ DoubleVal AggregateFunctions::percentile_approx_finalize(FunctionContext* ctx, c return DoubleVal(result); } +struct AvgState { +double sum = 0; +int64_t count = 0; +}; + +struct DecimalAvgState { +DecimalVal sum; +int64_t count; +}; + +struct DecimalV2AvgState { +DecimalV2Val sum; +int64_t count = 0; +}; + +void AggregateFunctions::avg_init(FunctionContext* ctx, StringVal* dst) { +dst->is_null = false; +dst->len = sizeof(AvgState); +dst->ptr = ctx->allocate(dst->len); +new (dst->ptr) AvgState; +} + +void AggregateFunctions::decimal_avg_init(FunctionContext* ctx, StringVal* dst) { +dst->is_null = false; +dst->len = sizeof(DecimalAvgState); +dst->ptr = ctx->allocate(dst->len); +// memset(dst->ptr, 0, sizeof(DecimalAvgState)); +DecimalAvgState* avg = reinterpret_cast(dst->ptr); +avg->count = 0; +avg->sum.set_to_zero(); +} + +void AggregateFunctions::decimalv2_avg_init(FunctionContext* ctx, StringVal* dst) { +dst->is_null = false; +dst->len = sizeof(DecimalV2AvgState); +// The memroy for int128 need to be aligned by 16. +// So the constructor has been used instead of allocating memory. +// Also, it will be release in finalize. +dst->ptr = (uint8_t*) new DecimalV2AvgState; +} + +template +void AggregateFunctions::avg_update(FunctionContext* ctx, const T& src, StringVal* dst) { +if (src.is_null) { +return; +} +DCHECK(dst->ptr != NULL); +DCHECK_EQ(sizeof(AvgState), dst->len); +AvgState* avg = reinterpret_cast(dst->ptr); +avg->sum += src.val; +++avg->count; +} + void AggregateFunctions::decimal_avg_update(FunctionContext* ctx, const DecimalVal& src, StringVal* dst) { @@ -341,6 +339,15 @@ void AggregateFunctions::decimalv2_avg_update(FunctionContext* ctx, ++avg->count; } +StringVal AggregateFunctions::decimalv2_av
[GitHub] [incubator-doris] imay merged pull request #1889: Enhance the speed of avg function
imay merged pull request #1889: Enhance the speed of avg function URL: https://github.com/apache/incubator-doris/pull/1889 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] shengyunyao opened a new pull request #1939: Added: Add tdigest compression param for pencentile_approx function
shengyunyao opened a new pull request #1939: Added: Add tdigest compression param for pencentile_approx function URL: https://github.com/apache/incubator-doris/pull/1939 Add tdigest compression param for pencentile_approx function. The compression param should be in range (0, 1] and we can get a more precise result and cost more time when setting the param a bigger value. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] shengyunyao closed pull request #1939: Added: Add tdigest compression param for pencentile_approx function
shengyunyao closed pull request #1939: Added: Add tdigest compression param for pencentile_approx function URL: https://github.com/apache/incubator-doris/pull/1939 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] imay commented on a change in pull request #1948: Add object type
imay commented on a change in pull request #1948: Add object type URL: https://github.com/apache/incubator-doris/pull/1948#discussion_r333428334 ## File path: gensrc/proto/segment_v2.proto ## @@ -53,6 +53,7 @@ enum EncodingTypePB { RLE = 4; DICT_ENCODING = 5; BIT_SHUFFLE = 6; +BINARY_PLAIN_ENCODING = 7; Review comment: Better to add some comment about difference between PLAIN_ENCODING and BINARY_PLAIN_ENCODING. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] imay commented on a change in pull request #1948: Add object type
imay commented on a change in pull request #1948: Add object type URL: https://github.com/apache/incubator-doris/pull/1948#discussion_r333454949 ## File path: fe/src/main/jflex/sql_scanner.flex ## @@ -218,6 +218,7 @@ import org.apache.doris.common.util.SqlUtils; keywordMap.put("not", new Integer(SqlParserSymbols.KW_NOT)); keywordMap.put("null", new Integer(SqlParserSymbols.KW_NULL)); keywordMap.put("nulls", new Integer(SqlParserSymbols.KW_NULLS)); +keywordMap.put("object", new Integer(SqlParserSymbols.KW_OBJECT)); Review comment: Do we need to expose OBJECT to Doris USER? I think we should expose specified type to user, such as HLL, GEO_SHAPE, BITMAP? How about adding attribute for column to identify what specified type it is? Such as OBJECT("hll"), OBJECT("bitmap")? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] chaoyli merged pull request #1949: Remove redundant method in rowset meta manager
chaoyli merged pull request #1949: Remove redundant method in rowset meta manager URL: https://github.com/apache/incubator-doris/pull/1949 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[incubator-doris] branch master updated (b72a4a4 -> e4f3e8f)
This is an automated email from the ASF dual-hosted git repository. lichaoyong pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-doris.git. from b72a4a4 Add tablet meta checkpoint mechanism (#1936) add e4f3e8f Remove redundant method in rowset meta manager (#1949) No new revisions were added by this update. Summary of changes: be/src/olap/data_dir.cpp | 4 +--- be/src/olap/rowset/rowset_meta_manager.cpp | 14 +- be/src/olap/rowset/rowset_meta_manager.h | 4 +--- be/src/olap/tablet.cpp | 4 +++- be/src/olap/txn_manager.cpp | 11 ++- be/test/olap/rowset/rowset_meta_manager_test.cpp | 4 +++- 6 files changed, 19 insertions(+), 22 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] shengyunyao commented on a change in pull request #1939: Added: Add tdigest compression param for pencentile_approx function
shengyunyao commented on a change in pull request #1939: Added: Add tdigest compression param for pencentile_approx function URL: https://github.com/apache/incubator-doris/pull/1939#discussion_r333457738 ## File path: fe/src/main/java/org/apache/doris/analysis/FunctionCallExpr.java ## @@ -421,13 +421,23 @@ private void analyzeBuiltinAggFunction(Analyzer analyzer) throws AnalysisExcepti } if (fnName.getFunction().equalsIgnoreCase("percentile_approx")) { -if (children.size() != 2) { -throw new AnalysisException("percentile_approx(expr, DOUBLE) requires two parameters"); +if (children.size() != 2 && children.size() != 3) { +throw new AnalysisException("percentile_approx(expr, DOUBLE [, B]) requires two or three parameters"); } if (!getChild(1).isConstant()) { throw new AnalysisException("percentile_approx requires second parameter must be a constant : " + this.toSql()); } +if (children.size() == 3) { +if (!getChild(2).isConstant()) { +throw new AnalysisException("percentile_approx requires the third parameter must be a constant : " ++ this.toSql()); +} +double param_3 = Expr.getConstFromExpr(getChild(2)); Review comment: Get it This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] imay commented on a change in pull request #1889: Enhance the speed of avg function
imay commented on a change in pull request #1889: Enhance the speed of avg function URL: https://github.com/apache/incubator-doris/pull/1889#discussion_r333415220 ## File path: be/src/exprs/aggregate_functions.cpp ## @@ -305,6 +250,59 @@ DoubleVal AggregateFunctions::percentile_approx_finalize(FunctionContext* ctx, c return DoubleVal(result); } +struct AvgState { +double sum = 0; +int64_t count = 0; +}; + +struct DecimalAvgState { +DecimalVal sum; +int64_t count; +}; + +struct DecimalV2AvgState { +DecimalV2Val sum; +int64_t count = 0; +}; + +void AggregateFunctions::avg_init(FunctionContext* ctx, StringVal* dst) { +dst->is_null = false; +dst->len = sizeof(AvgState); +dst->ptr = ctx->allocate(dst->len); +dst->ptr = (uint8_t*) new (dst->ptr) AvgState; Review comment: ```suggestion new (dst->ptr) AvgState; ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] imay commented on a change in pull request #1889: Enhance the speed of avg function
imay commented on a change in pull request #1889: Enhance the speed of avg function URL: https://github.com/apache/incubator-doris/pull/1889#discussion_r333416274 ## File path: be/src/exprs/aggregate_functions.cpp ## @@ -341,6 +339,17 @@ void AggregateFunctions::decimalv2_avg_update(FunctionContext* ctx, ++avg->count; } +StringVal AggregateFunctions::decimalv2_avg_serialize( +FunctionContext* ctx, const StringVal& src) { +if (src.is_null) { Review comment: Do we need this if? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] yiguolei opened a new pull request #1949: Remove redundant method in rowset meta manager
yiguolei opened a new pull request #1949: Remove redundant method in rowset meta manager URL: https://github.com/apache/incubator-doris/pull/1949 There are two save rowset method in rowset meta manager using different parameters. One is rowset meta and the other is string. Most of the code is same, and it is very difficult to maintain in ECON mode, so that I remove them and using the RowsetMetaPB. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] imay commented on a change in pull request #1930: Replace std::regex with RE2
imay commented on a change in pull request #1930: Replace std::regex with RE2 URL: https://github.com/apache/incubator-doris/pull/1930#discussion_r77465 ## File path: be/test/olap/tablet_mgr_test.cpp ## @@ -209,6 +209,21 @@ TEST_F(TabletMgrTest, DropTablet) { ASSERT_TRUE(!dir_exist); } +TEST_F(TabletMgrTest, GetRowsetId) { +std::string path = _engine_data_path + "/data/0/15007/368169781/020100020003_0_0.dat"; +TTabletId tid; +TSchemaHash schema_hash; +ASSERT_TRUE(_tablet_mgr.get_tablet_id_and_schema_hash_from_path(path, &tid, &schema_hash)); +ASSERT_EQ(15007, tid); +ASSERT_EQ(368169781, schema_hash); Review comment: OK This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] imay commented on a change in pull request #1930: Replace std::regex with RE2
imay commented on a change in pull request #1930: Replace std::regex with RE2 URL: https://github.com/apache/incubator-doris/pull/1930#discussion_r77401 ## File path: be/src/olap/tablet_manager.cpp ## @@ -682,42 +682,19 @@ TabletSharedPtr TabletManager::get_tablet(TTabletId tablet_id, SchemaHash schema return nullptr; } // get_tablet -bool TabletManager::get_tablet_id_and_schema_hash_from_path(const std::string& path, -TTabletId* tablet_id, TSchemaHash* schema_hash) { -std::vector data_dirs = StorageEngine::instance()->get_stores(); -for (auto data_dir : data_dirs) { -const std::string& data_dir_path = data_dir->path(); -if (path.find(data_dir_path) != std::string::npos) { -std::string pattern = data_dir_path + "/data/\\d+/(\\d+)/?(\\d+)?"; -std::regex rgx (pattern.c_str()); -std::smatch sm; -bool ret = std::regex_search(path, sm, rgx); -if (ret) { -if (sm.size() == 3) { -*tablet_id = std::strtoll(sm.str(1).c_str(), nullptr, 10); -*schema_hash = std::strtoll(sm.str(2).c_str(), nullptr, 10); -return true; -} else { -LOG(WARNING) << "invalid match. match size:" << sm.size(); -return false; -} -} -} -} -return false; +bool TabletManager::get_tablet_id_and_schema_hash_from_path( +const std::string& path, TTabletId* tablet_id, TSchemaHash* schema_hash) { +static re2::LazyRE2 re = {"/data/\\d+/(\\d+)/(\\d+)"}; +return RE2::PartialMatch(path, *re, tablet_id, schema_hash); Review comment: I don't know why we should check if it's a Doris' data path? Is is possible that input path is not Doris' data path? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] yiguolei commented on a change in pull request #1930: Replace std::regex with RE2
yiguolei commented on a change in pull request #1930: Replace std::regex with RE2 URL: https://github.com/apache/incubator-doris/pull/1930#discussion_r76050 ## File path: be/test/olap/tablet_mgr_test.cpp ## @@ -209,6 +209,21 @@ TEST_F(TabletMgrTest, DropTablet) { ASSERT_TRUE(!dir_exist); } +TEST_F(TabletMgrTest, GetRowsetId) { +std::string path = _engine_data_path + "/data/0/15007/368169781/020100020003_0_0.dat"; +TTabletId tid; +TSchemaHash schema_hash; +ASSERT_TRUE(_tablet_mgr.get_tablet_id_and_schema_hash_from_path(path, &tid, &schema_hash)); +ASSERT_EQ(15007, tid); +ASSERT_EQ(368169781, schema_hash); Review comment: should add some bad case such as the path is not a rowset path. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] yiguolei commented on a change in pull request #1930: Replace std::regex with RE2
yiguolei commented on a change in pull request #1930: Replace std::regex with RE2 URL: https://github.com/apache/incubator-doris/pull/1930#discussion_r75444 ## File path: be/src/olap/tablet_manager.cpp ## @@ -682,42 +682,19 @@ TabletSharedPtr TabletManager::get_tablet(TTabletId tablet_id, SchemaHash schema return nullptr; } // get_tablet -bool TabletManager::get_tablet_id_and_schema_hash_from_path(const std::string& path, -TTabletId* tablet_id, TSchemaHash* schema_hash) { -std::vector data_dirs = StorageEngine::instance()->get_stores(); -for (auto data_dir : data_dirs) { -const std::string& data_dir_path = data_dir->path(); -if (path.find(data_dir_path) != std::string::npos) { -std::string pattern = data_dir_path + "/data/\\d+/(\\d+)/?(\\d+)?"; -std::regex rgx (pattern.c_str()); -std::smatch sm; -bool ret = std::regex_search(path, sm, rgx); -if (ret) { -if (sm.size() == 3) { -*tablet_id = std::strtoll(sm.str(1).c_str(), nullptr, 10); -*schema_hash = std::strtoll(sm.str(2).c_str(), nullptr, 10); -return true; -} else { -LOG(WARNING) << "invalid match. match size:" << sm.size(); -return false; -} -} -} -} -return false; +bool TabletManager::get_tablet_id_and_schema_hash_from_path( +const std::string& path, TTabletId* tablet_id, TSchemaHash* schema_hash) { +static re2::LazyRE2 re = {"/data/\\d+/(\\d+)/(\\d+)"}; +return RE2::PartialMatch(path, *re, tablet_id, schema_hash); Review comment: Should check if the path is DORIS's data path. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org