[GitHub] [incubator-doris] chaoyli commented on a change in pull request #1950: Refactor txn manager methods

2019-10-10 Thread GitBox
chaoyli commented on a change in pull request #1950: Refactor txn manager 
methods
URL: https://github.com/apache/incubator-doris/pull/1950#discussion_r333824806
 
 

 ##
 File path: be/src/olap/txn_manager.cpp
 ##
 @@ -122,7 +149,7 @@ OLAPStatus TxnManager::prepare_txn(
 OLAPStatus TxnManager::commit_txn(
 OlapMeta* meta, TPartitionId partition_id, TTransactionId transaction_id,
 TTabletId tablet_id, SchemaHash schema_hash, TabletUid tablet_uid,
-const PUniqueId& load_id, RowsetSharedPtr rowset_ptr, bool is_recovery) {
+const PUniqueId& load_id, RowsetSharedPtr& rowset_ptr, bool is_recovery) {
 
 Review comment:
   const RowsetSharedPtr&


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] chaoyli commented on a change in pull request #1950: Refactor txn manager methods

2019-10-10 Thread GitBox
chaoyli commented on a change in pull request #1950: Refactor txn manager 
methods
URL: https://github.com/apache/incubator-doris/pull/1950#discussion_r333825032
 
 

 ##
 File path: be/src/olap/txn_manager.cpp
 ##
 @@ -76,6 +76,33 @@ TxnManager::TxnManager() {
 }
 }
 
+OLAPStatus TxnManager::prepare_txn(TPartitionId partition_id, const 
TabletSharedPtr& tablet, TTransactionId transaction_id, 
 
 Review comment:
   1. May two prepare_txn() merged into one mothod?
   2. partition_id can also get from tablet.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] chaoyli commented on a change in pull request #1950: Refactor txn manager methods

2019-10-10 Thread GitBox
chaoyli commented on a change in pull request #1950: Refactor txn manager 
methods
URL: https://github.com/apache/incubator-doris/pull/1950#discussion_r333824839
 
 

 ##
 File path: be/src/olap/txn_manager.cpp
 ##
 @@ -211,7 +238,7 @@ OLAPStatus TxnManager::commit_txn(
 // remove a txn from txn manager
 OLAPStatus TxnManager::publish_txn(OlapMeta* meta, TPartitionId partition_id, 
TTransactionId transaction_id,
TTabletId tablet_id, SchemaHash 
schema_hash, TabletUid tablet_uid,
-   Version& version, VersionHash& 
version_hash) {
+   const Version& version, VersionHash 
version_hash) {
 pair key(partition_id, transaction_id);
 
 Review comment:
   const VersionHash&


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] chaoyli commented on a change in pull request #1950: Refactor txn manager methods

2019-10-10 Thread GitBox
chaoyli commented on a change in pull request #1950: Refactor txn manager 
methods
URL: https://github.com/apache/incubator-doris/pull/1950#discussion_r333824873
 
 

 ##
 File path: be/src/olap/txn_manager.h
 ##
 @@ -80,14 +97,14 @@ class TxnManager {
 
 OLAPStatus commit_txn(OlapMeta* meta, TPartitionId partition_id, 
TTransactionId transaction_id,
   TTabletId tablet_id, SchemaHash schema_hash, 
TabletUid tablet_uid, 
-  const PUniqueId& load_id, RowsetSharedPtr 
rowset_ptr, 
+  const PUniqueId& load_id, RowsetSharedPtr& 
rowset_ptr, 
   bool is_recovery);
 
 // remove a txn from txn manager
 // not persist rowset meta because 
 OLAPStatus publish_txn(OlapMeta* meta, TPartitionId partition_id, 
TTransactionId transaction_id,
TTabletId tablet_id, SchemaHash schema_hash, 
TabletUid tablet_uid, 
-   Version& version, VersionHash& version_hash);
+   const Version& version, VersionHash version_hash);
 
 Review comment:
   const VersionHash&


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] yangzhg commented on issue #1952: add create_time to information_schema.tables (issue #1909)

2019-10-10 Thread GitBox
yangzhg commented on issue #1952: add create_time to information_schema.tables 
(issue #1909)
URL: https://github.com/apache/incubator-doris/pull/1952#issuecomment-540892067
 
 
   > @yangzhg
   > Hi, can you split this PR into two PRs? one for adding create_time and one 
for compile error?
   
   removed compile error code


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] imay commented on issue #1952: add create_time to information_schema.tables (issue #1909)

2019-10-10 Thread GitBox
imay commented on issue #1952: add create_time to information_schema.tables 
(issue #1909)
URL: https://github.com/apache/incubator-doris/pull/1952#issuecomment-540884385
 
 
   @yangzhg 
   Hi, can you split this PR into two PRs? one for adding create_time and one 
for compile error?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] yangzhg opened a new pull request #1952: add create_time to information_schema.tables (issue #1909)

2019-10-10 Thread GitBox
yangzhg opened a new pull request #1952: add create_time to 
information_schema.tables (issue #1909)
URL: https://github.com/apache/incubator-doris/pull/1952
 
 
   Add create_time to information_schema.tables (issue #1909)
   
   - be/src/exec/schema_scanner/schema_tables_scanner.cpp
   -  fe/src/main/java/org/apache/doris/catalog/Table.java
   - fe/src/main/java/org/apache/doris/common/FeConstants.java
   - fe/src/main/java/org/apache/doris/common/FeMetaVersion.java
   - fe/src/main/java/org/apache/doris/service/FrontendServiceImpl.java
   - fe/src/test/java/org/apache/doris/catalog/TableTest.java
   - gensrc/thrift/FrontendService.thrift
   
   
   Fix some warning when compile type is debug using -Werror flag(because of 
use deprecated funcrtions)files as follow 
   
   - be/src/olap/data_dir.cpp
   - be/src/util/disk_info.cpp
   - be/src/agent/cgroups_mgr.cpp


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] wangbo commented on a change in pull request #1944: Segment v2 use string's real length(#1943)

2019-10-10 Thread GitBox
wangbo commented on a change in pull request #1944: Segment v2 use string's 
real length(#1943)
URL: https://github.com/apache/incubator-doris/pull/1944#discussion_r333808141
 
 

 ##
 File path: be/src/olap/rowset/segment_v2/column_writer.cpp
 ##
 @@ -72,13 +72,13 @@ class NullBitmapBuilder {
 };
 
 ColumnWriter::ColumnWriter(const ColumnWriterOptions& opts,
-   const TypeInfo* typeinfo,
+   Field* field,
 
 Review comment:
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] wangbo commented on a change in pull request #1944: Segment v2 use string's real length(#1943)

2019-10-10 Thread GitBox
wangbo commented on a change in pull request #1944: Segment v2 use string's 
real length(#1943)
URL: https://github.com/apache/incubator-doris/pull/1944#discussion_r333808109
 
 

 ##
 File path: be/src/olap/rowset/segment_v2/segment_writer.cpp
 ##
 @@ -55,17 +55,17 @@ Status SegmentWriter::init(uint32_t write_mbytes_per_sec) {
 bool is_nullable = column.is_nullable();
 column_meta->set_is_nullable(is_nullable);
 
-// TODO(zc): we can add type_info into TabletColumn?
-const TypeInfo* type_info = get_type_info(column.type());
-DCHECK(type_info != nullptr);
-
 ColumnWriterOptions opts;
 opts.compression_type = segment_v2::CompressionTypePB::LZ4F;
 // now we create zone map for key columns
 if (column.is_key()) {
 opts.need_zone_map = true;
 }
-std::unique_ptr writer(new ColumnWriter(opts, type_info, 
is_nullable, _output_file.get()));
+
+Field* field = FieldFactory::create(column);
+DCHECK(field != nullptr);
+
+std::unique_ptr writer(new ColumnWriter(opts, field, 
is_nullable, _output_file.get()));
 
 Review comment:
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] wangbo commented on a change in pull request #1944: Segment v2 use string's real length(#1943)

2019-10-10 Thread GitBox
wangbo commented on a change in pull request #1944: Segment v2 use string's 
real length(#1943)
URL: https://github.com/apache/incubator-doris/pull/1944#discussion_r333808062
 
 

 ##
 File path: be/src/olap/rowset/segment_v2/segment_writer.cpp
 ##
 @@ -55,17 +55,17 @@ Status SegmentWriter::init(uint32_t write_mbytes_per_sec) {
 bool is_nullable = column.is_nullable();
 column_meta->set_is_nullable(is_nullable);
 
-// TODO(zc): we can add type_info into TabletColumn?
-const TypeInfo* type_info = get_type_info(column.type());
-DCHECK(type_info != nullptr);
-
 ColumnWriterOptions opts;
 opts.compression_type = segment_v2::CompressionTypePB::LZ4F;
 // now we create zone map for key columns
 if (column.is_key()) {
 opts.need_zone_map = true;
 }
-std::unique_ptr writer(new ColumnWriter(opts, type_info, 
is_nullable, _output_file.get()));
+
+Field* field = FieldFactory::create(column);
 
 Review comment:
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] wangbo commented on a change in pull request #1944: Segment v2 use string's real length(#1943)

2019-10-10 Thread GitBox
wangbo commented on a change in pull request #1944: Segment v2 use string's 
real length(#1943)
URL: https://github.com/apache/incubator-doris/pull/1944#discussion_r333807936
 
 

 ##
 File path: be/src/olap/field.h
 ##
 @@ -55,9 +55,11 @@ class Field {
 inline size_t field_size() const { return size() + 1; }
 inline size_t index_size() const { return _index_size; }
 
-inline void set_to_max(char* buf) const { return 
_type_info->set_to_max(buf); }
+virtual inline void set_to_max(char* buf) const { return 
_type_info->set_to_max(buf); }
+//todo delete this method when segment v2 release
 
 Review comment:
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] wangbo commented on a change in pull request #1944: Segment v2 use string's real length(#1943)

2019-10-10 Thread GitBox
wangbo commented on a change in pull request #1944: Segment v2 use string's 
real length(#1943)
URL: https://github.com/apache/incubator-doris/pull/1944#discussion_r333807981
 
 

 ##
 File path: be/src/olap/field.h
 ##
 @@ -400,6 +411,16 @@ class VarcharField: public Field {
 VarcharField* clone() const override {
 return new VarcharField(*this);
 }
+
+char* allocate_value_from_arena(Arena* arena) const {
 
 Review comment:
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] wangbo commented on a change in pull request #1944: Segment v2 use string's real length(#1943)

2019-10-10 Thread GitBox
wangbo commented on a change in pull request #1944: Segment v2 use string's 
real length(#1943)
URL: https://github.com/apache/incubator-doris/pull/1944#discussion_r333807995
 
 

 ##
 File path: be/src/olap/field.h
 ##
 @@ -400,6 +411,16 @@ class VarcharField: public Field {
 VarcharField* clone() const override {
 return new VarcharField(*this);
 }
+
+char* allocate_value_from_arena(Arena* arena) const {
+return Field::allocate_string_value_from_arena(arena);
+}
+
+void set_to_max(char* ch) {
 
 Review comment:
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] wangbo commented on a change in pull request #1944: Segment v2 use string's real length(#1943)

2019-10-10 Thread GitBox
wangbo commented on a change in pull request #1944: Segment v2 use string's 
real length(#1943)
URL: https://github.com/apache/incubator-doris/pull/1944#discussion_r333807895
 
 

 ##
 File path: be/src/olap/field.h
 ##
 @@ -55,9 +55,11 @@ class Field {
 inline size_t field_size() const { return size() + 1; }
 inline size_t index_size() const { return _index_size; }
 
-inline void set_to_max(char* buf) const { return 
_type_info->set_to_max(buf); }
+virtual inline void set_to_max(char* buf) const { return 
_type_info->set_to_max(buf); }
+//todo delete this method when segment v2 release
+inline void set_to_max_v1(char* buf) const { return 
_type_info->set_to_max(buf); }
 
 Review comment:
   implement it the same with set_to_max means:
when calling varchar field.set_to_max,using field._length to memset slice's 
data.
but for segment v1, field._length is regarded as 
slice.size+sizeof(StringLengthType)
   So when calling segment v1 varchar field.set_max,the length of memset is 
beyond slice's actually size,this may cause subsequently tcmalloc core 
dumped.see error stack below:
   >#0  SLL_Next (t=0x) at src/linked_list.h:45
   >#1  SLL_TryPop (rv=, list=0x436dce0) at 
src/linked_list.h:69
   >#2  TryPop (rv=, this=0x436dce0) at 
src/thread_cache.h:220
   >#3  Allocate (oom_handler=0x25e5700 , cl=5, size=64, this=) at src/thread_cache.h:379
   >#4  malloc_fast_path (size=64) at 
src/tcmalloc.cc:1855
   >#5  tc_new (size=64) at src/tcmalloc.cc:1976
   >#6  0x00c21672 in doris::WrapperField::WrapperField 
(this=0x4e07d40, rep=, variable_len=, 
   >is_string_type=) at 
/home/wangbo/incubator-doris/be/src/olap/wrapper_field.cpp:84
   >#7  0x00c21b9e in doris::WrapperField::create (column=..., 
len=, len@entry=0)
   >at /home/wangbo/incubator-doris/be/src/olap/wrapper_field.cpp:51
   >#8  0x00cb88e1 in doris::ColumnDataWriter::init (this=0x4bc4780) at 
/home/wangbo/incubator-doris/be/src/olap/rowset/column_data_writer.cpp:72
   >#9  0x00ca273c in doris::AlphaRowsetWriter::_init 
(this=this@entry=0x4bf83c0)
   >at 
/home/wangbo/incubator-doris/be/src/olap/rowset/alpha_rowset_writer.cpp:275
   >#10 0x00ca3694 in doris::AlphaRowsetWriter::init (this=0x4bf83c0, 
rowset_writer_context=...)
   >at 
/home/wangbo/incubator-doris/be/src/olap/rowset/alpha_rowset_writer.cpp:73
   >#11 0x00c91b03 in doris::RowsetFactory::create_rowset_writer 
(context=..., output=output@entry=0x7fffcdf75318)
   >at /home/wangbo/incubator-doris/be/src/olap/rowset/rowset_factory.cpp:50
   >#12 0x00bc9b3f in doris::TabletManager::_create_inital_rowset 
(this=this@entry=0x4d5a180, tablet=..., request=...)
   >at /home/wangbo/incubator-doris/be/src/olap/tablet_manager.cpp:1250
   >#13 0x00bcc919 in doris::TabletManager::_internal_create_tablet 
(this=this@entry=0x4d5a180, alter_type=alter_type@entry=doris::SCHEMA_CHANGE, 
   >request=..., 
is_schema_change_tablet=is_schema_change_tablet@entry=false, ref_tablet=..., 
data_dirs=...)
   >at /home/wangbo/incubator-doris/be/src/olap/tablet_manager.cpp:371
   >#14 0x00bcdd72 in doris::TabletManager::create_tablet 
(this=this@entry=0x4d5a180, request=..., stores=...)
   >at /home/wangbo/incubator-doris/be/src/olap/tablet_manager.cpp:318
   >#15 0x00b82642 in doris::StorageEngine::create_tablet 
(this=, request=...)
   >at /home/wangbo/incubator-doris/be/src/olap/storage_engine.cpp:760
   >#16 0x00b4fac3 in doris::TestDeleteConditionHandler::SetUp 
(this=0x4e08460)
   >at /home/wangbo/incubator-doris/be/test/olap/delete_handler_test.cpp:173
   >#17 0x02530f37 in void 
testing::internal::HandleSehExceptionsInMethodIfSupported(testing::Test*, void (testing::Test::*)(), char const*) ()
   >#18 0x0252ba9c in void 
testing::internal::HandleExceptionsInMethodIfSupported(testing::Test*, void (testing::Test::*)(), char const*) ()
   >#19 0x02510df7 in testing::Test::Run() ()
   >#20 0x025116ce in testing::TestInfo::Run() ()
   >#21 0x02511d25 in testing::TestCase::Run() ()
   >#22 0x02518899 in testing::internal::UnitTestImpl::RunAllTests() ()
   >#23 0x02531eb3 in bool 
testing::internal::HandleSehExceptionsInMethodIfSupported(testing::internal::UnitTestImpl*, bool 
(testing::internal::UnitTestImpl::*)(), char const*) ()
   >#24 0x0252c7a2 in bool 
testing::internal::HandleExceptionsInMethodIfSupported(testing::internal::UnitTestImpl*, bool 
(testing::internal::UnitTestImpl::*)(), char const*) ()
   >#25 0x02517570 in testing::UnitTest::Run() ()
   >#26 0x009fb316 in RUN_ALL_TESTS () at 
/home/wangbo/incubator-doris/thirdparty/installed/include/gtest/gtest.h:2233
   >#27 main (argc=, argv=0x7fffcdf75f88) at 
/home/wangbo/incubator-doris/be/test/olap/delete_handler_test.cpp:951
   
   it seems that memset memory to 0xff may effect tcmalloc 

[GitHub] [incubator-doris] morningman opened a new issue #1951: Limit the memory usage of Loading process

2019-10-10 Thread GitBox
morningman opened a new issue #1951: Limit the memory usage of Loading process
URL: https://github.com/apache/incubator-doris/issues/1951
 
 
   ## Motivation
   
   Current load framework using memtable to receive incoming loading data and 
flush to disk when reaching limit (default is 100MB). 
   Each tablet corresponds to a memtable, so that if there are many tablets on 
a Backend, and the loading data is distributed evenly on each of these tablets, 
the total memory consumption can be very large because all memtables will no be 
flushed until they reaching the 100MB.
   For example, if there are 100 tablets on a Backend, the peak memory 
consumption can be 10GB(100 * 100MB), and this may cause process killed by 
system OOM.
   
   ## How to resolve
   
   There will be 2 levels of memory limit for loading process. One is loading 
process-level, and one is for Backend-level.
   
   1. loading process-level limit
   
   User can set a memory limit for each loading process. And this limit will 
limit the memory 
   consumption of a loading process on one Backend.
   
   From the aspect of code implements, we use a MemTracker to track the memory 
consumption of a loading process on a Backend. If memory reaches the limit. It 
will find the largest memtable
   can flush it to disk to reduce the memory consumption.
   
   2. Backend-level limit 
   
   Each Backend should has a total loading memory limit. This will limit total 
memory consumption
   of all loading processes on this Backend. And if this limit is reached, Load 
Mgr will pick a 
   loading process with largest memory consumption and try to flush its 
memtable.
   
   ## Configuration
   
   For loading process-level limit, the default is 2GB, and user can it by 
session variable or load
   property.
   For Backend-level limit, the default is 60% of physical memory and max to 
20GB.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] yiguolei opened a new pull request #1950: Refactor txn manager methods

2019-10-10 Thread GitBox
yiguolei opened a new pull request #1950: Refactor txn manager methods
URL: https://github.com/apache/incubator-doris/pull/1950
 
 
   Most methods in TxnManager depend on tablet id , schema hash, tablet uid. 
All of them could be extracted from tablet object. So that using tablet as 
parameter instead of tablet_id, schema has, tablet uid.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] imay commented on a change in pull request #1944: Segment v2 use string's real length(#1943)

2019-10-10 Thread GitBox
imay commented on a change in pull request #1944: Segment v2 use string's real 
length(#1943)
URL: https://github.com/apache/incubator-doris/pull/1944#discussion_r333596185
 
 

 ##
 File path: be/src/olap/rowset/segment_v2/segment_writer.cpp
 ##
 @@ -55,17 +55,17 @@ Status SegmentWriter::init(uint32_t write_mbytes_per_sec) {
 bool is_nullable = column.is_nullable();
 column_meta->set_is_nullable(is_nullable);
 
-// TODO(zc): we can add type_info into TabletColumn?
-const TypeInfo* type_info = get_type_info(column.type());
-DCHECK(type_info != nullptr);
-
 ColumnWriterOptions opts;
 opts.compression_type = segment_v2::CompressionTypePB::LZ4F;
 // now we create zone map for key columns
 if (column.is_key()) {
 opts.need_zone_map = true;
 }
-std::unique_ptr writer(new ColumnWriter(opts, type_info, 
is_nullable, _output_file.get()));
+
+Field* field = FieldFactory::create(column);
+DCHECK(field != nullptr);
+
+std::unique_ptr writer(new ColumnWriter(opts, field, 
is_nullable, _output_file.get()));
 
 Review comment:
   ```suggestion
   std::unique_ptr writer(new ColumnWriter(opts, 
std::move(field), is_nullable, _output_file.get()));
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] imay commented on a change in pull request #1944: Segment v2 use string's real length(#1943)

2019-10-10 Thread GitBox
imay commented on a change in pull request #1944: Segment v2 use string's real 
length(#1943)
URL: https://github.com/apache/incubator-doris/pull/1944#discussion_r333596710
 
 

 ##
 File path: be/src/olap/rowset/segment_v2/column_writer.cpp
 ##
 @@ -72,13 +72,13 @@ class NullBitmapBuilder {
 };
 
 ColumnWriter::ColumnWriter(const ColumnWriterOptions& opts,
-   const TypeInfo* typeinfo,
+   Field* field,
 
 Review comment:
   ```suggestion
  std::unique_ptr&& field,
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] imay commented on a change in pull request #1944: Segment v2 use string's real length(#1943)

2019-10-10 Thread GitBox
imay commented on a change in pull request #1944: Segment v2 use string's real 
length(#1943)
URL: https://github.com/apache/incubator-doris/pull/1944#discussion_r333593330
 
 

 ##
 File path: be/src/olap/field.h
 ##
 @@ -55,9 +55,11 @@ class Field {
 inline size_t field_size() const { return size() + 1; }
 inline size_t index_size() const { return _index_size; }
 
-inline void set_to_max(char* buf) const { return 
_type_info->set_to_max(buf); }
+virtual inline void set_to_max(char* buf) const { return 
_type_info->set_to_max(buf); }
+//todo delete this method when segment v2 release
+inline void set_to_max_v1(char* buf) const { return 
_type_info->set_to_max(buf); }
 
 Review comment:
   I think there is no need to keep this function, implement it the same with 
set_to_max is OK. Because the value set is greater than what is set in TypeInfo.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] imay commented on a change in pull request #1944: Segment v2 use string's real length(#1943)

2019-10-10 Thread GitBox
imay commented on a change in pull request #1944: Segment v2 use string's real 
length(#1943)
URL: https://github.com/apache/incubator-doris/pull/1944#discussion_r333580905
 
 

 ##
 File path: be/src/olap/field.h
 ##
 @@ -55,9 +55,11 @@ class Field {
 inline size_t field_size() const { return size() + 1; }
 inline size_t index_size() const { return _index_size; }
 
-inline void set_to_max(char* buf) const { return 
_type_info->set_to_max(buf); }
+virtual inline void set_to_max(char* buf) const { return 
_type_info->set_to_max(buf); }
+//todo delete this method when segment v2 release
 
 Review comment:
   ```suggestion
   // TODO(wangbo) delete this method when segment v2 release
   ```
   Should comment why this function stay here to let others know about this.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] imay commented on a change in pull request #1944: Segment v2 use string's real length(#1943)

2019-10-10 Thread GitBox
imay commented on a change in pull request #1944: Segment v2 use string's real 
length(#1943)
URL: https://github.com/apache/incubator-doris/pull/1944#discussion_r333594065
 
 

 ##
 File path: be/src/olap/rowset/segment_v2/segment_writer.cpp
 ##
 @@ -55,17 +55,17 @@ Status SegmentWriter::init(uint32_t write_mbytes_per_sec) {
 bool is_nullable = column.is_nullable();
 column_meta->set_is_nullable(is_nullable);
 
-// TODO(zc): we can add type_info into TabletColumn?
-const TypeInfo* type_info = get_type_info(column.type());
-DCHECK(type_info != nullptr);
-
 ColumnWriterOptions opts;
 opts.compression_type = segment_v2::CompressionTypePB::LZ4F;
 // now we create zone map for key columns
 if (column.is_key()) {
 opts.need_zone_map = true;
 }
-std::unique_ptr writer(new ColumnWriter(opts, type_info, 
is_nullable, _output_file.get()));
+
+Field* field = FieldFactory::create(column);
 
 Review comment:
   It's not a good way to write code like this, it made me feel that the field 
had been leaked. I think you can use a std::unique_ptr and std::move to finish 
it.
   ```suggestion
   std::unique_ptr field(FieldFactory::create(column));
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] imay commented on a change in pull request #1931: Support segment zone map

2019-10-10 Thread GitBox
imay commented on a change in pull request #1931: Support segment zone map
URL: https://github.com/apache/incubator-doris/pull/1931#discussion_r333497324
 
 

 ##
 File path: be/src/olap/rowset/segment_v2/column_zone_map.cpp
 ##
 @@ -73,15 +91,22 @@ Status ColumnZoneMapBuilder::flush() {
 RETURN_IF_ERROR(_page_builder->add((const uint8_t *), ));
 // reset the variables
 // we should allocate max varchar length and set to max for min value
-_reset_zone_map();
+_reset_page_zone_map();
 return Status::OK();
 }
 
-void ColumnZoneMapBuilder::_reset_zone_map() {
-_field->set_to_max(_zone_map.min_value);
-_field->set_to_min(_zone_map.max_value);
-_zone_map.has_null = false;
-_zone_map.has_not_null = false;
+void ColumnZoneMapBuilder::_reset_zone_map(ZoneMap& zone_map) {
+_field->set_to_max(zone_map.min_value);
+_field->set_to_min(zone_map.max_value);
+zone_map.has_null = false;
+zone_map.has_not_null = false;
+}
+
+void ColumnZoneMapBuilder::_fill_zone_map_to_pb(ZoneMap* const from, 
ZoneMapPB* const to) {
 
 Review comment:
   ```suggestion
   void ColumnZoneMapBuilder::_fill_zone_map_to_pb(const ZoneMap& from, 
ZoneMapPB* const to) {
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] imay commented on a change in pull request #1931: Support segment zone map

2019-10-10 Thread GitBox
imay commented on a change in pull request #1931: Support segment zone map
URL: https://github.com/apache/incubator-doris/pull/1931#discussion_r333496301
 
 

 ##
 File path: be/src/olap/rowset/segment_v2/column_zone_map.cpp
 ##
 @@ -73,15 +91,22 @@ Status ColumnZoneMapBuilder::flush() {
 RETURN_IF_ERROR(_page_builder->add((const uint8_t *), ));
 // reset the variables
 // we should allocate max varchar length and set to max for min value
-_reset_zone_map();
+_reset_page_zone_map();
 return Status::OK();
 }
 
-void ColumnZoneMapBuilder::_reset_zone_map() {
-_field->set_to_max(_zone_map.min_value);
-_field->set_to_min(_zone_map.max_value);
-_zone_map.has_null = false;
-_zone_map.has_not_null = false;
+void ColumnZoneMapBuilder::_reset_zone_map(ZoneMap& zone_map) {
 
 Review comment:
   prefer pointer if param will be changed in function


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] imay commented on a change in pull request #1931: Support segment zone map

2019-10-10 Thread GitBox
imay commented on a change in pull request #1931: Support segment zone map
URL: https://github.com/apache/incubator-doris/pull/1931#discussion_r333492306
 
 

 ##
 File path: be/src/olap/rowset/segment_v2/segment.cpp
 ##
 @@ -50,17 +52,69 @@ Status Segment::open() {
 RETURN_IF_ERROR(Env::Default()->new_random_access_file(_fname, 
&_input_file));
 // parse footer to get meta
 RETURN_IF_ERROR(_parse_footer());
-// parse short key index
-RETURN_IF_ERROR(_parse_index());
-// initial all column reader
-RETURN_IF_ERROR(_initial_column_readers());
+
 return Status::OK();
 }
 
-std::unique_ptr Segment::new_iterator(const Schema& schema, 
const StorageReadOptions& read_options) {
-auto it = std::unique_ptr(new 
SegmentIterator(this->shared_from_this(), schema));
-it->init(read_options);
-return it;
+Status Segment::new_iterator(
+const Schema& schema,
+const StorageReadOptions& read_options,
+std::unique_ptr* iter) {
+
+if (read_options.conditions != nullptr) {
+for(auto& column_condition : read_options.conditions->columns()) {
 
 Review comment:
   ```suggestion
   for (auto& column_condition : read_options.conditions->columns()) {
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[incubator-doris] branch master updated: Enhance the speed of avg function (#1889)

2019-10-10 Thread zhaoc
This is an automated email from the ASF dual-hosted git repository.

zhaoc pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git


The following commit(s) were added to refs/heads/master by this push:
 new e267d03  Enhance the speed of avg function (#1889)
e267d03 is described below

commit e267d031bb1402cf7fa32b8196553fd9b1266cb3
Author: EmmyMiao87 <522274...@qq.com>
AuthorDate: Thu Oct 10 22:43:46 2019 +0800

Enhance the speed of avg function (#1889)

This commit enable the avg operator in fe instead of converting the avg 
function into sum/count.
Also, this commit fix the bug of deciamlv2 avg which cause the core in be.
The int128 could not be assinged directly.

The speed of avg function is similar to sum function after enhancement.
---
 be/src/exprs/aggregate_functions.cpp   | 129 +++--
 be/src/exprs/aggregate_functions.h |   2 +
 .../apache/doris/analysis/FunctionCallExpr.java|   5 +-
 .../java/org/apache/doris/analysis/SelectStmt.java |  43 +--
 .../java/org/apache/doris/catalog/FunctionSet.java |   2 +-
 5 files changed, 75 insertions(+), 106 deletions(-)

diff --git a/be/src/exprs/aggregate_functions.cpp 
b/be/src/exprs/aggregate_functions.cpp
index 0a513f3..75de6da 100644
--- a/be/src/exprs/aggregate_functions.cpp
+++ b/be/src/exprs/aggregate_functions.cpp
@@ -177,61 +177,6 @@ void AggregateFunctions::count_remove(
 }
 }
 
-struct AvgState {
-double sum;
-int64_t count;
-};
-
-struct DecimalAvgState {
-DecimalVal sum;
-int64_t count;
-};
-
-struct DecimalV2AvgState {
-DecimalV2Val sum;
-int64_t count;
-};
-
-void AggregateFunctions::avg_init(FunctionContext* ctx, StringVal* dst) {
-dst->is_null = false;
-dst->len = sizeof(AvgState);
-dst->ptr = ctx->allocate(dst->len);
-memset(dst->ptr, 0, sizeof(AvgState));
-}
-
-void AggregateFunctions::decimal_avg_init(FunctionContext* ctx, StringVal* 
dst) {
-dst->is_null = false;
-dst->len = sizeof(DecimalAvgState);
-dst->ptr = ctx->allocate(dst->len);
-// memset(dst->ptr, 0, sizeof(DecimalAvgState));
-DecimalAvgState* avg = reinterpret_cast(dst->ptr);
-avg->count = 0;
-avg->sum.set_to_zero();
-}
-
-void AggregateFunctions::decimalv2_avg_init(FunctionContext* ctx, StringVal* 
dst) {
-dst->is_null = false;
-dst->len = sizeof(DecimalV2AvgState);
-dst->ptr = ctx->allocate(dst->len);
-// memset(dst->ptr, 0, sizeof(DecimalAvgState));
-DecimalV2AvgState* avg = reinterpret_cast(dst->ptr);
-avg->count = 0;
-avg->sum.set_to_zero();
-}
-
-
-template 
-void AggregateFunctions::avg_update(FunctionContext* ctx, const T& src, 
StringVal* dst) {
-if (src.is_null) {
-return;
-}
-DCHECK(dst->ptr != NULL);
-DCHECK_EQ(sizeof(AvgState), dst->len);
-AvgState* avg = reinterpret_cast(dst->ptr);
-avg->sum += src.val;
-++avg->count;
-}
-
 struct PercentileApproxState {
 public:
 PercentileApproxState() : digest(new TDigest()){
@@ -305,6 +250,59 @@ DoubleVal 
AggregateFunctions::percentile_approx_finalize(FunctionContext* ctx, c
 return DoubleVal(result);
 }
 
+struct AvgState {
+double sum = 0;
+int64_t count = 0;
+};
+
+struct DecimalAvgState {
+DecimalVal sum;
+int64_t count;
+};
+
+struct DecimalV2AvgState {
+DecimalV2Val sum;
+int64_t count = 0;
+};
+
+void AggregateFunctions::avg_init(FunctionContext* ctx, StringVal* dst) {
+dst->is_null = false;
+dst->len = sizeof(AvgState);
+dst->ptr = ctx->allocate(dst->len);
+new (dst->ptr) AvgState;
+}
+
+void AggregateFunctions::decimal_avg_init(FunctionContext* ctx, StringVal* 
dst) {
+dst->is_null = false;
+dst->len = sizeof(DecimalAvgState);
+dst->ptr = ctx->allocate(dst->len);
+// memset(dst->ptr, 0, sizeof(DecimalAvgState));
+DecimalAvgState* avg = reinterpret_cast(dst->ptr);
+avg->count = 0;
+avg->sum.set_to_zero();
+}
+
+void AggregateFunctions::decimalv2_avg_init(FunctionContext* ctx, StringVal* 
dst) {
+dst->is_null = false;
+dst->len = sizeof(DecimalV2AvgState);
+// The memroy for int128 need to be aligned by 16.
+// So the constructor has been used instead of allocating memory.
+// Also, it will be release in finalize.
+dst->ptr = (uint8_t*) new DecimalV2AvgState;
+}
+
+template 
+void AggregateFunctions::avg_update(FunctionContext* ctx, const T& src, 
StringVal* dst) {
+if (src.is_null) {
+return;
+}
+DCHECK(dst->ptr != NULL);
+DCHECK_EQ(sizeof(AvgState), dst->len);
+AvgState* avg = reinterpret_cast(dst->ptr);
+avg->sum += src.val;
+++avg->count;
+}
+
 void AggregateFunctions::decimal_avg_update(FunctionContext* ctx,
 const DecimalVal& src,
 StringVal* dst) {
@@ -341,6 +339,15 @@ void 
AggregateFunctions::decimalv2_avg_update(FunctionContext* ctx,
 ++avg->count;
 }
 
+StringVal 

[GitHub] [incubator-doris] imay merged pull request #1889: Enhance the speed of avg function

2019-10-10 Thread GitBox
imay merged pull request #1889: Enhance the speed of avg function
URL: https://github.com/apache/incubator-doris/pull/1889
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] shengyunyao opened a new pull request #1939: Added: Add tdigest compression param for pencentile_approx function

2019-10-10 Thread GitBox
shengyunyao opened a new pull request #1939: Added: Add tdigest compression 
param for pencentile_approx function
URL: https://github.com/apache/incubator-doris/pull/1939
 
 
   Add tdigest compression param for pencentile_approx function. The 
compression param should be in range (0, 1] and we can get a more precise 
result and cost more time when setting the param a bigger value.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] shengyunyao closed pull request #1939: Added: Add tdigest compression param for pencentile_approx function

2019-10-10 Thread GitBox
shengyunyao closed pull request #1939: Added: Add tdigest compression param for 
pencentile_approx function
URL: https://github.com/apache/incubator-doris/pull/1939
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] imay commented on a change in pull request #1948: Add object type

2019-10-10 Thread GitBox
imay commented on a change in pull request #1948: Add object type
URL: https://github.com/apache/incubator-doris/pull/1948#discussion_r333428334
 
 

 ##
 File path: gensrc/proto/segment_v2.proto
 ##
 @@ -53,6 +53,7 @@ enum EncodingTypePB {
 RLE = 4;
 DICT_ENCODING = 5;
 BIT_SHUFFLE = 6;
+BINARY_PLAIN_ENCODING = 7;
 
 Review comment:
   Better to add some comment about difference between PLAIN_ENCODING and 
BINARY_PLAIN_ENCODING.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] imay commented on a change in pull request #1948: Add object type

2019-10-10 Thread GitBox
imay commented on a change in pull request #1948: Add object type
URL: https://github.com/apache/incubator-doris/pull/1948#discussion_r333454949
 
 

 ##
 File path: fe/src/main/jflex/sql_scanner.flex
 ##
 @@ -218,6 +218,7 @@ import org.apache.doris.common.util.SqlUtils;
 keywordMap.put("not", new Integer(SqlParserSymbols.KW_NOT));
 keywordMap.put("null", new Integer(SqlParserSymbols.KW_NULL));
 keywordMap.put("nulls", new Integer(SqlParserSymbols.KW_NULLS));
+keywordMap.put("object", new Integer(SqlParserSymbols.KW_OBJECT));
 
 Review comment:
   Do we need to expose OBJECT to Doris USER? I think we should expose 
specified type to user, such as HLL, GEO_SHAPE, BITMAP?
   How about adding attribute for column to identify what specified type it is? 
Such as OBJECT("hll"), OBJECT("bitmap")?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] chaoyli merged pull request #1949: Remove redundant method in rowset meta manager

2019-10-10 Thread GitBox
chaoyli merged pull request #1949: Remove redundant method in rowset meta 
manager
URL: https://github.com/apache/incubator-doris/pull/1949
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[incubator-doris] branch master updated (b72a4a4 -> e4f3e8f)

2019-10-10 Thread lichaoyong
This is an automated email from the ASF dual-hosted git repository.

lichaoyong pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git.


from b72a4a4  Add tablet meta checkpoint mechanism (#1936)
 add e4f3e8f  Remove redundant method in rowset meta manager (#1949)

No new revisions were added by this update.

Summary of changes:
 be/src/olap/data_dir.cpp |  4 +---
 be/src/olap/rowset/rowset_meta_manager.cpp   | 14 +-
 be/src/olap/rowset/rowset_meta_manager.h |  4 +---
 be/src/olap/tablet.cpp   |  4 +++-
 be/src/olap/txn_manager.cpp  | 11 ++-
 be/test/olap/rowset/rowset_meta_manager_test.cpp |  4 +++-
 6 files changed, 19 insertions(+), 22 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] shengyunyao commented on a change in pull request #1939: Added: Add tdigest compression param for pencentile_approx function

2019-10-10 Thread GitBox
shengyunyao commented on a change in pull request #1939: Added: Add tdigest 
compression param for pencentile_approx function
URL: https://github.com/apache/incubator-doris/pull/1939#discussion_r333457738
 
 

 ##
 File path: fe/src/main/java/org/apache/doris/analysis/FunctionCallExpr.java
 ##
 @@ -421,13 +421,23 @@ private void analyzeBuiltinAggFunction(Analyzer 
analyzer) throws AnalysisExcepti
 }
 
 if (fnName.getFunction().equalsIgnoreCase("percentile_approx")) {
-if (children.size() != 2) {
-throw new AnalysisException("percentile_approx(expr, DOUBLE) 
requires two parameters");
+if (children.size() != 2 && children.size() != 3) {
+throw new AnalysisException("percentile_approx(expr, DOUBLE [, 
B]) requires two or three parameters");
 }
 if (!getChild(1).isConstant()) {
 throw new AnalysisException("percentile_approx requires second 
parameter must be a constant : "
 + this.toSql());
 }
+if (children.size() == 3) {
+if (!getChild(2).isConstant()) {
+throw new AnalysisException("percentile_approx requires 
the third parameter must be a constant : "
++ this.toSql());
+}
+double param_3 = Expr.getConstFromExpr(getChild(2));
 
 Review comment:
   Get it


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] imay commented on a change in pull request #1889: Enhance the speed of avg function

2019-10-10 Thread GitBox
imay commented on a change in pull request #1889: Enhance the speed of avg 
function
URL: https://github.com/apache/incubator-doris/pull/1889#discussion_r333415220
 
 

 ##
 File path: be/src/exprs/aggregate_functions.cpp
 ##
 @@ -305,6 +250,59 @@ DoubleVal 
AggregateFunctions::percentile_approx_finalize(FunctionContext* ctx, c
 return DoubleVal(result);
 }
 
+struct AvgState {
+double sum = 0;
+int64_t count = 0;
+};
+
+struct DecimalAvgState {
+DecimalVal sum;
+int64_t count;
+};
+
+struct DecimalV2AvgState {
+DecimalV2Val sum;
+int64_t count = 0;
+};
+
+void AggregateFunctions::avg_init(FunctionContext* ctx, StringVal* dst) {
+dst->is_null = false;
+dst->len = sizeof(AvgState);
+dst->ptr = ctx->allocate(dst->len);
+dst->ptr = (uint8_t*) new (dst->ptr) AvgState;
 
 Review comment:
   ```suggestion
   new (dst->ptr) AvgState;
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] imay commented on a change in pull request #1889: Enhance the speed of avg function

2019-10-10 Thread GitBox
imay commented on a change in pull request #1889: Enhance the speed of avg 
function
URL: https://github.com/apache/incubator-doris/pull/1889#discussion_r333416274
 
 

 ##
 File path: be/src/exprs/aggregate_functions.cpp
 ##
 @@ -341,6 +339,17 @@ void 
AggregateFunctions::decimalv2_avg_update(FunctionContext* ctx,
 ++avg->count;
 }
 
+StringVal AggregateFunctions::decimalv2_avg_serialize(
+FunctionContext* ctx, const StringVal& src) {
+if (src.is_null) {
 
 Review comment:
   Do we need this if?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] yiguolei opened a new pull request #1949: Remove redundant method in rowset meta manager

2019-10-10 Thread GitBox
yiguolei opened a new pull request #1949: Remove redundant method in rowset 
meta manager
URL: https://github.com/apache/incubator-doris/pull/1949
 
 
   There are two save rowset method in rowset meta manager using different 
parameters. One is rowset meta and the other is string. Most of the code is 
same, and it is very difficult to maintain in ECON mode, so that I remove them 
and using the RowsetMetaPB.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] imay commented on a change in pull request #1930: Replace std::regex with RE2

2019-10-10 Thread GitBox
imay commented on a change in pull request #1930: Replace std::regex with RE2
URL: https://github.com/apache/incubator-doris/pull/1930#discussion_r77465
 
 

 ##
 File path: be/test/olap/tablet_mgr_test.cpp
 ##
 @@ -209,6 +209,21 @@ TEST_F(TabletMgrTest, DropTablet) {
 ASSERT_TRUE(!dir_exist);
 }
 
+TEST_F(TabletMgrTest, GetRowsetId) {
+std::string path = _engine_data_path + 
"/data/0/15007/368169781/020100020003_0_0.dat";
+TTabletId tid;
+TSchemaHash schema_hash;
+ASSERT_TRUE(_tablet_mgr.get_tablet_id_and_schema_hash_from_path(path, 
, _hash));
+ASSERT_EQ(15007, tid);
+ASSERT_EQ(368169781, schema_hash);
 
 Review comment:
   OK


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] yiguolei commented on a change in pull request #1930: Replace std::regex with RE2

2019-10-10 Thread GitBox
yiguolei commented on a change in pull request #1930: Replace std::regex with 
RE2
URL: https://github.com/apache/incubator-doris/pull/1930#discussion_r76050
 
 

 ##
 File path: be/test/olap/tablet_mgr_test.cpp
 ##
 @@ -209,6 +209,21 @@ TEST_F(TabletMgrTest, DropTablet) {
 ASSERT_TRUE(!dir_exist);
 }
 
+TEST_F(TabletMgrTest, GetRowsetId) {
+std::string path = _engine_data_path + 
"/data/0/15007/368169781/020100020003_0_0.dat";
+TTabletId tid;
+TSchemaHash schema_hash;
+ASSERT_TRUE(_tablet_mgr.get_tablet_id_and_schema_hash_from_path(path, 
, _hash));
+ASSERT_EQ(15007, tid);
+ASSERT_EQ(368169781, schema_hash);
 
 Review comment:
   should add some bad case such as the path is not a rowset path.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] yiguolei commented on a change in pull request #1930: Replace std::regex with RE2

2019-10-10 Thread GitBox
yiguolei commented on a change in pull request #1930: Replace std::regex with 
RE2
URL: https://github.com/apache/incubator-doris/pull/1930#discussion_r75444
 
 

 ##
 File path: be/src/olap/tablet_manager.cpp
 ##
 @@ -682,42 +682,19 @@ TabletSharedPtr TabletManager::get_tablet(TTabletId 
tablet_id, SchemaHash schema
 return nullptr;
 } // get_tablet
 
-bool TabletManager::get_tablet_id_and_schema_hash_from_path(const std::string& 
path,
-TTabletId* tablet_id, TSchemaHash* schema_hash) {
-std::vector data_dirs = 
StorageEngine::instance()->get_stores();
-for (auto data_dir : data_dirs) {
-const std::string& data_dir_path = data_dir->path();
-if (path.find(data_dir_path) != std::string::npos) {
-std::string pattern = data_dir_path + "/data/\\d+/(\\d+)/?(\\d+)?";
-std::regex rgx (pattern.c_str());
-std::smatch sm;
-bool ret = std::regex_search(path, sm, rgx);
-if (ret) {
-if (sm.size() == 3) {
-*tablet_id = std::strtoll(sm.str(1).c_str(), nullptr, 10);
-*schema_hash = std::strtoll(sm.str(2).c_str(), nullptr, 
10);
-return true;
-} else {
-LOG(WARNING) << "invalid match. match size:" << sm.size();
-return false;
-}
-}
-}
-}
-return false;
+bool TabletManager::get_tablet_id_and_schema_hash_from_path(
+const std::string& path, TTabletId* tablet_id, TSchemaHash* 
schema_hash) {
+static re2::LazyRE2 re = {"/data/\\d+/(\\d+)/(\\d+)"};
+return RE2::PartialMatch(path, *re, tablet_id, schema_hash);
 
 Review comment:
   Should check if the path is DORIS's data path.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] imay closed issue #1925: Remove CR in source code file and add normalize when check in

2019-10-10 Thread GitBox
imay closed issue #1925: Remove CR in source code file and add normalize when 
check in
URL: https://github.com/apache/incubator-doris/issues/1925
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org