[GitHub] [incubator-doris] imay opened a new issue #1837: Reduce the number of rowset of a table who has HLL column

2019-09-19 Thread GitBox
imay opened a new issue #1837: Reduce the number of rowset of a table who has 
HLL column
URL: https://github.com/apache/incubator-doris/issues/1837
 
 
   When we load data into a table that contains HLL columns, Doris generates 
many small rowsets of about 100KB each, which leaves too many small 
files.
   
   Because a Memtable is 100MB in size and an HLL column value takes 16KB, a 
rowset can hold only a few thousand rows. During import, however, an HLL value 
usually holds only a few items and does not need 16KB. So we need to optimize 
the HLLContext to reduce the memory it uses, which lets each rowset hold more 
rows and thus reduces the number of rowsets.
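
The arithmetic behind this can be checked with a small sketch. The 100MB memtable size and the 16KB per-value footprint come from the issue text; the 64-byte compact size is a purely hypothetical figure used for comparison, and all other columns are ignored.

```cpp
#include <cstdint>

// Rough upper bound on the rows one memtable can hold when every row
// carries a single HLL value of `hll_bytes` (other columns are ignored).
constexpr int64_t rows_per_memtable(int64_t memtable_bytes, int64_t hll_bytes) {
    return memtable_bytes / hll_bytes;
}

constexpr int64_t kMemtableBytes = 100LL * 1024 * 1024;  // 100MB, from the issue
constexpr int64_t kFullHllBytes = 16LL * 1024;           // 16KB, from the issue
constexpr int64_t kCompactHllBytes = 64;                 // hypothetical compact size

// 100MB / 16KB = 6400 rows, i.e. "thousands of rows" per flushed rowset.
static_assert(rows_per_memtable(kMemtableBytes, kFullHllBytes) == 6400,
              "a full-size HLL value limits a memtable to 6400 rows");
```

Under the hypothetical 64-byte representation the same memtable would hold 256 times as many rows, which is the effect the issue is after.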


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[incubator-doris] branch master updated: Remove unused debug (#1836)

2019-09-19 Thread zhaoc
This is an automated email from the ASF dual-hosted git repository.

zhaoc pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git


The following commit(s) were added to refs/heads/master by this push:
 new abd27df  Remove unused debug (#1836)
abd27df is described below

commit abd27dfcca2a9063317126634df2c1a407af621b
Author: ZHAO Chun 
AuthorDate: Fri Sep 20 09:31:56 2019 +0800

Remove unused debug (#1836)
---
 be/src/runtime/bufferpool/reservation_tracker.cc | 4 
 1 file changed, 4 deletions(-)

diff --git a/be/src/runtime/bufferpool/reservation_tracker.cc 
b/be/src/runtime/bufferpool/reservation_tracker.cc
index 2adf1f1..7fcc2bd 100644
--- a/be/src/runtime/bufferpool/reservation_tracker.cc
+++ b/be/src/runtime/bufferpool/reservation_tracker.cc
@@ -287,14 +287,12 @@ bool 
ReservationTracker::TransferReservationTo(ReservationTracker* other, int64_
 bool success = tracker->TryConsumeFromMemTracker(bytes);
 DCHECK(success);
 if (tracker != other_path_to_common[0]) tracker->child_reservations_ += 
bytes;
-tracker->DebugString();
   }
   
   for (ReservationTracker* tracker : path_to_common) {
 if (tracker != path_to_common[0]) tracker->child_reservations_ -= bytes;
 tracker->UpdateReservation(-bytes);
 tracker->ReleaseToMemTracker(bytes);
-tracker->DebugString();
   }
 
   // Update the 'child_reservations_' on the common ancestor if needed.
@@ -302,14 +300,12 @@ bool 
ReservationTracker::TransferReservationTo(ReservationTracker* other, int64_
   if (common_ancestor == other) {
 lock_guard l(other->lock_);
 other->child_reservations_ -= bytes;
-other->DebugString();
 other->CheckConsistency();
   }
   // Case 2: reservation was pushed down below 'this'.
   if (common_ancestor == this) {
 lock_guard l(lock_);
 child_reservations_ += bytes;
-DebugString();
 CheckConsistency();
   }
   return true;





[GitHub] [incubator-doris] imay merged pull request #1836: Remove unused debug

2019-09-19 Thread GitBox
imay merged pull request #1836: Remove unused debug
URL: https://github.com/apache/incubator-doris/pull/1836
 
 
   





[GitHub] [incubator-doris] imay merged pull request #1831: Support setting timezone for stream load and routine load

2019-09-19 Thread GitBox
imay merged pull request #1831: Support setting timezone for stream load and 
routine load
URL: https://github.com/apache/incubator-doris/pull/1831
 
 
   





[incubator-doris] branch master updated: Support setting timezone for stream load and routine load (#1831)

2019-09-19 Thread zhaoc
This is an automated email from the ASF dual-hosted git repository.

zhaoc pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git


The following commit(s) were added to refs/heads/master by this push:
 new e8da855  Support setting timezone for stream load and routine load 
(#1831)
e8da855 is described below

commit e8da855cd2e680a4e2e4f84f7f7dfd035b7aab05
Author: Mingyu Chen 
AuthorDate: Fri Sep 20 07:55:05 2019 +0800

Support setting timezone for stream load and routine load (#1831)
---
 be/src/http/action/stream_load.cpp |   3 +
 be/src/http/http_common.h  |   1 +
 .../Data Manipulation/ROUTINE LOAD.md  |  10 +-
 .../Data Manipulation/STREAM LOAD.md   |   6 +-
 .../Data Manipulation/ROUTINE LOAD_EN.md   | 402 -
 .../Data Manipulation/STREAM LOAD_EN.md| 194 ++
 .../doris/analysis/CreateRoutineLoadStmt.java  |  18 +-
 .../doris/load/routineload/RoutineLoadJob.java |   9 +
 .../apache/doris/planner/StreamLoadPlanner.java|   2 +
 .../java/org/apache/doris/task/StreamLoadTask.java |  11 +
 gensrc/thrift/FrontendService.thrift   |   1 +
 11 files changed, 426 insertions(+), 231 deletions(-)

diff --git a/be/src/http/action/stream_load.cpp 
b/be/src/http/action/stream_load.cpp
index 38cec25..270fc7d 100644
--- a/be/src/http/action/stream_load.cpp
+++ b/be/src/http/action/stream_load.cpp
@@ -344,6 +344,9 @@ Status StreamLoadAction::_process_put(HttpRequest* 
http_req, StreamLoadContext*
 return Status::InvalidArgument("Invalid strict mode format. Must 
be bool type");
 }
 }
+if (!http_req->header(HTTP_TIMEZONE).empty()) {
+request.__set_timezone(http_req->header(HTTP_TIMEZONE));
+}
 
 // plan this load
 TNetworkAddress master_addr = _exec_env->master_info()->network_address;
diff --git a/be/src/http/http_common.h b/be/src/http/http_common.h
index e7e0b8d..1875e1c 100644
--- a/be/src/http/http_common.h
+++ b/be/src/http/http_common.h
@@ -33,6 +33,7 @@ static const std::string HTTP_TIMEOUT = "timeout";
 static const std::string HTTP_PARTITIONS = "partitions";
 static const std::string HTTP_NEGATIVE = "negative";
 static const std::string HTTP_STRICT_MODE = "strict_mode";
+static const std::string HTTP_TIMEZONE = "timezone";
 
 static const std::string HTTP_100_CONTINUE = "100-continue";
 
diff --git a/docs/documentation/cn/sql-reference/sql-statements/Data 
Manipulation/ROUTINE LOAD.md 
b/docs/documentation/cn/sql-reference/sql-statements/Data Manipulation/ROUTINE 
LOAD.md
index 20315ad..9b847e4 100644
--- a/docs/documentation/cn/sql-reference/sql-statements/Data 
Manipulation/ROUTINE LOAD.md  
+++ b/docs/documentation/cn/sql-reference/sql-statements/Data 
Manipulation/ROUTINE LOAD.md  
@@ -116,6 +116,10 @@
 
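 Whether to enable strict mode. Enabled by default. When enabled, a row is filtered out if the column type conversion of non-null source data yields NULL. Specify it as 
"strict_mode" = "true"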
 
+5. timezone
+
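+Specifies the time zone used by the load job. Defaults to the Session's timezone parameter. This parameter affects the results of all time-zone-related functions involved in the load.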
+
 5. data_source
 
 The type of the data source. Currently supported:
@@ -161,6 +165,7 @@
 
 "kafka_partitions" = "0,1,2,3",
 "kafka_offsets" = "101,0,OFFSET_BEGINNING,OFFSET_END" 
+
 4. property
 
 Specifies custom kafka parameters.
@@ -232,7 +237,7 @@
 "kafka_offsets" = "101,0,0,200"
 );
 
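-2. Load data from a Kafka cluster using SSL authentication, and also set the client.id parameter. The load task runs in non-strict mode
+2. Load data from a Kafka cluster using SSL authentication, and also set the client.id parameter. The load task runs in non-strict mode, with time zone 
Africa/Abidjan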
 
 CREATE ROUTINE LOAD example_db.test1 ON example_tbl
 COLUMNS(k1, k2, k3, v1, v2, v3 = k1 * 100),
@@ -243,7 +248,8 @@
 "max_batch_interval" = "20",
 "max_batch_rows" = "30",
 "max_batch_size" = "209715200",
-"strict_mode" = "false"
+"strict_mode" = "false",
+"timezone" = "Africa/Abidjan"
 )
 FROM KAFKA
 (
diff --git a/docs/documentation/cn/sql-reference/sql-statements/Data 
Manipulation/STREAM LOAD.md 
b/docs/documentation/cn/sql-reference/sql-statements/Data Manipulation/STREAM 
LOAD.md
index 55f569b..f1d02d6 100644
--- a/docs/documentation/cn/sql-reference/sql-statements/Data 
Manipulation/STREAM LOAD.md   
+++ b/docs/documentation/cn/sql-reference/sql-statements/Data 
Manipulation/STREAM LOAD.md   
@@ -44,6 +44,8 @@
 
 strict_mode: Specifies whether strict mode is enabled for this load. Enabled by default. To disable it, use -H "strict_mode: false".
 
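+timezone: Specifies the time zone used for this load. Defaults to UTC+8. This parameter affects the results of all time-zone-related functions involved in the load.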
+
 RETURN VALUES
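 After the load completes, information about it is returned in Json format. It currently includes the following fields
 Status: The final status of the load.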
@@ -91,8 +93,8 @@
 7. Load into a table containing HLL columns; the HLL column can be generated from a column in the table or a column in the data
 curl --location-trusted -u root -H "columns: k1, k2, v1=hll_hash(k1)" 
-T testData http://host:port/api/testDb/testTbl/_stream_load
 
-8. Load data with strict mode filtering
-curl --location-trusted -u root -H "strict_mode: true" 

[GitHub] [incubator-doris] imay merged pull request #1832: Fix bug that routine load may mistakenly skipped some data

2019-09-19 Thread GitBox
imay merged pull request #1832: Fix bug that routine load may mistakenly 
skipped some data
URL: https://github.com/apache/incubator-doris/pull/1832
 
 
   





[incubator-doris] branch master updated: Fix bug that routine load may mistakenly skipped some data (#1832)

2019-09-19 Thread zhaoc
This is an automated email from the ASF dual-hosted git repository.

zhaoc pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git


The following commit(s) were added to refs/heads/master by this push:
 new 7bf02d0  Fix bug that routine load may mistakenly skipped some data 
(#1832)
7bf02d0 is described below

commit 7bf02d0ae7ce8931511ae3aef758b0ae0d682de4
Author: Mingyu Chen 
AuthorDate: Fri Sep 20 07:54:11 2019 +0800

Fix bug that routine load may mistakenly skipped some data (#1832)

Reproduce:
1. Start a routine load and send a routine load task to BE.
2. BE executes the task successfully and commits to FE.
3. The commit request fails on FE because the database was renamed (FE throws 
a db-not-found exception).
4. After the commit fails, BE sends a rollback request to FE.
5. FE receives this rollback request and mistakenly updates the routine load 
progress,
   because the number of loaded rows in the rollback request's attachment 
is larger than 0.
---
 .../load/routineload/KafkaRoutineLoadJob.java  | 30 ++
 .../doris/load/routineload/RoutineLoadJob.java |  5 ++--
 2 files changed, 23 insertions(+), 12 deletions(-)

diff --git 
a/fe/src/main/java/org/apache/doris/load/routineload/KafkaRoutineLoadJob.java 
b/fe/src/main/java/org/apache/doris/load/routineload/KafkaRoutineLoadJob.java
index b60a21c..d89f93f 100644
--- 
a/fe/src/main/java/org/apache/doris/load/routineload/KafkaRoutineLoadJob.java
+++ 
b/fe/src/main/java/org/apache/doris/load/routineload/KafkaRoutineLoadJob.java
@@ -37,6 +37,7 @@ import org.apache.doris.common.util.LogKey;
 import org.apache.doris.common.util.SmallFileMgr;
 import org.apache.doris.common.util.SmallFileMgr.SmallFile;
 import org.apache.doris.system.SystemInfoService;
+import org.apache.doris.transaction.TransactionStatus;
 
 import com.google.common.base.Joiner;
 import com.google.common.collect.Lists;
@@ -56,8 +57,6 @@ import java.util.List;
 import java.util.Map;
 import java.util.UUID;
 
-import static 
org.apache.doris.analysis.CreateRoutineLoadStmt.KAFKA_DEFAULT_OFFSETS;
-
 /**
  * KafkaRoutineLoadJob is a kind of RoutineLoadJob which fetch data from kafka.
  * The progress which is super class property is seems like "{"partition1": 
offset1, "partition2": offset2}"
@@ -132,8 +131,8 @@ public class KafkaRoutineLoadJob extends RoutineLoadJob {
 convertedCustomProperties.put(entry.getKey(), 
entry.getValue());
 }
 }
-if (convertedCustomProperties.containsKey(KAFKA_DEFAULT_OFFSETS)) {
-kafkaDefaultOffSet = 
convertedCustomProperties.remove(KAFKA_DEFAULT_OFFSETS);
+if 
(convertedCustomProperties.containsKey(CreateRoutineLoadStmt.KAFKA_DEFAULT_OFFSETS))
 {
+kafkaDefaultOffSet = 
convertedCustomProperties.remove(CreateRoutineLoadStmt.KAFKA_DEFAULT_OFFSETS);
 }
 }
 
@@ -189,15 +188,26 @@ public class KafkaRoutineLoadJob extends RoutineLoadJob {
 return currentTaskConcurrentNum;
 }
 
-// partitionIdToOffset must be not empty when loaded rows > 0
-// situation1: be commit txn but fe throw error when committing txn,
-// fe rollback txn without partitionIdToOffset by itself
-// this task should not be commit
-// otherwise currentErrorNum and currentTotalNum is updated 
when progress is not updated
+// case1: BE execute the task successfully and commit it to FE, but failed 
on FE(such as db renamed, not found),
+//after commit failed, BE try to rollback this txn, and loaded 
rows in its attachment is larger than 0.
+//In this case, FE should not update the progress.
+//
+// case2: partitionIdToOffset must be not empty when loaded rows > 0
+//be commit txn but fe throw error when committing txn,
+//fe rollback txn without partitionIdToOffset by itself
+//this task should not be commit
+//otherwise currentErrorNum and currentTotalNum is updated when 
progress is not updated
 @Override
-protected boolean checkCommitInfo(RLTaskTxnCommitAttachment 
rlTaskTxnCommitAttachment) {
+protected boolean checkCommitInfo(RLTaskTxnCommitAttachment 
rlTaskTxnCommitAttachment,
+TransactionStatus txnStatus) {
+if (rlTaskTxnCommitAttachment.getLoadedRows() > 0 && txnStatus == 
TransactionStatus.ABORTED) {
+// case 1
+return false;
+}
+
 if (rlTaskTxnCommitAttachment.getLoadedRows() > 0
 && (!((KafkaProgress) 
rlTaskTxnCommitAttachment.getProgress()).hasPartition())) {
+// case 2
 LOG.warn(new LogBuilder(LogKey.ROUTINE_LOAD_TASK, 
DebugUtil.printId(rlTaskTxnCommitAttachment.getTaskId()))
  .add("job_id", id)
  .add("loaded_rows", 
rlTaskTxnCommitAttachment.getLoadedRows())
diff --git 

[GitHub] [incubator-doris] imay merged pull request #1833: Remove config::max_file_descriptor_number

2019-09-19 Thread GitBox
imay merged pull request #1833: Remove config::max_file_descriptor_number
URL: https://github.com/apache/incubator-doris/pull/1833
 
 
   





[incubator-doris] branch master updated: Remove config::max_file_descriptor_number (#1833)

2019-09-19 Thread zhaoc
This is an automated email from the ASF dual-hosted git repository.

zhaoc pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git


The following commit(s) were added to refs/heads/master by this push:
 new 720808f  Remove config::max_file_descriptor_number (#1833)
720808f is described below

commit 720808fda5990f4805e10f86fbfb22ee44c30781
Author: lichaoyong 
AuthorDate: Fri Sep 20 07:50:57 2019 +0800

Remove config::max_file_descriptor_number (#1833)
---
 be/src/common/config.h   | 5 ++---
 be/src/olap/storage_engine.cpp   | 9 -
 be/test/olap/delete_handler_test.cpp | 4 +---
 3 files changed, 7 insertions(+), 11 deletions(-)

diff --git a/be/src/common/config.h b/be/src/common/config.h
index daa36d4..b386d21 100644
--- a/be/src/common/config.h
+++ b/be/src/common/config.h
@@ -221,10 +221,9 @@ namespace config {
 CONF_Bool(row_nums_check, "true")
 //file descriptors cache, by default, cache 32768 descriptors
 CONF_Int32(file_descriptor_cache_capacity, "32768");
-// minimum/maximum file descriptor number
+// minimum file descriptor number
 // modify them upon necessity
-CONF_Int32(min_file_descriptor_number, "65536");
-CONF_Int32(max_file_descriptor_number, "131072");
+CONF_Int32(min_file_descriptor_number, "6");
 CONF_Int64(index_stream_cache_capacity, "10737418240");
 CONF_Int64(max_packed_row_block_size, "20971520");
 
diff --git a/be/src/olap/storage_engine.cpp b/be/src/olap/storage_engine.cpp
index 35e2f79..11ed82d 100644
--- a/be/src/olap/storage_engine.cpp
+++ b/be/src/olap/storage_engine.cpp
@@ -187,9 +187,9 @@ OLAPStatus StorageEngine::open() {
 
 res = _check_file_descriptor_number();
 if (res != OLAP_SUCCESS) {
-LOG(WARNING) << "file descriptor number is not between "
- << "min_file_descriptor_number:" << 
config::min_file_descriptor_number
- << " and max_file_descriptor_number:" << 
config::max_file_descriptor_number;
+LOG(ERROR) << "File descriptor number is less than " << 
config::min_file_descriptor_number
+   << ". Please use (ulimit -n) to set a value equal or 
greater than "
+   << config::min_file_descriptor_number;
 return OLAP_ERR_INIT_FAILED;
 }
 
@@ -370,8 +370,7 @@ OLAPStatus StorageEngine::_check_file_descriptor_number() {
  << ", use default configuration instead.";
 return OLAP_SUCCESS;
 }
-if (l.rlim_cur < config::min_file_descriptor_number
-|| l.rlim_cur > config::max_file_descriptor_number) {
+if (l.rlim_cur < config::min_file_descriptor_number) {
 return OLAP_ERR_TOO_FEW_FILE_DESCRITPROR;
 }
 return OLAP_SUCCESS;
diff --git a/be/test/olap/delete_handler_test.cpp 
b/be/test/olap/delete_handler_test.cpp
index fa062c0..4daa740 100644
--- a/be/test/olap/delete_handler_test.cpp
+++ b/be/test/olap/delete_handler_test.cpp
@@ -53,9 +53,7 @@ void set_up() {
 create_dir(config::storage_root_path);
 std::vector paths;
 paths.emplace_back(config::storage_root_path, -1);
-
-config::min_file_descriptor_number = 65536;
-config::max_file_descriptor_number = 131072;
+config::min_file_descriptor_number = 1000;
 
 doris::EngineOptions options;
 options.store_paths = paths;
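
For readers who want to try the lower-bound-only check outside the BE, here is a minimal POSIX sketch. The function names are hypothetical and the code is not the actual `StorageEngine::_check_file_descriptor_number()`. Accepting the process when `getrlimit` fails is an assumption based on the "use default configuration instead" context line in the diff.

```cpp
#include <sys/resource.h>

#include <cstdint>

// Pure comparison, kept separate so it can be tested without touching
// the process limits. After this change only a lower bound is enforced.
bool fd_limit_is_sufficient(uint64_t soft_limit, uint64_t min_fds) {
    return soft_limit >= min_fds;
}

// Hypothetical stand-in for the startup check: returns true when the
// soft RLIMIT_NOFILE limit is at least `min_fds`.
bool check_file_descriptor_number(uint64_t min_fds) {
    struct rlimit l;
    if (getrlimit(RLIMIT_NOFILE, &l) != 0) {
        // Assumption: a failed query is tolerated rather than treated as fatal.
        return true;
    }
    return fd_limit_is_sufficient(static_cast<uint64_t>(l.rlim_cur), min_fds);
}
```

If the check fails, raising the soft limit with `ulimit -n` before starting the process is the remedy the new log message points to.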





[GitHub] [incubator-doris] imay opened a new pull request #1836: Remove unused debug

2019-09-19 Thread GitBox
imay opened a new pull request #1836: Remove unused debug
URL: https://github.com/apache/incubator-doris/pull/1836
 
 
   





[GitHub] [incubator-doris] kangpinghuang opened a new pull request #1835: add default value column iterator #1834

2019-09-19 Thread GitBox
kangpinghuang opened a new pull request #1835: add default value column 
iterator #1834
URL: https://github.com/apache/incubator-doris/pull/1835
 
 
   





[GitHub] [incubator-doris] kangpinghuang opened a new issue #1834: add default value column iterator for segment v2

2019-09-19 Thread GitBox
kangpinghuang opened a new issue #1834: add default value column iterator for 
segment v2
URL: https://github.com/apache/incubator-doris/issues/1834
 
 
   Doris supports online schema change, so a tablet's schema may change and 
differ from the schema of its segment files. Doris implements linked schema 
change, which supports adding a simple column with a default value without 
rewriting segment files. SegmentV2's reader should therefore support a default 
value column reader.
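
One way to picture such a reader is an iterator that never touches the segment file and simply repeats the default value. The sketch below is illustrative only: the `ColumnIterator` interface, the string-typed values, and every name in it are assumptions, not the SegmentV2 API.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal iterator interface; the real SegmentV2 one differs.
class ColumnIterator {
public:
    virtual ~ColumnIterator() = default;
    virtual void seek_to_ordinal(size_t ordinal) = 0;
    // Appends up to `n` values to `out` and returns how many were produced.
    virtual size_t next_batch(size_t n, std::vector<std::string>* out) = 0;
};

// Serves a column that is missing from the segment file by repeating the
// column's default value for every row of the segment.
class DefaultValueColumnIterator : public ColumnIterator {
public:
    DefaultValueColumnIterator(std::string default_value, size_t num_rows)
            : _default_value(std::move(default_value)), _num_rows(num_rows) {}

    void seek_to_ordinal(size_t ordinal) override {
        // Clamp so that a seek past the end yields an empty batch.
        _current = ordinal < _num_rows ? ordinal : _num_rows;
    }

    size_t next_batch(size_t n, std::vector<std::string>* out) override {
        size_t remaining = _num_rows - _current;
        size_t produced = n < remaining ? n : remaining;
        out->insert(out->end(), produced, _default_value);
        _current += produced;
        return produced;
    }

private:
    std::string _default_value;
    size_t _num_rows;
    size_t _current = 0;
};
```

A segment reader could hand out this iterator whenever the tablet schema names a column that the segment footer does not contain, leaving the file itself untouched.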





[GitHub] [incubator-doris] chaoyli commented on a change in pull request #1833: Remove config::max_file_descriptor_number

2019-09-19 Thread GitBox
chaoyli commented on a change in pull request #1833: Remove 
config::max_file_descriptor_number
URL: https://github.com/apache/incubator-doris/pull/1833#discussion_r326152928
 
 

 ##
 File path: be/test/olap/delete_handler_test.cpp
 ##
 @@ -53,9 +53,7 @@ void set_up() {
 create_dir(config::storage_root_path);
 std::vector paths;
 paths.emplace_back(config::storage_root_path, -1);
-
 config::min_file_descriptor_number = 65536;
 
 Review comment:
   OK





[GitHub] [incubator-doris] imay commented on a change in pull request #1833: Remove config::max_file_descriptor_number

2019-09-19 Thread GitBox
imay commented on a change in pull request #1833: Remove 
config::max_file_descriptor_number
URL: https://github.com/apache/incubator-doris/pull/1833#discussion_r326150259
 
 

 ##
 File path: be/test/olap/delete_handler_test.cpp
 ##
 @@ -53,9 +53,7 @@ void set_up() {
 create_dir(config::storage_root_path);
 std::vector paths;
 paths.emplace_back(config::storage_root_path, -1);
-
 config::min_file_descriptor_number = 65536;
 
 Review comment:
   This is a UT; change it to a smaller value.





[GitHub] [incubator-doris] chaoyli opened a new pull request #1833: Remove config::max_file_descriptor_number

2019-09-19 Thread GitBox
chaoyli opened a new pull request #1833: Remove 
config::max_file_descriptor_number
URL: https://github.com/apache/incubator-doris/pull/1833
 
 
   





[GitHub] [incubator-doris] morningman opened a new pull request #1832: Fix bug that routine load may mistakenly skipped some data

2019-09-19 Thread GitBox
morningman opened a new pull request #1832: Fix bug that routine load may 
mistakenly skipped some data
URL: https://github.com/apache/incubator-doris/pull/1832
 
 
   Reproduce:
   1. Start a routine load and send a routine load task to BE.
   2. BE executes the task successfully and commits to FE.
   3. The commit request fails on FE because the database was renamed (FE 
throws a db-not-found exception).
   4. After the commit fails, BE sends a rollback request to FE.
   5. FE receives this rollback request and mistakenly updates the routine load 
progress,
  because the number of loaded rows in the rollback request's attachment 
is larger than 0.
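
The condition that separates this rollback from a normal commit can be sketched as a small predicate. The actual fix lives in the Java FE code; the C++ below only illustrates the rule, and the enum and function name are hypothetical.

```cpp
#include <cstdint>

// Simplified stand-in for the FE transaction status; only the two
// states relevant to this scenario are modelled.
enum class TransactionStatus { COMMITTED, ABORTED };

// True when the request is the rollback of a task whose commit failed:
// rows were loaded on BE, but the transaction ended up aborted. In that
// case FE should not advance the routine load progress.
bool is_failed_commit_rollback(int64_t loaded_rows, TransactionStatus status) {
    return loaded_rows > 0 && status == TransactionStatus::ABORTED;
}
```

Checking the transaction status as well as the row count is what keeps step 5 from skipping data: the loaded-row count alone cannot tell a successful commit from a rollback.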





[incubator-doris] branch master updated (aaabf97 -> 315f762)

2019-09-19 Thread lichaoyong
This is an automated email from the ASF dual-hosted git repository.

lichaoyong pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git.


from aaabf97  Split channel close operation into two phase (#1830)
 add 315f762  Seek block when starts a ScanKey (#1828)

No new revisions were added by this update.

Summary of changes:
 be/src/common/config.h|  4 
 be/src/olap/rowset/segment_reader.cpp | 15 ---
 be/src/olap/rowset/segment_reader.h   |  4 ++--
 be/test/olap/delete_handler_test.cpp  |  3 +++
 4 files changed, 17 insertions(+), 9 deletions(-)





[GitHub] [incubator-doris] chaoyli merged pull request #1828: Seek block when starts a ScanKey

2019-09-19 Thread GitBox
chaoyli merged pull request #1828: Seek block when starts a ScanKey
URL: https://github.com/apache/incubator-doris/pull/1828
 
 
   





[GitHub] [incubator-doris] imay commented on a change in pull request #1798: Optimize the load performance for large file

2019-09-19 Thread GitBox
imay commented on a change in pull request #1798: Optimize the load performance 
for large file
URL: https://github.com/apache/incubator-doris/pull/1798#discussion_r326104070
 
 

 ##
 File path: be/src/runtime/memtable_flush_executor.cpp
 ##
 @@ -0,0 +1,128 @@
+// Licensed to the Apache Software Foundation (ASF) under one
 
 Review comment:
   Do we have two executors?





[GitHub] [incubator-doris] imay commented on a change in pull request #1798: Optimize the load performance for large file

2019-09-19 Thread GitBox
imay commented on a change in pull request #1798: Optimize the load performance 
for large file
URL: https://github.com/apache/incubator-doris/pull/1798#discussion_r326102477
 
 

 ##
 File path: be/src/runtime/memtable_flush_executor.h
 ##
 @@ -0,0 +1,85 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "olap/olap_define.h"
+#include "util/blocking_queue.hpp"
+#include "util/spinlock.h"
+#include "util/thread_pool.hpp"
+
+namespace doris {
+
+class ExecEnv;
+class DeltaWriter;
+class MemTable;
+
+// The context for a memtable to be flushed.
+// It does not own any objects in it.
+struct MemTableFlushContext {
+std::shared_ptr memtable;
+DeltaWriter* delta_writer;
+std::atomic* flush_status;
+};
+
+// MemTableFlushExecutor is for flushing memtables to disk.
+// Each data directory has a specified number of worker threads and a 
corresponding number of flush queues.
+// Each worker thread only takes memtable from the corresponding flush queue 
and writes it to disk.
+class MemTableFlushExecutor {
+public:
+MemTableFlushExecutor(ExecEnv* exec_env);
+// init should be called after storage engine is opened,
+// because it needs path hash of each data dir.
+void init();
+
+~MemTableFlushExecutor();
+
+// given the path hash, return the next idx of flush queue.
+// eg.
+// path A is mapped to idx 0 and 1, so each time get_queue_idx(A) is 
called,
+// 0 and 1 will returned alternately.
+int32_t get_queue_idx(size_t path_hash);
+
+// push the memtable to specified flush queue, and return a future
+std::future push_memtable(int32_t queue_idx, const 
MemTableFlushContext& ctx);
 
 Review comment:
   Better to assign the queue inside this class, not outside it.
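
A minimal sketch of what assigning the queue inside the class could look like: the executor keeps the path-to-queue mapping and hands out indexes round-robin, so callers never pick an index themselves. `FlushQueueSelector` and its methods are hypothetical names, not part of the PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper that owns the mapping from a data dir's path hash
// to its flush queues and rotates through them on every request.
class FlushQueueSelector {
public:
    void register_path(size_t path_hash, std::vector<int32_t> queue_idxs) {
        std::lock_guard<std::mutex> l(_lock);
        _paths[path_hash] = PathQueues{std::move(queue_idxs), 0};
    }

    // Returns the next queue index for the path, or -1 if the path is
    // unknown or has no queues.
    int32_t next_queue_idx(size_t path_hash) {
        std::lock_guard<std::mutex> l(_lock);
        auto it = _paths.find(path_hash);
        if (it == _paths.end() || it->second.queue_idxs.empty()) {
            return -1;
        }
        PathQueues& pq = it->second;
        int32_t idx = pq.queue_idxs[pq.next];
        pq.next = (pq.next + 1) % pq.queue_idxs.size();
        return idx;
    }

private:
    struct PathQueues {
        std::vector<int32_t> queue_idxs;
        size_t next;
    };

    std::mutex _lock;
    std::unordered_map<size_t, PathQueues> _paths;
};
```

With something like this held privately, `push_memtable` could take the path hash instead of a queue index, which keeps the round-robin policy an implementation detail of the executor.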





[GitHub] [incubator-doris] imay commented on a change in pull request #1798: Optimize the load performance for large file

2019-09-19 Thread GitBox
imay commented on a change in pull request #1798: Optimize the load performance 
for large file
URL: https://github.com/apache/incubator-doris/pull/1798#discussion_r326090612
 
 

 ##
 File path: be/src/runtime/tablets_channel.h
 ##
 @@ -0,0 +1,122 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include 
+#include 
+#include 
+#include 
+
+#include "runtime/descriptors.h"
+#include "runtime/mem_tracker.h"
+#include "util/bitmap.h"
+#include "util/thread_pool.hpp"
+
+#include "gen_cpp/Types_types.h"
+#include "gen_cpp/PaloInternalService_types.h"
+#include "gen_cpp/internal_service.pb.h"
+
+namespace doris {
+
+struct TabletsChannelKey {
+UniqueId id;
+int64_t index_id;
+
+TabletsChannelKey(const PUniqueId& pid, int64_t index_id_)
+: id(pid), index_id(index_id_) { }
+
+~TabletsChannelKey() noexcept { }
+
+bool operator==(const TabletsChannelKey& rhs) const noexcept {
+return index_id == rhs.index_id && id == rhs.id;
+}
+
+std::string to_string() const;
+};
+
+struct TabletsChannelKeyHasher {
+std::size_t operator()(const TabletsChannelKey& key) const {
+size_t seed = key.id.hash();
+return doris::HashUtil::hash(&key.index_id, sizeof(key.index_id), seed);
+}
+};
+
+class DeltaWriter;
+class MemTable;
+class MemTableFlushExecutor;
+class OlapTableSchemaParam;
+
+// channel that process all data for this load
+class TabletsChannel {
+public:
+TabletsChannel(const TabletsChannelKey& key, MemTableFlushExecutor* flush_executor);
+
+~TabletsChannel();
+
+Status open(const PTabletWriterOpenRequest& params);
+
+Status add_batch(const PTabletWriterAddBatchRequest& batch);
+
+Status close(int sender_id, bool* finished,
+const google::protobuf::RepeatedField& partition_ids,
+google::protobuf::RepeatedPtrField* tablet_vec);
+
+time_t last_updated_time() {
+return _last_updated_time;
+}
+
+private:
+// open all writer
+Status _open_all_writers(const PTabletWriterOpenRequest& params);
+
+private:
+// id of this load channel
+TabletsChannelKey _key;
+MemTableFlushExecutor* _flush_executor;
 
 Review comment:
   Better not to store this in TabletsChannel; it is internal to the storage engine.





[GitHub] [incubator-doris] imay commented on a change in pull request #1798: Optimize the load performance for large file

2019-09-19 Thread GitBox
imay commented on a change in pull request #1798: Optimize the load performance 
for large file
URL: https://github.com/apache/incubator-doris/pull/1798#discussion_r326082656
 
 

 ##
 File path: be/src/common/config.h
 ##
 @@ -465,6 +465,8 @@ namespace config {
 CONF_Int32(storage_flood_stage_usage_percent, "95");// 95%
 // The min bytes that should be left of a data dir
 CONF_Int64(storage_flood_stage_left_capacity_bytes, "1073741824")   // 1GB
+// number of thread for flushing memtable per data dir
+CONF_Int32(flush_thread_num_per_dir, "2");
 
 Review comment:
   ```suggestion
   CONF_Int32(flush_thread_num_per_store, "2");
   ```





[GitHub] [incubator-doris] imay commented on a change in pull request #1798: Optimize the load performance for large file

2019-09-19 Thread GitBox
imay commented on a change in pull request #1798: Optimize the load performance 
for large file
URL: https://github.com/apache/incubator-doris/pull/1798#discussion_r326082863
 
 

 ##
 File path: be/src/olap/delta_writer.cpp
 ##
 @@ -26,24 +26,32 @@
 
 namespace doris {
 
-OLAPStatus DeltaWriter::open(WriteRequest* req, DeltaWriter** writer) {
-*writer = new DeltaWriter(req);
+OLAPStatus DeltaWriter::open(
+WriteRequest* req,
+BlockingQueue>* flush_queue,
+DeltaWriter** writer) {
+*writer = new DeltaWriter(req, flush_queue);
 return OLAP_SUCCESS;
 }
 
-DeltaWriter::DeltaWriter(WriteRequest* req)
+DeltaWriter::DeltaWriter(
+WriteRequest* req,
+BlockingQueue>* flush_queue)
 : _req(*req), _tablet(nullptr),
   _cur_rowset(nullptr), _new_rowset(nullptr), _new_tablet(nullptr),
-  _rowset_writer(nullptr), _mem_table(nullptr),
-  _schema(nullptr), _tablet_schema(nullptr),
-  _delta_written_success(false) {}
+  _rowset_writer(nullptr), _schema(nullptr), _tablet_schema(nullptr),
+  _delta_written_success(false), _flush_status(OLAP_SUCCESS),
+  _flush_queue(flush_queue) {
+
+_mem_table.reset();
 
 Review comment:
   ??





[GitHub] [incubator-doris] imay commented on a change in pull request #1798: Optimize the load performance for large file

2019-09-19 Thread GitBox
imay commented on a change in pull request #1798: Optimize the load performance 
for large file
URL: https://github.com/apache/incubator-doris/pull/1798#discussion_r326084947
 
 

 ##
 File path: be/src/olap/delta_writer.h
 ##
 @@ -75,10 +102,36 @@ class DeltaWriter {
 RowsetSharedPtr _new_rowset;
 TabletSharedPtr _new_tablet;
 std::unique_ptr _rowset_writer;
-MemTable* _mem_table;
+std::shared_ptr _mem_table;
 Schema* _schema;
 const TabletSchema* _tablet_schema;
 bool _delta_written_success;
+
+#if 0
+// the flush status of previous memtable.
+// the default is OLAP_SUCCESS, and once it changes to some ERROR code,
+// it will never change back to OLAP_SUCCESS.
+// this status will be checked each time the next memtable is going to be flushed,
+// so that if the previous flush is already failed, no need to flush next memtable.
+std::atomic _flush_status;
+// the future of the very last memtable flush execution.
+// because the flush of this delta writer's memtables are executed serially,
+// if the last memtable is flushed, all previous memtables should already be flushed.
+// so we only need to wait and block on the last memtable's flush future.
+std::future _flush_future;
+#endif
 
 Review comment:
   remove this?
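
The comments in the disabled block describe two ideas: a flush status that stays failed once any flush fails, and waiting only on the future of the last flush because flushes run serially. A minimal sketch of both, assuming a hypothetical `ToyWriter` and `ToyStatus` (neither is part of the PR, and the real executor in the PR may schedule work differently):

```cpp
#include <atomic>
#include <cassert>
#include <future>

// Hypothetical status codes; the PR itself uses OLAPStatus.
enum ToyStatus { TOY_OK = 0, TOY_ERR = 1 };

class ToyWriter {
public:
    // Submits one flush job. Each job first waits for the previous one,
    // so jobs run serially in submission order.
    template <typename Job>
    void submit(Job job) {
        std::shared_future<ToyStatus> prev = _last;
        _last = std::async(std::launch::async, [this, prev, job]() -> ToyStatus {
            if (prev.valid()) prev.wait();
            // An earlier flush already failed: skip this one.
            if (_status.load() != TOY_OK) return _status.load();
            ToyStatus st = job();
            if (st != TOY_OK) {
                // Sticky error: only the first failure is recorded,
                // and the status is never set back to TOY_OK.
                ToyStatus expected = TOY_OK;
                _status.compare_exchange_strong(expected, st);
            }
            return st;
        }).share();
    }

    // Waiting on the last job is enough, because it waited on all earlier ones.
    ToyStatus wait_all() {
        if (_last.valid()) _last.wait();
        return _status.load();
    }

private:
    std::atomic<ToyStatus> _status{TOY_OK};
    std::shared_future<ToyStatus> _last;  // declared last so it is destroyed first
};
```

With three submitted jobs where the second fails, `wait_all()` reports the failure and the third job body never runs.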





[GitHub] [incubator-doris] morningman merged pull request #1830: Split channel close operation into two phase

2019-09-19 Thread GitBox
morningman merged pull request #1830: Split channel close operation into two 
phase
URL: https://github.com/apache/incubator-doris/pull/1830
 
 
   





[incubator-doris] branch master updated (17e52a4 -> aaabf97)

2019-09-19 Thread morningman
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git.


from 17e52a4  Improve LRUCache to get better performance (#1826)
 add aaabf97  Split channel close operation into two phase (#1830)

No new revisions were added by this update.

Summary of changes:
 be/src/runtime/data_stream_sender.cpp | 24 
 1 file changed, 20 insertions(+), 4 deletions(-)





[GitHub] [incubator-doris] chaoyli closed pull request #1828: Seek block when starts a ScanKey

2019-09-19 Thread GitBox
chaoyli closed pull request #1828: Seek block when starts a ScanKey
URL: https://github.com/apache/incubator-doris/pull/1828
 
 
   





[GitHub] [incubator-doris] chaoyli opened a new pull request #1828: Seek block when starts a ScanKey

2019-09-19 Thread GitBox
chaoyli opened a new pull request #1828: Seek block when starts a ScanKey
URL: https://github.com/apache/incubator-doris/pull/1828
 
 
   In Doris, one block holds 1024 rows. The problem appears when both of the
   following conditions hold:
   1. The previous ScanKey scanned rows spanning multiple blocks,
   and the final block held exactly 1024 rows.
   2. The current ScanKey scans fewer rows than one block.
   Under these two conditions, if the reader does not seek to the block, the
   position of the prefix short-key columns is wrong.
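
A toy model of the idea behind this fix, as a sketch only: a reader that skips a seek when the requested block id matches the current one can leave a stale cursor, so a flag forces a real seek at the start of each scan key. `ToyBlockReader`, `begin_scan_key` and `read_rows` are hypothetical names, and how the PR actually sets `_seek_block` is an assumption here.

```cpp
#include <cassert>
#include <cstdint>

// Toy model only; the field names mirror the diff but this is not SegmentReader.
class ToyBlockReader {
public:
    // A new scan key starts: the next seek must run even for the same block.
    void begin_scan_key() { _seek_block = true; }

    // Returns true when a real seek ran, false when it was skipped.
    bool seek_to_block(int64_t block_id) {
        assert(block_id >= 0);
        if (!_seek_block && block_id == _current_block_id) {
            return false;  // fast path: the cursor is assumed to be in place
        }
        _current_block_id = block_id;
        _row_in_block = 0;  // a real seek resets the in-block cursor
        _seek_block = false;
        return true;
    }

    // Consumes rows from the current block, moving the cursor forward.
    void read_rows(int64_t n) { _row_in_block += n; }

    int64_t row_in_block() const { return _row_in_block; }

private:
    int64_t _current_block_id = -1;
    int64_t _row_in_block = 0;
    bool _seek_block = true;
};
```

In this model, a second `seek_to_block(3)` after reading rows is skipped and leaves the cursor mid-block, while calling `begin_scan_key()` first makes the same call reset the cursor to the block start.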





[incubator-doris] branch master updated (e516eba -> 17e52a4)

2019-09-19 Thread lichaoyong
This is an automated email from the ASF dual-hosted git repository.

lichaoyong pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git.


from e516eba  Remove the "author" tag (#1829)
 add 17e52a4  Improve LRUCache to get better performance (#1826)

No new revisions were added by this update.

Summary of changes:
 be/src/olap/lru_cache.cpp | 215 ++
 be/src/olap/lru_cache.h   |  22 ++-
 be/src/runtime/memory/chunk_allocator.cpp |   4 +-
 be/src/runtime/memory/chunk_allocator.h   |   2 +-
 be/test/olap/lru_cache_test.cpp   |   6 -
 be/test/olap/page_cache_test.cpp  |   4 +-
 6 files changed, 144 insertions(+), 109 deletions(-)





[GitHub] [incubator-doris] chaoyli merged pull request #1826: Improve LRUCache to get better performance

2019-09-19 Thread GitBox
chaoyli merged pull request #1826: Improve LRUCache to get better performance
URL: https://github.com/apache/incubator-doris/pull/1826
 
 
   





[GitHub] [incubator-doris] imay merged pull request #1829: Remove the "author" tag

2019-09-19 Thread GitBox
imay merged pull request #1829: Remove the "author" tag
URL: https://github.com/apache/incubator-doris/pull/1829
 
 
   





[incubator-doris] branch master updated: Remove the "author" tag (#1829)

2019-09-19 Thread zhaoc
This is an automated email from the ASF dual-hosted git repository.

zhaoc pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git


The following commit(s) were added to refs/heads/master by this push:
 new e516eba  Remove the "author" tag (#1829)
e516eba is described below

commit e516eba940133c2f80b85363c0039104f8e35849
Author: xy720 <22125576+xy...@users.noreply.github.com>
AuthorDate: Thu Sep 19 16:59:08 2019 +0800

Remove the "author" tag (#1829)
---
 fe/src/main/java/org/apache/doris/PaloFe.java  | 3 ---
 fe/src/main/java/org/apache/doris/alter/RollupJobV2.java   | 5 -
 fe/src/main/java/org/apache/doris/alter/SchemaChangeJobV2.java | 5 -
 fe/src/main/java/org/apache/doris/analysis/AdminSetConfigStmt.java | 5 -
 .../main/java/org/apache/doris/analysis/AdminShowConfigStmt.java   | 5 -
 fe/src/main/java/org/apache/doris/analysis/CancelLoadStmt.java | 3 ---
 fe/src/main/java/org/apache/doris/analysis/DropFileStmt.java   | 5 -
 fe/src/main/java/org/apache/doris/analysis/ImportColumnDesc.java   | 3 ---
 fe/src/main/java/org/apache/doris/analysis/ImportColumnsStmt.java  | 3 ---
 fe/src/main/java/org/apache/doris/analysis/ImportWhereStmt.java| 3 ---
 fe/src/main/java/org/apache/doris/catalog/AggregateType.java   | 3 ---
 fe/src/main/java/org/apache/doris/catalog/ColocateGroupSchema.java | 5 -
 fe/src/main/java/org/apache/doris/catalog/FunctionSet.java | 3 ---
 fe/src/main/java/org/apache/doris/catalog/PartitionType.java   | 3 ---
 fe/src/main/java/org/apache/doris/cluster/Cluster.java | 2 --
 fe/src/main/java/org/apache/doris/cluster/ClusterNamespace.java| 1 -
 .../java/org/apache/doris/common/LabelAlreadyUsedException.java| 5 -
 fe/src/main/java/org/apache/doris/common/proc/BaseProcNode.java| 3 ---
 fe/src/main/java/org/apache/doris/common/proc/BaseProcResult.java  | 3 ---
 .../java/org/apache/doris/common/proc/ClusterLoadStatByMedium.java | 5 -
 .../doris/common/proc/ColocationGroupBackendSeqsProcNode.java  | 5 -
 .../java/org/apache/doris/common/proc/LoadErrorHubProcNode.java| 3 ---
 .../main/java/org/apache/doris/common/proc/ProcDirInterface.java   | 3 ---
 .../main/java/org/apache/doris/common/proc/ProcNodeInterface.java  | 3 ---
 fe/src/main/java/org/apache/doris/common/proc/TransDbProcDir.java  | 5 -
 .../java/org/apache/doris/common/proc/TransPartitionProcNode.java  | 5 -
 fe/src/main/java/org/apache/doris/common/proc/TransProcDir.java| 5 -
 .../main/java/org/apache/doris/common/proc/TransStateProcDir.java  | 5 -
 .../main/java/org/apache/doris/common/proc/TransTablesProcDir.java | 5 -
 fe/src/main/java/org/apache/doris/common/util/MysqlUtil.java   | 3 ---
 fe/src/main/java/org/apache/doris/common/util/SmallFileMgr.java| 5 -
 fe/src/main/java/org/apache/doris/external/EsStateStore.java   | 1 -
 fe/src/main/java/org/apache/doris/metric/GaugeMetricImpl.java  | 7 +--
 fe/src/main/java/org/apache/doris/metric/MetricCalculator.java | 7 +--
 .../main/java/org/apache/doris/metric/SimpleCoreMetricVisitor.java | 5 -
 fe/src/main/java/org/apache/doris/persist/BackendTabletsInfo.java  | 5 -
 .../main/java/org/apache/doris/persist/RoutineLoadOperation.java   | 5 -
 fe/src/main/java/org/apache/doris/qe/DdlExecutor.java  | 3 ---
 .../java/org/apache/doris/transaction/TxnStateCallbackFactory.java | 5 -
 fe/src/test/java/org/apache/doris/alter/RollupJobV2Test.java   | 5 -
 fe/src/test/java/org/apache/doris/alter/SchemaChangeJobV2Test.java | 5 -
 .../test/java/org/apache/doris/catalog/ColocateTableIndexTest.java | 5 -
 fe/src/test/java/org/apache/doris/http/MimeTypeTest.java   | 5 -
 fe/src/test/java/org/apache/doris/mysql/WrappedAuth.java   | 5 -
 .../java/org/apache/doris/mysql/privilege/SetPasswordTest.java | 5 -
 .../java/org/apache/doris/planner/HashDistributionPrunerTest.java  | 5 -
 fe/src/test/java/org/apache/doris/rewrite/FEFunctionsTest.java | 5 -
 47 files changed, 2 insertions(+), 196 deletions(-)

diff --git a/fe/src/main/java/org/apache/doris/PaloFe.java 
b/fe/src/main/java/org/apache/doris/PaloFe.java
index 028af19..21e08c9 100644
--- a/fe/src/main/java/org/apache/doris/PaloFe.java
+++ b/fe/src/main/java/org/apache/doris/PaloFe.java
@@ -48,9 +48,6 @@ import java.lang.management.ManagementFactory;
 import java.nio.channels.FileLock;
 import java.nio.channels.OverlappingFileLockException;
 
-/**
- * Created by zhaochun on 14-9-22.
- */
 public class PaloFe {
 private static final Logger LOG = LogManager.getLogger(PaloFe.class);
 
diff --git a/fe/src/main/java/org/apache/doris/alter/RollupJobV2.java 
b/fe/src/main/java/org/apache/doris/alter/RollupJobV2.java
index 8c6f8e6..a59cff5 100644
--- a/fe/src/main/java/org/apache/doris/alter/RollupJobV2.java
+++ 

[GitHub] [incubator-doris] morningman opened a new pull request #1831: Support setting timezone for stream load and routine load

2019-09-19 Thread GitBox
morningman opened a new pull request #1831: Support setting timezone for stream 
load and routine load
URL: https://github.com/apache/incubator-doris/pull/1831
 
 
   





[GitHub] [incubator-doris] morningman commented on a change in pull request #1828: Seek block when starts a ScanKey

2019-09-19 Thread GitBox
morningman commented on a change in pull request #1828: Seek block when starts 
a ScanKey
URL: https://github.com/apache/incubator-doris/pull/1828#discussion_r325990989
 
 

 ##
 File path: be/src/olap/rowset/segment_reader.cpp
 ##
 @@ -832,7 +841,7 @@ OLAPStatus SegmentReader::_create_reader(size_t* buffer_size) {
 
 OLAPStatus SegmentReader::_seek_to_block_directly(
 int64_t block_id, const std::vector& cids) {
-if (!config::block_seek_position && _at_block_start && block_id == _current_block_id) {
+if (!_seek_block && block_id == _current_block_id) {
 
 Review comment:
   Can we remove the config `config::block_seek_position`?





[GitHub] [incubator-doris] morningman commented on a change in pull request #1828: Seek block when starts a ScanKey

2019-09-19 Thread GitBox
morningman commented on a change in pull request #1828: Seek block when starts 
a ScanKey
URL: https://github.com/apache/incubator-doris/pull/1828#discussion_r325990885
 
 

 ##
 File path: be/src/olap/rowset/segment_reader.h
 ##
 @@ -308,7 +308,7 @@ class SegmentReader {
 
 // If this field is false, client must to call seek_to_block before
 // calling get_block.
-bool _at_block_start = false;
+bool _seek_block = true;
 
 Review comment:
   Update the comment for `_seek_block`.





[GitHub] [incubator-doris] HangyuanLiu edited a comment on issue #1825: HLL support default value

2019-09-19 Thread GitBox
HangyuanLiu edited a comment on issue #1825: HLL support default value
URL: https://github.com/apache/incubator-doris/pull/1825#issuecomment-533002312
 
 
   @kangkaisen 
   If we add an empty_hll function, we would have to write load commands like this:
   
   LOAD LABEL test.uv 
   DATA INFILE ("hdfs://ns1017/**streamA**/*") INTO TABLE `test_uv`
   (pin, id, u1)
   set (
 uv1=hll_hash(u1),
 uv2= empty_hll(),
   );
   
   LOAD LABEL test.uv 
   DATA INFILE ("hdfs://ns1017/**streamB**/*") INTO TABLE `test_uv`
   (pin, id, u2)
   set (
 uv1= empty_hll(),
 uv2= hll_hash(u2),
   );
   
   uv2 may belong to another business stream.
   It is unreasonable to make the import for business A care about the
   HLL columns of business B.





[GitHub] [incubator-doris] HangyuanLiu edited a comment on issue #1825: HLL support default value

2019-09-19 Thread GitBox
HangyuanLiu edited a comment on issue #1825: HLL support default value
URL: https://github.com/apache/incubator-doris/pull/1825#issuecomment-533002312
 
 
   @kangkaisen 
   If we add an empty_hll function, we would have to write load commands like this:
   
   LOAD LABEL test.uv 
   DATA INFILE ("hdfs://ns1017/**streamA**/*") INTO TABLE `test_uv`
   (pin, id, u1)
   set (
 uv1=hll_hash(u1),
 uv2=empty_hash(),
   );
   
   LOAD LABEL test.uv 
   DATA INFILE ("hdfs://ns1017/**streamB**/*") INTO TABLE `test_uv`
   (pin, id, u2)
   set (
 uv1= empty_hash(),
 uv2= hll_hash(u2),
   );
   
   uv2 may belong to another business stream.
   It is unreasonable to make the import for business A care about the
   HLL columns of business B.





[GitHub] [incubator-doris] HangyuanLiu commented on issue #1825: HLL support default value

2019-09-19 Thread GitBox
HangyuanLiu commented on issue #1825: HLL support default value
URL: https://github.com/apache/incubator-doris/pull/1825#issuecomment-533002312
 
 
   @kangkaisen 
   If we add an empty_hll function, we would have to write load commands like this:
   
   LOAD LABEL test.uv 
   DATA INFILE ("hdfs://ns1017/**streamA**/*") INTO TABLE `test_uv`
   (pin, id, u1)
   set (
 uv1=hll_hash(u1),
 uv2=empty_hash(u2),
   );
   
   LOAD LABEL test.uv 
   DATA INFILE ("hdfs://ns1017/**streamB**/*") INTO TABLE `test_uv`
   (pin, id, u1)
   set (
 uv1= empty_hash(u1),
 uv2= hll_hash(u2),
   );
   
   uv2 may belong to another business stream.
   It is unreasonable to make the import for business A care about the
   HLL columns of business B.





[GitHub] [incubator-doris] imay opened a new pull request #1830: Split channel close operation into two phase

2019-09-19 Thread GitBox
imay opened a new pull request #1830: Split channel close operation into two 
phase
URL: https://github.com/apache/incubator-doris/pull/1830
 
 
   In this change, channel close is split into two phases, so channels can be
   closed in parallel, which can make queries faster.
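
A minimal sketch of a two-phase close, assuming a hypothetical `ToyChannel` whose close request is simulated with a short sleep (this is not the Doris `data_stream_sender.cpp` code): the first pass issues every close request without waiting, and the second pass waits for the results, so the waits overlap instead of adding up.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>
#include <vector>

// Hypothetical stand-in for a sender channel; not the Doris Channel class.
class ToyChannel {
public:
    // Phase 1: issue the close request and return immediately.
    void close_async() {
        assert(!_pending.valid());  // close must be requested only once
        _pending = std::async(std::launch::async, [] {
            // Simulates the round trip of a close request.
            std::this_thread::sleep_for(std::chrono::milliseconds(20));
            return true;
        });
    }

    // Phase 2: block until the request issued in phase 1 has completed.
    bool close_wait() {
        assert(_pending.valid());
        return _pending.get();
    }

private:
    std::future<bool> _pending;
};

// Closes every channel in two passes so that the waits overlap.
inline bool close_all(std::vector<ToyChannel>& channels) {
    for (auto& ch : channels) ch.close_async();  // phase 1: fire all requests
    bool ok = true;
    for (auto& ch : channels) ok = ch.close_wait() && ok;  // phase 2: collect
    return ok;
}
```

Closing each channel fully before starting the next would take roughly the sum of the round trips; with the two passes the total is closer to the slowest single round trip.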





[GitHub] [incubator-doris] kangkaisen commented on issue #1825: HLL support default value

2019-09-19 Thread GitBox
kangkaisen commented on issue #1825: HLL support default value
URL: https://github.com/apache/incubator-doris/pull/1825#issuecomment-532996610
 
 
   @HangyuanLiu Hi, I see.
   Could we add an `empty_hll` function to support this capability? Then we
   could still ensure that the HLL input in `aggregate_func.h` is never null.
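
Why an empty HLL value is a safe stand-in for NULL can be sketched with a toy register array: merging two sketches takes the per-register maximum, so an all-zero sketch leaves the other side unchanged. `ToyHll`, `observe` and `merge` are hypothetical names for illustration, not the Doris HLL implementation, and the register count is deliberately tiny.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kRegisters = 16;  // tiny on purpose
using ToyHll = std::array<uint8_t, kRegisters>;

// An empty sketch has every register at zero.
inline ToyHll empty_hll() { return ToyHll{}; }

// Records an observation: keep the largest rank seen for this register.
inline void observe(ToyHll& h, std::size_t index, uint8_t rank) {
    assert(index < kRegisters);
    h[index] = std::max(h[index], rank);
}

// Union of two sketches: per-register maximum.
inline ToyHll merge(const ToyHll& a, const ToyHll& b) {
    ToyHll out{};
    for (std::size_t i = 0; i < kRegisters; ++i) out[i] = std::max(a[i], b[i]);
    return out;
}
```

Because the empty sketch is the identity for `merge`, a row carrying it contributes nothing to a union-style aggregate, and the aggregate code never has to handle a null input.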





[GitHub] [incubator-doris] imay commented on a change in pull request #1828: Seek block when starts a ScanKey

2019-09-19 Thread GitBox
imay commented on a change in pull request #1828: Seek block when starts a 
ScanKey
URL: https://github.com/apache/incubator-doris/pull/1828#discussion_r326007261
 
 

 ##
 File path: be/src/olap/rowset/segment_reader.h
 ##
 @@ -308,7 +308,7 @@ class SegmentReader {
 
 // If this field is false, client must to call seek_to_block before
 // calling get_block.
-bool _at_block_start = false;
+bool _seek_block = true;
 
 Review comment:
   The field doesn't match the comment. And I think the name _at_block_start
   is easier to understand.





[GitHub] [incubator-doris] HangyuanLiu commented on issue #1825: HLL support default value

2019-09-19 Thread GitBox
HangyuanLiu commented on issue #1825: HLL support default value
URL: https://github.com/apache/incubator-doris/pull/1825#issuecomment-532987181
 
 
   This PR will support this capability.
   There is a scenario where a Doris table has multiple HLL columns that
   come from different streams. Stream A may not contain the fields required
   for the HLL columns of stream B, and may not carry NULL fields for them
   either, so the number of fields in a stream may be smaller than the number
   of columns in the Doris table.
   @kangkaisen 





[GitHub] [incubator-doris] imay commented on a change in pull request #1829: Remove the "author" tag

2019-09-19 Thread GitBox
imay commented on a change in pull request #1829: Remove the "author" tag
URL: https://github.com/apache/incubator-doris/pull/1829#discussion_r326004953
 
 

 ##
 File path: be/src/gutil/dynamic_annotations.c
 ##
 @@ -28,7 +28,6 @@
  * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  *
  * ---
- * Author: Kostya Serebryany
 
 Review comment:
   For all files in be/src/gutil/, we should keep the author lines.





[GitHub] [incubator-doris] imay commented on a change in pull request #1829: Remove the "author" tag

2019-09-19 Thread GitBox
imay commented on a change in pull request #1829: Remove the "author" tag
URL: https://github.com/apache/incubator-doris/pull/1829#discussion_r326004709
 
 

 ##
 File path: be/src/gutil/utf/rune.c
 ##
 @@ -1,5 +1,4 @@
 /*
- * The authors of this software are Rob Pike and Ken Thompson.
 
 Review comment:
   We should keep this because the code was copied from another project.





[GitHub] [incubator-doris] imay commented on a change in pull request #1829: Remove the "author" tag

2019-09-19 Thread GitBox
imay commented on a change in pull request #1829: Remove the "author" tag
URL: https://github.com/apache/incubator-doris/pull/1829#discussion_r326004828
 
 

 ##
 File path: webroot/static/jquery.dataTables.js
 ##
 @@ -7,7 +7,6 @@
  * @description Paginate, search and order HTML tables
  * @version 1.10.12
  * @filejquery.dataTables.js
- * @author  SpryMedia Ltd (www.sprymedia.co.uk)
 
 Review comment:
   keep this





[GitHub] [incubator-doris] imay commented on a change in pull request #1829: Remove the "author" tag

2019-09-19 Thread GitBox
imay commented on a change in pull request #1829: Remove the "author" tag
URL: https://github.com/apache/incubator-doris/pull/1829#discussion_r326004684
 
 

 ##
 File path: be/src/gutil/utf/LICENSE
 ##
 @@ -1,6 +1,5 @@
 UTF-8 Library
 
-The authors of this software are Rob Pike and Ken Thompson.
 
 Review comment:
   keep this





[GitHub] [incubator-doris] kangkaisen commented on issue #1825: HLL support default value

2019-09-19 Thread GitBox
kangkaisen commented on issue #1825: HLL support default value
URL: https://github.com/apache/incubator-doris/pull/1825#issuecomment-532983775
 
 
   > curl --location-trusted -u root:123456 -H column_separator:, -H 
label:test_uv_15 -H "columns:pin_id,idx,u1,u2,id=12" -T uv_test 
http://11.40.166.162:8030/api/test/test_uv_10/_stream_load
   Result :hll_union_agg(uv1) and hll_union_agg(uv2) should be 0
   
   I think we don't support this usage.
   
   When loading data into an HLL column, we must use the hll_hash function,
   which handles null values.
   
   

