This is an automated email from the ASF dual-hosted git repository.

yiguolei pushed a commit to branch branch-2.1
in repository https://gitbox.apache.org/repos/asf/doris.git

commit e10defeaba5c1e331085ace738e0ef303daaae12
Author: echo-dundun <[email protected]>
AuthorDate: Sun Feb 4 11:02:02 2024 +0800

    [enhancement](plugin) support json format and other options in logstash doris output plugin (#27318)
---
 docs/en/docs/ecosystem/logstash.md               |  6 ++++++
 docs/zh-CN/docs/ecosystem/logstash.md            |  6 ++++++
 extension/logstash/lib/logstash/outputs/doris.rb | 21 +++++++++++++++++++++
 extension/logstash/logstash-output-doris.gemspec |  2 +-
 4 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/docs/en/docs/ecosystem/logstash.md b/docs/en/docs/ecosystem/logstash.md
index 9db71bc2e90..c0f89fa8aac 100644
--- a/docs/en/docs/ecosystem/logstash.md
+++ b/docs/en/docs/ecosystem/logstash.md
@@ -98,6 +98,12 @@ Configuration | Explanation
 `strict_mode` | Strict mode, the default is false
 `timezone` | Specify the time zone used for this import, the default is the East Eight District
 `exec_mem_limit` | Import memory limit, default is 2GB, unit is byte
+`format` | Format of the imported data. Supported values: csv, json, csv_with_names, csv_with_names_and_types, parquet, orc. The default is csv.
+`jsonpaths` | JSON paths used to extract the corresponding values from each json object, for example `["$.id", "$.type", "$.actor.id", "$.actor.login"]`.
+`json_root` | Root node of the json document. The default is "".
+`fuzzy_parse` | Boolean. When true, json is parsed using the schema of the first row. Enabling this option can improve the efficiency of json import, but it requires the keys of every json object to be in the same order as in the first row. The default is false. Only used with the json format.
+`num_as_string` | Boolean. When true, numeric values are converted to strings while parsing json data and then imported, so no precision is lost.
+
 
 Other configuration:
 
diff --git a/docs/zh-CN/docs/ecosystem/logstash.md b/docs/zh-CN/docs/ecosystem/logstash.md
index a436dd8096d..c0687c95a25 100644
--- a/docs/zh-CN/docs/ecosystem/logstash.md
+++ b/docs/zh-CN/docs/ecosystem/logstash.md
@@ -98,6 +98,12 @@ copy logstash-output-doris-{version}.gem 到 logstash 安装目录下
 `strict_mode` | 严格模式,默认为false。
 `timezone` | 指定本次导入所使用的时区,默认为东八区。
 `exec_mem_limit` | 导入内存限制,默认为 2GB,单位为字节。
+`format` | 指定导入数据格式,支持csv、json、csv_with_names、csv_with_names_and_types、parquet、orc,默认是csv。
+`jsonpaths` | 匹配模式需要通过jsonpaths参数匹配对应的value。
+`json_root` | 用于指定json document的根节点,默认值为""
+`fuzzy_parse` | 布尔类型,为true表示json将以第一行为schema 进行解析,开启这个选项可以提高 json 导入效率,但是要求所有json 对象的key的顺序和第一行一致, 默认为false,仅用于json 格式。
+`num_as_string` | 为true表示在解析json数据时会将数字类型转为字符串,然后在确保不会出现精度丢失的情况下进行导入。
+
 
 其他配置
 
diff --git a/extension/logstash/lib/logstash/outputs/doris.rb b/extension/logstash/lib/logstash/outputs/doris.rb
index 15d1b4f006e..b7334aefb5f 100644
--- a/extension/logstash/lib/logstash/outputs/doris.rb
+++ b/extension/logstash/lib/logstash/outputs/doris.rb
@@ -68,6 +68,20 @@ class LogStash::Outputs::Doris < LogStash::Outputs::Base
    config :timezone, :validate => :string, :default => ""
    # memory limit of a stream load
    config :exec_mem_limit, :validate => :number, :default => -1
+   # Format of the imported data; the validate list below gives the supported values.
+   config :format, :validate => ['csv', 'json', 'csv_with_names', 'csv_with_names_and_types', 'parquet', 'orc'], :default => "csv"
+   # jsonpaths example: jsonpaths => ["$.id", "$.type", "$.actor.id", "$.actor.login"]
+   config :jsonpaths, :validate => :array, :default => []
+   # Specify the root node of the json document
+   config :json_root, :validate => :string, :default => ""
+   # Boolean. true means json is parsed using the schema of the first row; turning this on can improve the efficiency of json import.
+   config :fuzzy_parse, :validate => :boolean, :default => false
+   # When parsing json data, convert numeric types to strings.
+   config :num_as_string, :validate => :boolean, :default => false
+   # true means support for reading one json object per line
+   config :read_json_by_line, :validate => :boolean, :default => false
+   #  
+
 
    # Custom headers to use
    # format is `headers => ["X-My-Header", "%{host}"]`
@@ -287,6 +301,13 @@ class LogStash::Outputs::Doris < LogStash::Outputs::Base
       if @columns != ""
           headers["columns"] ||= @columns
       end
+      headers["format"] = @format if @format != ""
+      headers["jsonpaths"] = @jsonpaths if @jsonpaths != []
+      headers["json_root"] = @json_root if @json_root != ""
+      headers["fuzzy_parse"] = @fuzzy_parse if @fuzzy_parse != ""
+      headers["num_as_string"] = @num_as_string if @num_as_string != ""
+      headers["read_json_by_line"] = @read_json_by_line if @read_json_by_line != ""
+  
       headers
    end
 end # end of class LogStash::Outputs::Doris
diff --git a/extension/logstash/logstash-output-doris.gemspec b/extension/logstash/logstash-output-doris.gemspec
index 91d28f04fcb..ee456a0e42d 100644
--- a/extension/logstash/logstash-output-doris.gemspec
+++ b/extension/logstash/logstash-output-doris.gemspec
@@ -18,7 +18,7 @@ under the License.
 =end
 Gem::Specification.new do |s|
   s.name            = 'logstash-output-doris'
-  s.version         = '0.1.0'
+  s.version         = '0.2.0'
   s.author          = 'wfjcmcb'
   s.email           = '[email protected]'
   s.homepage        = 'http://doris.apache.org'
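
The header-building hunk in doris.rb above can be read alongside the following stand-alone Ruby sketch. It is not the plugin's code: `build_format_headers` and its plain options hash are hypothetical names used only for illustration. It shows one way the new options could be turned into request headers, under two assumptions: that `jsonpaths` should travel as a single JSON array string, and that a boolean option should be sent only when it is enabled. In the committed code the boolean options are guarded with `!= ""`, which is always true for a boolean value in Ruby, so those headers are set on every request.

```ruby
require 'json'

# Hypothetical helper, not part of the plugin: builds the format-related
# headers from a plain options hash that mirrors the new config names.
def build_format_headers(opts)
  headers = {}

  format = opts.fetch(:format, 'csv')
  headers['format'] = format unless format.empty?

  jsonpaths = opts.fetch(:jsonpaths, [])
  # Assumption: the header value is one JSON array string, not a Ruby Array.
  headers['jsonpaths'] = JSON.generate(jsonpaths) unless jsonpaths.empty?

  json_root = opts.fetch(:json_root, '')
  headers['json_root'] = json_root unless json_root.empty?

  # Assumption: a boolean option is sent only when enabled, as the string "true".
  %i[fuzzy_parse num_as_string read_json_by_line].each do |key|
    headers[key.to_s] = 'true' if opts.fetch(key, false)
  end

  headers
end

example = build_format_headers(
  format: 'json',
  jsonpaths: ['$.id', '$.actor.login'],
  read_json_by_line: true
)
puts example
```

With no options given, the sketch emits only the `format` header with the default value csv, which matches the default declared in the config block.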

