[
https://issues.apache.org/jira/browse/NIFI-7791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17190887#comment-17190887
]
Ricky Saltzer edited comment on NIFI-7791 at 9/4/20, 6:58 PM:
--------------------------------------------------------------
To provide a little more context on the processing portion, here's a snippet of
code that streams a FlowFile without doing any processing within NiFi. Instead,
we tell ClickHouse, "Hey, here's a stream of data... process it in real time as
CSV."
{{ getConnection().createStatement()}}
{{ .write() // direct write call}}
{{ .sql("INSERT INTO my_table")}}
{{ .data(inputStream) // pass in FlowFile's InputStream}}
{{ .format("CSV") // format the InputStream is expected to be}}
{{ .send()}}
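For comparison, here is a rough stdlib-only sketch of the same idea over ClickHouse's HTTP interface: the INSERT statement goes in the {{query}} parameter and the FlowFile's bytes are streamed as the request body. The class and method names ({{ClickHouseHttpStream}}, {{insertUri}}, {{copy}}, {{send}}) are made up for illustration, and the base URL and credentials are placeholders. This is not the code from the PR.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of NiFi or the ClickHouse driver.
final class ClickHouseHttpStream {

    // Builds the HTTP-interface URL; the INSERT statement travels in the "query" parameter.
    static URI insertUri(String baseUrl, String table, String format) {
        String query = "INSERT INTO " + table + " FORMAT " + format;
        // URLEncoder emits '+' for spaces; switch to %20 so the query reads the same in any URL parser.
        String encoded = URLEncoder.encode(query, StandardCharsets.UTF_8).replace("+", "%20");
        return URI.create(baseUrl + "/?query=" + encoded);
    }

    // Copies the stream in fixed-size chunks so the whole FlowFile is never held in memory.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    // POSTs the data to ClickHouse and returns the HTTP status code.
    static int send(URI uri, String user, String password, InputStream data) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) uri.toURL().openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setChunkedStreamingMode(0); // 0 = default chunk size; avoids buffering the full body
        conn.setRequestProperty("X-ClickHouse-User", user);
        conn.setRequestProperty("X-ClickHouse-Key", password);
        try (OutputStream out = conn.getOutputStream()) {
            copy(data, out);
        }
        return conn.getResponseCode();
    }
}
```

Something like {{send(insertUri("http://localhost:8123", "my_table", "CSV"), user, password, inputStream)}} would then hand the stream to ClickHouse for parsing, which is the part the JDBC snippet above hides behind {{.send()}}.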
> Add PutClickHouse Processor for Writing Large Streams
> -----------------------------------------------------
>
> Key: NIFI-7791
> URL: https://issues.apache.org/jira/browse/NIFI-7791
> Project: Apache NiFi
> Issue Type: New Feature
> Reporter: Ricky Saltzer
> Assignee: Ricky Saltzer
> Priority: Minor
>
> ClickHouse supports streaming a number of file formats directly using their
> JDBC (superset) library. Oftentimes it's much more convenient to stream the
> contents of a file directly to ClickHouse, rather than bothering to process
> the data in NiFi and then using the native JDBC processor.
> One workaround is to just use PutHTTP to stream the file directly to
> ClickHouse using its HTTP endpoint. However, this can get a bit tedious,
> especially if you need to pass credentials as part of the HTTP method call.
> I'm creating this Jira to support creating a simple PutClickHouse processor
> that can stream a FlowFile directly to ClickHouse with the following features:
> * Support for the CSV, CSVWithNames, TSV and JSONEachRow formats
> * Ability to modify column name ordering
> * Custom delimiters for CSV and TSV
> * SSL support (with and without strict mode)
> * Multiple hosts (comma separated) to utilize the
> {{BalancedClickhouseDataSource}}
> * Username and Password
> I'm currently wrapping up a PR for this. I wrote it in Kotlin, which uses
> a processor-scoped Maven plugin. If there's enough objection, it can be
> rewritten in native Java.
> +[~joewitt] since I spoke with him regarding this a while back.
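To make a few of the features listed in the description concrete, here is a minimal sketch of how the column ordering, multiple hosts and SSL options might turn into the strings handed to the driver. It is only an illustration: {{PutClickHouseSketch}}, {{insertSql}} and {{jdbcUrl}} are hypothetical names, and the URL layout (comma-separated hosts, {{ssl}} and {{sslmode}} parameters) is my assumption about what {{BalancedClickhouseDataSource}} accepts, so check it against the driver version in use.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper; not the processor from the PR.
final class PutClickHouseSketch {

    // Builds the statement passed to .sql(...); an empty column list keeps the table's own column order.
    static String insertSql(String table, List<String> columns) {
        String cols = columns.isEmpty() ? "" : " (" + String.join(", ", columns) + ")";
        return "INSERT INTO " + table + cols;
    }

    // Turns a comma-separated host property such as "host1:8123, host2:8123" into a multi-host JDBC URL.
    // Assumption: ssl/sslmode are passed as URL parameters, with sslmode "strict" or "none".
    static String jdbcUrl(String hosts, String database, boolean ssl, boolean strict) {
        String hostList = Arrays.stream(hosts.split(","))
                .map(String::trim)
                .filter(h -> !h.isEmpty())
                .collect(Collectors.joining(","));
        String url = "jdbc:clickhouse://" + hostList + "/" + database;
        if (ssl) {
            url += "?ssl=true&sslmode=" + (strict ? "strict" : "none");
        }
        return url;
    }
}
```

With this, {{insertSql("my_table", List.of("b", "a"))}} gives {{INSERT INTO my_table (b, a)}}, so a file whose columns arrive in a different order than the table's can still be streamed as-is.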