[ 
https://issues.apache.org/jira/browse/NIFI-7791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17190887#comment-17190887
 ] 

Ricky Saltzer edited comment on NIFI-7791 at 9/4/20, 6:58 PM:
--------------------------------------------------------------

To provide a little more context on the processing portion, here's a snippet 
of code that streams a FlowFile without doing any processing within NiFi. 
Instead, we tell ClickHouse "Hey, here's a stream of data...process it in 
real-time as CSV":

{{getConnection().createStatement()}}
{{   .write() // direct write call}}
{{   .sql("INSERT INTO my_table")}}
{{   .data(inputStream) // pass in FlowFile's InputStream}}
{{   .format("CSV") // format the InputStream is expected to be}}
{{   .send()}}
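
To make the "no processing within NiFi" point concrete, here is a minimal, standard-library-only sketch of what passing a stream through looks like: bytes are forwarded in fixed-size chunks and never parsed as CSV. The class and method names (PassThrough, forward) are hypothetical and are not part of NiFi or the ClickHouse driver; the sink stands in for whatever connection carries the data to ClickHouse.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper for illustration only; not a NiFi or ClickHouse API.
class PassThrough {
    /**
     * Forwards every byte from source to sink without inspecting the content.
     * Memory use is bounded by the buffer size, regardless of stream length.
     * Returns the number of bytes forwarded.
     */
    static long forward(InputStream source, OutputStream sink) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = source.read(buffer)) != -1) {
            sink.write(buffer, 0, read); // bytes go out exactly as they came in
            total += read;
        }
        sink.flush();
        return total;
    }
}
```

The point of the design is that parsing and validation happen on the ClickHouse side, so the processor only needs a fixed-size buffer however large the FlowFile is.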


> Add PutClickHouse Processor for Writing Large Streams
> -----------------------------------------------------
>
>                 Key: NIFI-7791
>                 URL: https://issues.apache.org/jira/browse/NIFI-7791
>             Project: Apache NiFi
>          Issue Type: New Feature
>            Reporter: Ricky Saltzer
>            Assignee: Ricky Saltzer
>            Priority: Minor
>
> ClickHouse supports streaming a number of file formats directly using their 
> JDBC (superset) library. Often it's much more convenient to stream the 
> contents of a file directly to ClickHouse, rather than bothering to process 
> the data in NiFi and then using the native JDBC processor.
>
> One workaround is to just use PutHTTP to stream the file directly to 
> ClickHouse using its HTTP endpoint. However, this can get a bit tedious, 
> especially if you need to pass credentials as part of the HTTP method call.
>
> I'm creating this Jira to support creating a simple PutClickHouse processor 
> that can stream a FlowFile directly to ClickHouse with the following features:
>  * Support for the CSV, CSVWithNames, TSV and JSONEachRow formats
>  * Ability to modify column name ordering
>  * Custom delimiters for CSV and TSV
>  * SSL support (with and without strict mode)
>  * Multiple hosts (comma separated) to utilize the {{BalancedClickhouseDataSource}}
>  * Username and Password
>
> I'm currently wrapping up a PR for this. I wrote it using Kotlin, which uses 
> a processor-scope Maven plugin. If there's enough objection, it can be 
> rewritten in plain Java.
> +[~joewitt] since I spoke with him regarding this a while back.
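
For reference, the HTTP workaround mentioned in the description can be sketched roughly as follows, using only the JDK's java.net.http classes (Java 11+). This is an illustration under assumptions, not the proposed processor: the class name ClickHouseHttpInsert and its build method are hypothetical, and the base URL, table, user and password are placeholders. It relies on the ClickHouse HTTP interface accepting the INSERT statement in the query parameter, the data in the POST body, and credentials in the X-ClickHouse-User and X-ClickHouse-Key headers.

```java
import java.io.InputStream;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
class ClickHouseHttpInsert {
    /**
     * Builds (but does not send) a POST request that streams body to ClickHouse.
     * table and format are concatenated into the SQL as-is, so they must come
     * from trusted configuration, not from user input.
     */
    static HttpRequest build(String baseUrl, String table, String format,
                             String user, String password, InputStream body) {
        String sql = "INSERT INTO " + table + " FORMAT " + format;
        // URLEncoder emits '+' for spaces; switch to %20, which is unambiguous in a URI.
        String encoded = URLEncoder.encode(sql, StandardCharsets.UTF_8).replace("+", "%20");
        return HttpRequest.newBuilder(URI.create(baseUrl + "/?query=" + encoded))
                .header("X-ClickHouse-User", user)
                .header("X-ClickHouse-Key", password)
                // The body is read lazily from the stream when the request is sent.
                .POST(HttpRequest.BodyPublishers.ofInputStream(() -> body))
                .build();
    }
}
```

Sending the request would be a call such as HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()). Having to assemble the query string and credential headers by hand for every flow is the tedium the description refers to.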



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
