jihoonson commented on a change in pull request #9449:
URL: https://github.com/apache/druid/pull/9449#discussion_r421836358
##########
File path: docs/ingestion/native-batch.md
##########
@@ -1310,6 +1311,43 @@ A spec that applies a filter and reads a subset of the
original datasource's col
This spec above will only return the `page`, `user` dimensions and `added`
metric.
Only rows where `page` = `Druid` will be returned.
+### Sql Input Source
+
+The SQL input source is used to read data directly from RDBMS.
+The SQL input source is _splittable_ and can be used by the [Parallel
task](#parallel-task), where each worker task will read from one SQL query from
the list of queries.
+Since this input source has a fixed input format for reading events, no
`inputFormat` field needs to be specified in the ingestion spec when using this
input source.
+
+|property|description|required?|
+|--------|-----------|---------|
+|type|This should be "sql".|Yes|
+|database|Specifies the database connection details.|Yes|
Review comment:
Would you add more detailed docs for this parameter? It should probably
mention that you have to load some extension to read from a particular type of
database.
##########
File path: docs/ingestion/native-batch.md
##########
@@ -1310,6 +1311,43 @@ A spec that applies a filter and reads a subset of the
original datasource's col
This spec above will only return the `page`, `user` dimensions and `added`
metric.
Only rows where `page` = `Druid` will be returned.
+### Sql Input Source
Review comment:
One more thing: I remember that many people in our community have asked
how to use `SqlFirehose`. What do you think about adding a section that
explains how to use it in a production environment? To be honest, it's not
clear to me what the best practices are for building a scalable and efficient
pipeline with this input source. For example, how do you parallelize an
ingestion task (that is, how do you split the queries)? How do you handle
updates to the data in the database after ingestion? How often do you run
ingestion jobs? And so on.
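To make the query-splitting question concrete, here is a rough sketch of one
possible approach: generate one query per day so that each worker task reads a
disjoint slice. This is only an illustration, not a recommended practice. The
table name, time column, connection URI, and the `sqls` and `connectorConfig`
field names are assumptions that do not appear in the diff above.

```python
import json
from datetime import date, timedelta


def split_by_day(table, time_column, start, end):
    """Return one SELECT per day in [start, end); the slices do not overlap."""
    queries = []
    day = start
    while day < end:
        next_day = day + timedelta(days=1)
        queries.append(
            f"SELECT * FROM {table} "
            f"WHERE {time_column} >= '{day.isoformat()}' "
            f"AND {time_column} < '{next_day.isoformat()}'"
        )
        day = next_day
    return queries


def build_input_source(queries):
    """Assemble a hypothetical "sql" input source from a list of queries."""
    return {
        "type": "sql",
        # Assumed shape of the `database` property; placeholder values only.
        "database": {
            "type": "mysql",
            "connectorConfig": {
                "connectURI": "jdbc:mysql://host:port/schema",
                "user": "user",
                "password": "password",
            },
        },
        # Assumed field name for the list of queries, one per worker task.
        "sqls": queries,
    }


if __name__ == "__main__":
    # Hypothetical table `events` with a time column `created_at`.
    queries = split_by_day("events", "created_at", date(2020, 1, 1), date(2020, 1, 4))
    print(json.dumps(build_input_source(queries), indent=2))
```

Half-open day ranges keep rows from being read twice, but they do nothing
about rows that are updated in the database after a slice has been ingested,
so that question still needs an answer in the docs.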