JulianJaffePinterest edited a comment on pull request #10920: URL: https://github.com/apache/druid/pull/10920#issuecomment-849417236
I've had a number of conversations about these connectors and how to use them. The common pain points are partitioning and user-friendliness, so I've added three new partitioners, ergonomic ways to use them, and a semi-typed way to configure the readers and writers. The improved ergonomics come at the cost of introducing Scala implicits into the project, which I had tried to avoid in order to ease comprehension for other developers. However, I think the tradeoff here is worth it. See the [extension documentation](https://github.com/JulianJaffePinterest/druid/blob/spark_druid_connector/docs/development/extensions-core/spark.md) for more details.

Example usages:

Configuring the reader:

```scala
import org.apache.druid.spark.DruidDataFrameReader

// The import brings the Druid-specific configuration methods into scope on the reader.
sparkSession
  .read
  .brokerHost("localhost")
  .brokerPort(8082)
  .metadataDbType("mysql")
  .metadataUri("jdbc:mysql://druid.metadata.server:3306/druid")
  .metadataUser("druid")
  .metadataPassword("diurd")
  .dataSource("dataSource")
  .druid()
```

Configuring the writer:

```scala
import org.apache.druid.spark.DruidDataFrameWriter
import org.apache.spark.sql.SaveMode

val deepStorageConfig = new LocalDeepStorageConfig().storageDirectory("/mnt/druid/druid-segments/")

df
  .write
  .metadataDbType("mysql")
  .metadataUri("jdbc:mysql://druid.metadata.server:3306/druid")
  .metadataUser("druid")
  .metadataPassword("diurd")
  .version(1)
  .deepStorage(deepStorageConfig)
  .mode(SaveMode.Overwrite)
  .dataSource("dataSource")
  .druid()
```

Using the new partitioners and the ergonomic approach to passing a partition map to the writer:

```scala
import org.apache.druid.spark.DruidDataFrame
import org.apache.druid.spark.DruidDataFrameWriter
import org.apache.spark.sql.SaveMode

val deepStorageConfig = new LocalDeepStorageConfig().storageDirectory("/mnt/druid/druid-segments/")

// tsCol, tsFormat, granularityString, rowsPerPartition and partitionCol are supplied by the caller.
df
  .rangePartitionerAndWrite(tsCol, tsFormat, granularityString, rowsPerPartition, partitionCol)
  .metadataDbType("mysql")
  .metadataUri("jdbc:mysql://druid.metadata.server:3306/druid")
  .metadataUser("druid")
  .metadataPassword("diurd")
  .version(1)
  .deepStorage(deepStorageConfig)
  .mode(SaveMode.Overwrite)
  .dataSource("dataSource")
  .druid()
```
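For readers unfamiliar with Scala implicits, here is a minimal sketch of the general pattern that lets an import add fluent, typed configuration methods to an existing builder. It is not the connector's implementation: `StubReader`, `StubDruidImplicits`, and the option keys are hypothetical stand-ins, used so that the example runs without Spark on the classpath.

```scala
// Hypothetical stand-in for a reader builder: it only collects string options.
final case class StubReader(options: Map[String, String] = Map.empty) {
  def option(key: String, value: String): StubReader =
    copy(options = options + (key -> value))
}

object StubDruidImplicits {
  // Illustrative option keys (assumptions; the connector's real keys may differ).
  val BrokerHostKey = "broker.host"
  val BrokerPortKey = "broker.port"

  // Importing this implicit class makes the typed methods below
  // callable on any StubReader, as if they were defined on it.
  implicit class StubDruidReader(reader: StubReader) {
    def brokerHost(host: String): StubReader =
      reader.option(BrokerHostKey, host)

    // The Int parameter gives a compile-time check that a bare
    // option(key, value) call with string arguments cannot.
    def brokerPort(port: Int): StubReader =
      reader.option(BrokerPortKey, port.toString)
  }
}

object ImplicitSketch {
  import StubDruidImplicits._

  // Reads like the fluent reader example, but against the stub.
  def configured: StubReader =
    StubReader().brokerHost("localhost").brokerPort(8082)

  def main(args: Array[String]): Unit =
    println(configured.options)
}
```

The cost is the one noted in the comment: the extra methods only exist where the import is in scope, which can make call sites harder to trace for developers new to the pattern.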
