JulianJaffePinterest edited a comment on pull request #10920:
URL: https://github.com/apache/druid/pull/10920#issuecomment-849417236


   I've had a number of conversations about these connectors and how to use
   them. The common pain points are partitioning and user-friendliness, so I've
   added three new partitioners, ergonomic ways to use them, and a semi-typed
   way to configure the readers and writers. The improved ergonomics come at
   the cost of introducing Scala implicits into the project, which I had tried
   to avoid to ease comprehension for other developers. However, I think the
   tradeoff is worth it. See the [extension
   documentation](https://github.com/JulianJaffePinterest/druid/blob/spark_druid_connector/docs/development/extensions-core/spark.md)
   for more details.
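
For readers unfamiliar with Scala implicits, here is a rough sketch of the general pattern that lets extra configuration methods appear on an existing reader type once an import is in scope. It is not the connector's actual code: `FakeReader` is a hypothetical stand-in for Spark's `DataFrameReader` so that the snippet runs without Spark on the classpath, and the option keys are invented for illustration.

```scala
// Hypothetical stand-in for Spark's DataFrameReader: it only collects options.
final case class FakeReader(options: Map[String, String] = Map.empty) {
  def option(key: String, value: String): FakeReader =
    copy(options = options + (key -> value))
}

object DruidReaderSyntax {
  // Importing this implicit class makes the methods below callable on any FakeReader.
  // The option keys are assumptions made for this sketch, not the connector's real keys.
  implicit class DruidReaderOps(reader: FakeReader) {
    def brokerHost(host: String): FakeReader = reader.option("broker.host", host)
    def brokerPort(port: Int): FakeReader = reader.option("broker.port", port.toString)
    def dataSource(name: String): FakeReader = reader.option("table", name)
  }
}

object Demo {
  import DruidReaderSyntax._

  // Each call is rewritten by the compiler to new DruidReaderOps(reader).method(...).
  val configured: FakeReader =
    FakeReader().brokerHost("localhost").brokerPort(8082).dataSource("dataSource")
}
```

Without the `import DruidReaderSyntax._` line the chained calls would not compile. That is the comprehension cost mentioned above: the methods are not defined on the reader type itself.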
   
   Example usages:
   
   Configuring the reader:
   ```scala
   import org.apache.druid.spark.DruidDataFrameReader
   
   sparkSession
     .read
     .brokerHost("localhost")
     .brokerPort(8082)
     .metadataDbType("mysql")
     .metadataUri("jdbc:mysql://druid.metadata.server:3306/druid")
     .metadataUser("druid")
     .metadataPassword("diurd")
     .dataSource("dataSource")
     .druid()
   ```
   
   Configuring the writer:
   ```scala
   import org.apache.druid.spark.DruidDataFrameWriter
   import org.apache.spark.sql.SaveMode
   
   val deepStorageConfig = new LocalDeepStorageConfig().storageDirectory("/mnt/druid/druid-segments/")
   
   df
     .write
     .metadataDbType("mysql")
     .metadataUri("jdbc:mysql://druid.metadata.server:3306/druid")
     .metadataUser("druid")
     .metadataPassword("diurd")
     .version(1)
     .deepStorage(deepStorageConfig)
     .mode(SaveMode.Overwrite)
     .dataSource("dataSource")
     .druid()
   ```
   
   Using the new partitioners and the ergonomic approach to passing a
   partition map to the writer:
   ```scala
   import org.apache.druid.spark.DruidDataFrame
   import org.apache.druid.spark.DruidDataFrameWriter
   import org.apache.spark.sql.SaveMode
   
   val deepStorageConfig = new LocalDeepStorageConfig().storageDirectory("/mnt/druid/druid-segments/")
   
   df
     .rangePartitionerAndWrite(tsCol, tsFormat, granularityString, rowsPerPartition, partitionCol)
     .metadataDbType("mysql")
     .metadataUri("jdbc:mysql://druid.metadata.server:3306/druid")
     .metadataUser("druid")
     .metadataPassword("diurd")
     .version(1)
     .deepStorage(deepStorageConfig)
     .mode(SaveMode.Overwrite)
     .dataSource("dataSource")
     .druid()
   ```

