[
https://issues.apache.org/jira/browse/SPARK-45666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
dinesh sachdev updated SPARK-45666:
-----------------------------------
Description:
We had a requirement to write Custom
org.apache.kafka.clients.producer.Partitioner to use with Kafka Data Source
available with package "{{{}spark-sql-kafka-0-10_2.11{}}} "
Ideally, properties set as part of Producer are available to Partitioner
method -
[configure|https://kafka.apache.org/24/javadoc/org/apache/kafka/common/Configurable.html#configure-java.util.Map-]([Map|https://docs.oracle.com/javase/8/docs/api/java/util/Map.html?is-external=true]<[String|https://docs.oracle.com/javase/8/docs/api/java/lang/String.html?is-external=true],?>
configs)
But, we realized that Custom properties passed as options to Kafka format
DataFrameWriter are not available to Partitioner whether we append that
property with literal "kafka." or not.
Only, Configs listed on -
[https://kafka.apache.org/documentation/#producerconfigs] were passed to
Partitioner. But, in some cases it is required to pass custom properties for
initialization of Partitioner.
Thus, there should be provision to set custom properties as options with Kafka
Data Source not just producer configs. Otherwise, custom partitioner can't be
initialized and implemented as per need.
For example -
_df.write.format("kafka").option("Kafka.customproperty1",
"value1").option("kafka.partitioner.class", "com.mycustom.ipartitioner")_
..
...
.....
_package com.mycustom;_
_import org.apache.kafka.clients.producer.Partitioner;_
public class _ipartitioner implemets Partitioner{_
_@override_
_public void configure(Map<String,?> configs){_
_system.out.println(configs) //_ _Kafka.customproperty1 is missing here which
should be availble._
_}_
_}_
was:
We had a requirement to write Custom
org.apache.kafka.clients.producer.Partitioner to use with Kafka Data Source
available with package "{{{}spark-sql-kafka-0-10_2.11{}}} "
Ideally, properties set as part of Producer are available to Partitioner
method -
[configure|https://kafka.apache.org/24/javadoc/org/apache/kafka/common/Configurable.html#configure-java.util.Map-]([Map|https://docs.oracle.com/javase/8/docs/api/java/util/Map.html?is-external=true]<[String|https://docs.oracle.com/javase/8/docs/api/java/lang/String.html?is-external=true],?>
configs)
But, we realized that Custom properties passed as options to Kafka format
DataFrameWriter are not available to Partitioner whether we append that
property with literal "kafka." or not.
Only, Configs listed on -
[https://kafka.apache.org/documentation/#producerconfigs] were passed to
Partitioner. But, in some cases it is required to pass custom properties for
initialization of Partitioner.
Thus, there should be provision to set custom properties as options with Kafka
Data Source not just producer configs. Otherwise, custom partitioner can't be
initialized and implemented as per need.
For example -
_df.write.format("kafka").option("Kafka.customproperty1",
"value1").option("kafka.partitioner.class", "com.mycustom.ipartitioner")_
public class _ipartitioner implemets Partitioner{_
_public void configure(Map<String,?> configs){_
_system.out.println(configs) //_ _Kafka.customproperty1 is missing here which
should be availble._
_}_
_}_
> spark-sql-kafka-0-10_2.11 - Custom Configuration's for Partitioner not set.
> ---------------------------------------------------------------------------
>
> Key: SPARK-45666
> URL: https://issues.apache.org/jira/browse/SPARK-45666
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.4.0
> Reporter: dinesh sachdev
> Priority: Major
>
> We had a requirement to write Custom
> org.apache.kafka.clients.producer.Partitioner to use with Kafka Data Source
> available with package "{{{}spark-sql-kafka-0-10_2.11{}}} "
> Ideally, properties set as part of Producer are available to Partitioner
> method -
> [configure|https://kafka.apache.org/24/javadoc/org/apache/kafka/common/Configurable.html#configure-java.util.Map-]([Map|https://docs.oracle.com/javase/8/docs/api/java/util/Map.html?is-external=true]<[String|https://docs.oracle.com/javase/8/docs/api/java/lang/String.html?is-external=true],?>
> configs)
>
>
> But, we realized that Custom properties passed as options to Kafka format
> DataFrameWriter are not available to Partitioner whether we append that
> property with literal "kafka." or not.
> Only, Configs listed on -
> [https://kafka.apache.org/documentation/#producerconfigs] were passed to
> Partitioner. But, in some cases it is required to pass custom properties for
> initialization of Partitioner.
> Thus, there should be provision to set custom properties as options with
> Kafka Data Source not just producer configs. Otherwise, custom partitioner
> can't be initialized and implemented as per need.
>
> For example -
> _df.write.format("kafka").option("Kafka.customproperty1",
> "value1").option("kafka.partitioner.class", "com.mycustom.ipartitioner")_
> ..
> ...
> .....
> _package com.mycustom;_
> _import org.apache.kafka.clients.producer.Partitioner;_
> public class _ipartitioner implemets Partitioner{_
> _@override_
> _public void configure(Map<String,?> configs){_
> _system.out.println(configs) //_ _Kafka.customproperty1 is missing here which
> should be availble._
> _}_
> _}_
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]