[ https://issues.apache.org/jira/browse/SPARK-20597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16078395#comment-16078395 ]
Satyajit varma edited comment on SPARK-20597 at 7/7/17 5:10 PM:
----------------------------------------------------------------

Hi [~jlaskowski],

I am almost done with the required change for SPARK-20597 and would like to confirm a few things before I submit the PR.

1. In the ticket you say, "What seems a quite interesting option is to support start(path: String) as the least precedence option in which path would designate the default topic when no other options are used." Were you referring only to option("topic","topic_name"), or also to other options such as option("checkpointLocation", ...)? I ask because the following line fails when no checkpoint location has been provided:

    df.writeStream.format("kafka").start("topic")

The error is:

    org.apache.spark.sql.AnalysisException: checkpointLocation must be specified either through option("checkpointLocation", ...) or SparkSession.conf.set("spark.sql.streaming.checkpointLocation", ...);

2. Below is the code I am using to get this working (KafkaSourceProvider.scala, line 145):

    // Picks the default topic name from the "path" entry of the "parameters" Map
    // when no "topic" key is present in the Map.
    val defaultTopic = parameters.get(TOPIC_OPTION_KEY) match {
      case None => parameters.get(PATH_OPTION_KEY) match {
        case path: Option[String] => parameters.get(PATH_OPTION_KEY).map(_.trim)
        case _ => None
      }
      case topic: Option[String] => parameters.get(TOPIC_OPTION_KEY).map(_.trim)
    }

Let me know whether this looks okay, or whether I am missing any edge cases or anything else I should take care of. I am new to the project and trying to be careful, so I would appreciate the experts' feedback on this approach, or any other feedback.
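For comparison, the fallback described in point 2 can probably be written more compactly. In the nested match, the inner `case path: Option[String]` pattern matches every value of the scrutinee (including `None`), so the `case _` branch is never reached, and the whole expression appears to reduce to "use topic, otherwise path, then trim". The following is a minimal standalone sketch of that reading, not the code in KafkaSourceProvider.scala. The object name, the method name, and the literal key values "topic" and "path" are assumptions made for illustration.

```scala
object DefaultTopicSketch {
  // Assumed key values for illustration only; the real constants would
  // live in KafkaSourceProvider.scala.
  val TopicOptionKey = "topic"
  val PathOptionKey = "path"

  // Prefer the explicit "topic" option and fall back to "path"
  // (the argument of start(path)) only when "topic" is absent.
  def defaultTopic(parameters: Map[String, String]): Option[String] =
    parameters.get(TopicOptionKey)
      .orElse(parameters.get(PathOptionKey))
      .map(_.trim)

  def main(args: Array[String]): Unit = {
    // "topic" wins over "path" when both are present.
    println(defaultTopic(Map("topic" -> " events ", "path" -> "fallback"))) // Some(events)
    // "path" is used only when "topic" is missing.
    println(defaultTopic(Map("path" -> " fallback "))) // Some(fallback)
    // Neither key present: no default topic.
    println(defaultTopic(Map.empty)) // None
  }
}
```

One edge case this sketch shares with the nested match: a whitespace-only value trims to an empty string and is still returned as `Some("")`, so a `.filter(_.nonEmpty)` step may be worth considering if empty topic names should be rejected.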
If this looks good, I can apply the same change in the createRelation method (KafkaSourceProvider.scala, line 163), test it against the topic column option (our other scenario), and submit the PR immediately.

Regards,
Satyajit.

> KafkaSourceProvider falls back on path as synonym for topic
> -----------------------------------------------------------
>
>                 Key: SPARK-20597
>                 URL: https://issues.apache.org/jira/browse/SPARK-20597
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 2.2.0
>            Reporter: Jacek Laskowski
>            Priority: Trivial
>              Labels: starter
>
> # {{KafkaSourceProvider}} supports {{topic}} option that sets the Kafka topic to save a DataFrame's rows to
> # {{KafkaSourceProvider}} can use {{topic}} column to assign rows to Kafka topics for writing
> What seems a quite interesting option is to support {{start(path: String)}} as the least precedence option in which {{path}} would designate the default topic when no other options are used.
> {code}
> df.writeStream.format("kafka").start("topic")
> {code}
> See http://apache-spark-developers-list.1001551.n3.nabble.com/KafkaSourceProvider-Why-topic-option-and-column-without-reverting-to-path-as-the-least-priority-td21458.html for discussion

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)