RE: Spark Kafka Integration

2022-02-25 Thread Michael Williams (SSI)
Ahh, ok.  So, Kafka 3.1 is supported for Spark 3.2.1.  Thank you very much.

From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com]
Sent: Friday, February 25, 2022 2:50 PM
To: Michael Williams (SSI) 
Cc: user@spark.apache.org
Subject: Re: Spark Kafka Integration

These are the old and new ones.

For spark 3.1.1 I needed these jar files to make it work

kafka-clients-2.7.0.jar  -->  kafka-clients-3.1.0.jar
<https://repo1.maven.org/maven2/org/apache/kafka/kafka-clients/3.1.0/kafka-clients-3.1.0.jar>
commons-pool2-2.9.0.jar  -->  commons-pool2-2.11.1.jar
<https://repo1.maven.org/maven2/org/apache/commons/commons-pool2/2.11.1/commons-pool2-2.11.1.jar>
spark-streaming_2.12-3.1.1.jar  -->  spark-streaming_2.12-3.2.1.jar
<https://repo1.maven.org/maven2/org/apache/spark/spark-streaming_2.12/3.2.1/spark-streaming_2.12-3.2.1.jar>
spark-sql-kafka-0-10_2.12-3.1.0.jar  -->  spark-sql-kafka-0-10_2.12-3.2.1.jar
<https://repo1.maven.org/maven2/org/apache/spark/spark-sql-kafka-0-10_2.12/3.2.1/spark-sql-kafka-0-10_2.12-3.2.1.jar>
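As a minimal sketch of acquiring that 3.2.1 set and handing it to
spark-submit (the target directory /opt/spark-extra-jars and the app name
my_app.py are illustrative assumptions, not something from this thread):

  # fetch the four jars listed above from Maven Central
  mkdir -p /opt/spark-extra-jars && cd /opt/spark-extra-jars
  base=https://repo1.maven.org/maven2
  wget $base/org/apache/kafka/kafka-clients/3.1.0/kafka-clients-3.1.0.jar
  wget $base/org/apache/commons/commons-pool2/2.11.1/commons-pool2-2.11.1.jar
  wget $base/org/apache/spark/spark-streaming_2.12/3.2.1/spark-streaming_2.12-3.2.1.jar
  wget $base/org/apache/spark/spark-sql-kafka-0-10_2.12/3.2.1/spark-sql-kafka-0-10_2.12-3.2.1.jar

  # pass them to spark-submit as a comma-separated list
  spark-submit --jars "$(echo /opt/spark-extra-jars/*.jar | tr ' ' ',')" my_app.py

Note that spark-sql-kafka-0-10 has further transitive dependencies of its
own, so treat the list above as a starting point that worked for the poster
rather than a guaranteed-complete set.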



HTH


 




On Fri, 25 Feb 2022 at 20:40, Michael Williams (SSI)
<michael.willi...@ssigroup.com> wrote:
I believe it is 3.1, but if there is a different version that “works better” 
with spark, any advice would be appreciated.  Our entire team is totally new to 
spark and kafka (this is a poc trial).

From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com]
Sent: Friday, February 25, 2022 2:30 PM
To: Michael Williams (SSI) <michael.willi...@ssigroup.com>
Cc: user@spark.apache.org
Subject: Re: Spark Kafka Integration

and what version of kafka do you have 2.7?

for spark 3.1.1 I needed these jar files to make it work

kafka-clients-2.7.0.jar
commons-pool2-2.9.0.jar
spark-streaming_2.12-3.1.1.jar
spark-sql-kafka-0-10_2.12-3.1.0.jar



HTH


 




On Fri, 25 Feb 2022 at 20:15, Mich Talebzadeh
<mich.talebza...@gmail.com> wrote:
What is the use case? Is this for spark structured streaming?

HTH




 

RE: Spark Kafka Integration

2022-02-25 Thread Michael Williams (SSI)
Thank you, that is good to know.

From: Sean Owen [mailto:sro...@gmail.com]
Sent: Friday, February 25, 2022 2:46 PM
To: Michael Williams (SSI) 
Cc: Mich Talebzadeh ; user@spark.apache.org
Subject: Re: Spark Kafka Integration

Spark 3.2.1 is compiled against Kafka 2.8.0; the forthcoming Spark 3.3, against
Kafka 3.1.0.
The two may well be mutually compatible, though.

On Fri, Feb 25, 2022 at 2:40 PM Michael Williams (SSI)
<michael.willi...@ssigroup.com> wrote:
I believe it is 3.1, but if there is a different version that “works better” 
with spark, any advice would be appreciated.  Our entire team is totally new to 
spark and kafka (this is a poc trial).

From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com]
Sent: Friday, February 25, 2022 2:30 PM
To: Michael Williams (SSI) <michael.willi...@ssigroup.com>
Cc: user@spark.apache.org
Subject: Re: Spark Kafka Integration

and what version of kafka do you have 2.7?

for spark 3.1.1 I needed these jar files to make it work

kafka-clients-2.7.0.jar
commons-pool2-2.9.0.jar
spark-streaming_2.12-3.1.1.jar
spark-sql-kafka-0-10_2.12-3.1.0.jar



HTH


 




On Fri, 25 Feb 2022 at 20:15, Mich Talebzadeh
<mich.talebza...@gmail.com> wrote:
What is the use case? Is this for spark structured streaming?

HTH




 




On Fri, 25 Feb 2022 at 19:38, Michael Williams (SSI)
<michael.willi...@ssigroup.com> wrote:
After reviewing Spark's Kafka Integration guide, it indicates that 
spark-sql-kafka-0-10_2.12-3.2.1.jar and its dependencies are needed for Spark
3.2.1 (+ Scala 2.12) to work with Kafka.  Can anybody clarify the cleanest, 
most repeatable (reliable) way to acquire these jars for including in a Spark 
Docker image without introducing version compatibility issues?

Thank you,
Mike


This electronic message may contain information that is Proprietary, 
Confidential, or legally privileged or protected. It is intended only for the 
use of the individual(s) and entity named in the message. If you are not an 
intended recipient of this message, please notify the sender immediately and 
delete the material from your computer. Do not deliver, distribute or copy this 
message and do not disclose its contents or take any action in reliance on the 
information it contains. Thank You.



Re: Spark Kafka Integration

2022-02-25 Thread Mich Talebzadeh
These are the old and new ones.

For spark 3.1.1 I needed these jar files to make it work

kafka-clients-2.7.0.jar  --> kafka-clients-3.1.0.jar
<https://repo1.maven.org/maven2/org/apache/kafka/kafka-clients/3.1.0/kafka-clients-3.1.0.jar>
commons-pool2-2.9.0.jar  --> commons-pool2-2.11.1.jar
<https://repo1.maven.org/maven2/org/apache/commons/commons-pool2/2.11.1/commons-pool2-2.11.1.jar>
spark-streaming_2.12-3.1.1.jar  --> spark-streaming_2.12-3.2.1.jar
<https://repo1.maven.org/maven2/org/apache/spark/spark-streaming_2.12/3.2.1/spark-streaming_2.12-3.2.1.jar>
spark-sql-kafka-0-10_2.12-3.1.0.jar -> spark-sql-kafka-0-10_2.12-3.2.1.jar
<https://repo1.maven.org/maven2/org/apache/spark/spark-sql-kafka-0-10_2.12/3.2.1/spark-sql-kafka-0-10_2.12-3.2.1.jar>


HTH


   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Fri, 25 Feb 2022 at 20:40, Michael Williams (SSI) <
michael.willi...@ssigroup.com> wrote:

> I believe it is 3.1, but if there is a different version that “works
> better” with spark, any advice would be appreciated.  Our entire team is
> totally new to spark and kafka (this is a poc trial).
>
>
>
> *From:* Mich Talebzadeh [mailto:mich.talebza...@gmail.com]
> *Sent:* Friday, February 25, 2022 2:30 PM
> *To:* Michael Williams (SSI) 
> *Cc:* user@spark.apache.org
> *Subject:* Re: Spark Kafka Integration
>
>
>
> and what version of kafka do you have 2.7?
>
>
>
> for spark 3.1.1 I needed these jar files to make it work
>
>
>
> kafka-clients-2.7.0.jar
> commons-pool2-2.9.0.jar
> spark-streaming_2.12-3.1.1.jar
> spark-sql-kafka-0-10_2.12-3.1.0.jar
>
>
>
> HTH
>
>
>
>
>
>
>
>
>
>
> On Fri, 25 Feb 2022 at 20:15, Mich Talebzadeh 
> wrote:
>
> What is the use case? Is this for spark structured streaming?
>
>
>
> HTH
>
>
>
>
>
>
>
>
>
>
>
>
> On Fri, 25 Feb 2022 at 19:38, Michael Williams (SSI) <
> michael.willi...@ssigroup.com> wrote:
>
> After reviewing Spark's Kafka Integration guide, it indicates that
> spark-sql-kafka-0-10_2.12-3.2.1.jar and its dependencies are needed for
> Spark 3.2.1 (+ Scala 2.12) to work with Kafka.  Can anybody clarify the
> cleanest, most repeatable (reliable) way to acquire these jars for
> including in a Spark Docker image without introducing version compatibility
> issues?
>
>
>
> Thank you,
>
> Mike
>
>
>

Re: Spark Kafka Integration

2022-02-25 Thread Sean Owen
Spark 3.2.1 is compiled against Kafka 2.8.0; the forthcoming Spark 3.3, against
Kafka 3.1.0.
The two may well be mutually compatible, though.

On Fri, Feb 25, 2022 at 2:40 PM Michael Williams (SSI) <
michael.willi...@ssigroup.com> wrote:

> I believe it is 3.1, but if there is a different version that “works
> better” with spark, any advice would be appreciated.  Our entire team is
> totally new to spark and kafka (this is a poc trial).
>
>
>
> *From:* Mich Talebzadeh [mailto:mich.talebza...@gmail.com]
> *Sent:* Friday, February 25, 2022 2:30 PM
> *To:* Michael Williams (SSI) 
> *Cc:* user@spark.apache.org
> *Subject:* Re: Spark Kafka Integration
>
>
>
> and what version of kafka do you have 2.7?
>
>
>
> for spark 3.1.1 I needed these jar files to make it work
>
>
>
> kafka-clients-2.7.0.jar
> commons-pool2-2.9.0.jar
> spark-streaming_2.12-3.1.1.jar
> spark-sql-kafka-0-10_2.12-3.1.0.jar
>
>
>
> HTH
>
>
>
>
>
>
>
>
>
>
> On Fri, 25 Feb 2022 at 20:15, Mich Talebzadeh 
> wrote:
>
> What is the use case? Is this for spark structured streaming?
>
>
>
> HTH
>
>
>
>
>
>
>
>
>
>
>
>
> On Fri, 25 Feb 2022 at 19:38, Michael Williams (SSI) <
> michael.willi...@ssigroup.com> wrote:
>
> After reviewing Spark's Kafka Integration guide, it indicates that
> spark-sql-kafka-0-10_2.12-3.2.1.jar and its dependencies are needed for
> Spark 3.2.1 (+ Scala 2.12) to work with Kafka.  Can anybody clarify the
> cleanest, most repeatable (reliable) way to acquire these jars for
> including in a Spark Docker image without introducing version compatibility
> issues?
>
>
>
> Thank you,
>
> Mike
>
>
>
> This electronic message may contain information that is Proprietary,
> Confidential, or legally privileged or protected. It is intended only for
> the use of the individual(s) and entity named in the message. If you are
> not an intended recipient of this message, please notify the sender
> immediately and delete the material from your computer. Do not deliver,
> distribute or copy this message and do not disclose its contents or take
> any action in reliance on the information it contains. Thank You.
>
>
>


RE: Spark Kafka Integration

2022-02-25 Thread Michael Williams (SSI)
Thank you

From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com]
Sent: Friday, February 25, 2022 2:35 PM
To: Michael Williams (SSI) 
Cc: Sean Owen ; user@spark.apache.org
Subject: Re: Spark Kafka Integration

Please see my earlier reply for 3.1.1, tested and working in a Google Dataproc
environment.

Also this article of mine may be useful

Processing Change Data Capture with Spark Structured Streaming
<https://www.linkedin.com/pulse/processing-change-data-capture-spark-structured-talebzadeh-ph-d-/>

HTH




 




On Fri, 25 Feb 2022 at 20:30, Michael Williams (SSI)
<michael.willi...@ssigroup.com> wrote:
The use case is for spark structured streaming (a spark app will be launched by 
a worker service that monitors the kafka topic for new messages; once the 
messages are consumed, the spark app will terminate), but if there is a hitch 
here, it is that the Spark environment includes the MS dotnet for Spark 
wrapper, which means each spark app will consume from one kafka topic and 
will be written in C#.  If possible, I’d really like to be able to manually 
download the necessary jars and do the kafka client installation as part of the 
docker image build, so that the dependencies already exist on disk.  If that 
makes any sense.
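To make the docker-image route just described concrete, a minimal
Dockerfile sketch (the base image tag, the /opt/spark/jars path, and the
assumption that curl is present in the image are all illustrative, not
something tested in this thread):

  FROM apache/spark:v3.2.1

  # bake the Kafka integration jars into the image at build time so the
  # dependencies already exist on disk in every container
  RUN cd /opt/spark/jars && \
      base=https://repo1.maven.org/maven2 && \
      curl -fsSLO $base/org/apache/spark/spark-sql-kafka-0-10_2.12/3.2.1/spark-sql-kafka-0-10_2.12-3.2.1.jar && \
      curl -fsSLO $base/org/apache/spark/spark-token-provider-kafka-0-10_2.12/3.2.1/spark-token-provider-kafka-0-10_2.12-3.2.1.jar && \
      curl -fsSLO $base/org/apache/kafka/kafka-clients/3.1.0/kafka-clients-3.1.0.jar && \
      curl -fsSLO $base/org/apache/commons/commons-pool2/2.11.1/commons-pool2-2.11.1.jar

(spark-token-provider-kafka-0-10 is a transitive dependency of
spark-sql-kafka-0-10 that is easy to miss when copying jars by hand.)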

Thank you

From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com]
Sent: Friday, February 25, 2022 2:16 PM
To: Michael Williams (SSI) <michael.willi...@ssigroup.com>
Cc: user@spark.apache.org
Subject: Re: Spark Kafka Integration

What is the use case? Is this for spark structured streaming?

HTH




 




On Fri, 25 Feb 2022 at 19:38, Michael Williams (SSI)
<michael.willi...@ssigroup.com> wrote:
After reviewing Spark's Kafka Integration guide, it indicates that 
spark-sql-kafka-0-10_2.12-3.2.1.jar and its dependencies are needed for Spark 
3.2.1 (+ Scala 2.12) to work with Kafka.  Can anybody clarify the cleanest, 
most repeatable (reliable) way to acquire these jars for including in a Spark 
Docker image without introducing version compatibility issues?

Thank you,
Mike


This electronic message may contain information that is Proprietary, 
Confidential, or legally privileged or protected. It is intended only for the 
use of the individual(s) and entity named in the message. If you are not an 
intended recipient of this message, please notify the sender immediately and 
delete the material from your computer. Do not deliver, distribute or copy this 
message and do not disclose its contents or take any action in reliance on the 
information it contains. Thank You.



RE: Spark Kafka Integration

2022-02-25 Thread Michael Williams (SSI)
I believe it is 3.1, but if there is a different version that “works better” 
with spark, any advice would be appreciated.  Our entire team is totally new to 
spark and kafka (this is a poc trial).

From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com]
Sent: Friday, February 25, 2022 2:30 PM
To: Michael Williams (SSI) 
Cc: user@spark.apache.org
Subject: Re: Spark Kafka Integration

and what version of kafka do you have 2.7?

for spark 3.1.1 I needed these jar files to make it work

kafka-clients-2.7.0.jar
commons-pool2-2.9.0.jar
spark-streaming_2.12-3.1.1.jar
spark-sql-kafka-0-10_2.12-3.1.0.jar



HTH


 




On Fri, 25 Feb 2022 at 20:15, Mich Talebzadeh
<mich.talebza...@gmail.com> wrote:
What is the use case? Is this for spark structured streaming?

HTH




 




On Fri, 25 Feb 2022 at 19:38, Michael Williams (SSI)
<michael.willi...@ssigroup.com> wrote:
After reviewing Spark's Kafka Integration guide, it indicates that 
spark-sql-kafka-0-10_2.12-3.2.1.jar and its dependencies are needed for Spark 
3.2.1 (+ Scala 2.12) to work with Kafka.  Can anybody clarify the cleanest, 
most repeatable (reliable) way to acquire these jars for including in a Spark 
Docker image without introducing version compatibility issues?

Thank you,
Mike


This electronic message may contain information that is Proprietary, 
Confidential, or legally privileged or protected. It is intended only for the 
use of the individual(s) and entity named in the message. If you are not an 
intended recipient of this message, please notify the sender immediately and 
delete the material from your computer. Do not deliver, distribute or copy this 
message and do not disclose its contents or take any action in reliance on the 
information it contains. Thank You.





Re: Spark Kafka Integration

2022-02-25 Thread Mich Talebzadeh
Please see my earlier reply for 3.1.1, tested and working in a Google Dataproc
environment.

Also this article of mine may be useful

Processing Change Data Capture with Spark Structured Streaming
<https://www.linkedin.com/pulse/processing-change-data-capture-spark-structured-talebzadeh-ph-d-/>

HTH



   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Fri, 25 Feb 2022 at 20:30, Michael Williams (SSI) <
michael.willi...@ssigroup.com> wrote:

> The use case is for spark structured streaming (a spark app will be
> launched by a worker service that monitors the kafka topic for new
> messages; once the messages are consumed, the spark app will terminate),
> but if there is a hitch here, it is that the Spark environment includes the
> MS dotnet for Spark wrapper, which means each spark app will consume
> from one kafka topic and will be written in C#.  If possible, I’d really
> like to be able to manually download the necessary jars and do the kafka
> client installation as part of the docker image build, so that the
> dependencies already exist on disk.  If that makes any sense.
>
>
>
> Thank you
>
>
>
> *From:* Mich Talebzadeh [mailto:mich.talebza...@gmail.com]
> *Sent:* Friday, February 25, 2022 2:16 PM
> *To:* Michael Williams (SSI) 
> *Cc:* user@spark.apache.org
> *Subject:* Re: Spark Kafka Integration
>
>
>
> What is the use case? Is this for spark structured streaming?
>
>
>
> HTH
>
>
>
>
>
>
>
>
>
>
>
>
> On Fri, 25 Feb 2022 at 19:38, Michael Williams (SSI) <
> michael.willi...@ssigroup.com> wrote:
>
> After reviewing Spark's Kafka Integration guide, it indicates that
> spark-sql-kafka-0-10_2.12-3.2.1.jar and its dependencies are needed for
> Spark 3.2.1 (+ Scala 2.12) to work with Kafka.  Can anybody clarify the
> cleanest, most repeatable (reliable) way to acquire these jars for
> including in a Spark Docker image without introducing version compatibility
> issues?
>
>
>
> Thank you,
>
> Mike
>
>
>
> This electronic message may contain information that is Proprietary,
> Confidential, or legally privileged or protected. It is intended only for
> the use of the individual(s) and entity named in the message. If you are
> not an intended recipient of this message, please notify the sender
> immediately and delete the material from your computer. Do not deliver,
> distribute or copy this message and do not disclose its contents or take
> any action in reliance on the information it contains. Thank You.
>
>
>


Re: Spark Kafka Integration

2022-02-25 Thread Mich Talebzadeh
and what version of kafka do you have 2.7?

for spark 3.1.1 I needed these jar files to make it work

kafka-clients-2.7.0.jar
commons-pool2-2.9.0.jar
spark-streaming_2.12-3.1.1.jar
spark-sql-kafka-0-10_2.12-3.1.0.jar


HTH


   view my Linkedin profile



 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Fri, 25 Feb 2022 at 20:15, Mich Talebzadeh 
wrote:

> What is the use case? Is this for spark structured streaming?
>
> HTH
>
>
>
>view my Linkedin profile
> 
>
>
>  https://en.everybodywiki.com/Mich_Talebzadeh
>
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
>
> On Fri, 25 Feb 2022 at 19:38, Michael Williams (SSI) <
> michael.willi...@ssigroup.com> wrote:
>
>> After reviewing Spark's Kafka Integration guide, it indicates that
>> spark-sql-kafka-0-10_2.12-3.2.1.jar and its dependencies are needed for
>> Spark 3.2.1 (+ Scala 2.12) to work with Kafka.  Can anybody clarify the
>> cleanest, most repeatable (reliable) way to acquire these jars for
>> including in a Spark Docker image without introducing version compatibility
>> issues?
>>
>>
>>
>> Thank you,
>>
>> Mike
>>
>>
>> This electronic message may contain information that is Proprietary,
>> Confidential, or legally privileged or protected. It is intended only for
>> the use of the individual(s) and entity named in the message. If you are
>> not an intended recipient of this message, please notify the sender
>> immediately and delete the material from your computer. Do not deliver,
>> distribute or copy this message and do not disclose its contents or take
>> any action in reliance on the information it contains. Thank You.
>>
>


RE: Spark Kafka Integration

2022-02-25 Thread Michael Williams (SSI)
The use case is for spark structured streaming (a spark app will be launched by 
a worker service that monitors the kafka topic for new messages; once the 
messages are consumed, the spark app will terminate), but if there is a hitch 
here, it is that the Spark environment includes the MS dotnet for Spark 
wrapper, which means each spark app will consume from one kafka topic and 
will be written in C#.  If possible, I’d really like to be able to manually 
download the necessary jars and do the kafka client installation as part of the 
docker image build, so that the dependencies already exist on disk.  If that 
makes any sense.

Thank you

From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com]
Sent: Friday, February 25, 2022 2:16 PM
To: Michael Williams (SSI) 
Cc: user@spark.apache.org
Subject: Re: Spark Kafka Integration

What is the use case? Is this for spark structured streaming?

HTH




 




On Fri, 25 Feb 2022 at 19:38, Michael Williams (SSI)
<michael.willi...@ssigroup.com> wrote:
After reviewing Spark's Kafka Integration guide, it indicates that 
spark-sql-kafka-0-10_2.12-3.2.1.jar and its dependencies are needed for Spark 
3.2.1 (+ Scala 2.12) to work with Kafka.  Can anybody clarify the cleanest, 
most repeatable (reliable) way to acquire these jars for including in a Spark 
Docker image without introducing version compatibility issues?

Thank you,
Mike


This electronic message may contain information that is Proprietary, 
Confidential, or legally privileged or protected. It is intended only for the 
use of the individual(s) and entity named in the message. If you are not an 
intended recipient of this message, please notify the sender immediately and 
delete the material from your computer. Do not deliver, distribute or copy this 
message and do not disclose its contents or take any action in reliance on the 
information it contains. Thank You.





Re: Spark Kafka Integration

2022-02-25 Thread Sean Owen
That .jar is available on Maven, though typically you depend on it in your
app and compile an uber JAR, which will contain it and all its dependencies.
You can, I suppose, manage to compile an uber JAR from that dependency itself
with tools if needed.
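A minimal build.sbt sketch of that route (the project name and Scala patch
version are made up; sbt-assembly or an equivalent shading plugin is
assumed):

  name := "spark-kafka-poc"
  scalaVersion := "2.12.15"

  val sparkVers = "3.2.1"

  libraryDependencies ++= Seq(
    // Spark itself is provided by the cluster or image ...
    "org.apache.spark" %% "spark-sql" % sparkVers % "provided",
    // ... while the Kafka connector and its transitive dependencies
    // get bundled into the assembly (uber) jar
    "org.apache.spark" %% "spark-sql-kafka-0-10" % sparkVers
  )

Alternatively, spark-submit --packages
org.apache.spark:spark-sql-kafka-0-10_2.12:3.2.1 resolves the same
dependency and its transitive jars from Maven at submit time; this is the
form the structured streaming + Kafka guide documents.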

On Fri, Feb 25, 2022 at 1:37 PM Michael Williams (SSI) <
michael.willi...@ssigroup.com> wrote:

> After reviewing Spark's Kafka Integration guide, it indicates that
> spark-sql-kafka-0-10_2.12-3.2.1.jar and its dependencies are needed for
> Spark 3.2.1 (+ Scala 2.12) to work with Kafka.  Can anybody clarify the
> cleanest, most repeatable (reliable) way to acquire these jars for
> including in a Spark Docker image without introducing version compatibility
> issues?
>
>
>
> Thank you,
>
> Mike
>
>
> This electronic message may contain information that is Proprietary,
> Confidential, or legally privileged or protected. It is intended only for
> the use of the individual(s) and entity named in the message. If you are
> not an intended recipient of this message, please notify the sender
> immediately and delete the material from your computer. Do not deliver,
> distribute or copy this message and do not disclose its contents or take
> any action in reliance on the information it contains. Thank You.
>


Re: Spark Kafka Integration

2022-02-25 Thread Mich Talebzadeh
What is the use case? Is this for spark structured streaming?

HTH



   view my Linkedin profile



 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Fri, 25 Feb 2022 at 19:38, Michael Williams (SSI) <
michael.willi...@ssigroup.com> wrote:

> After reviewing Spark's Kafka Integration guide, it indicates that
> spark-sql-kafka-0-10_2.12-3.2.1.jar and its dependencies are needed for
> Spark 3.2.1 (+ Scala 2.12) to work with Kafka.  Can anybody clarify the
> cleanest, most repeatable (reliable) way to acquire these jars for
> including in a Spark Docker image without introducing version compatibility
> issues?
>
>
>
> Thank you,
>
> Mike
>
>
> This electronic message may contain information that is Proprietary,
> Confidential, or legally privileged or protected. It is intended only for
> the use of the individual(s) and entity named in the message. If you are
> not an intended recipient of this message, please notify the sender
> immediately and delete the material from your computer. Do not deliver,
> distribute or copy this message and do not disclose its contents or take
> any action in reliance on the information it contains. Thank You.
>


Re: Spark-Kafka integration - build failing with sbt

2017-06-19 Thread Cody Koeninger
org.apache.spark.streaming.kafka.KafkaUtils
is in the
spark-streaming-kafka-0-8
project
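As a sketch of the mapping (the 0-10 line is the analogous import, added
here for contrast; it is not from this thread):

  // from the spark-streaming-kafka-0-8 artifact
  import org.apache.spark.streaming.kafka.KafkaUtils
  // the spark-streaming-kafka-0-10 artifact uses a different package
  import org.apache.spark.streaming.kafka010.KafkaUtils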

On Mon, Jun 19, 2017 at 1:01 PM, karan alang  wrote:
> Hi Cody - I do have an additional basic question ..
>
> When I tried to compile the code in Eclipse, I was not able to do that
>
> eg.
> import org.apache.spark.streaming.kafka.KafkaUtils
>
> gave errors saying KafkaUtils was not part of the package.
> However, when i used sbt to compile - the compilation went through fine
>
> So, I assume additional libraries are being downloaded when I provide the
> appropriate packages in libraryDependencies?
> Which ones would have helped compile this?
>
>
>
> On Sat, Jun 17, 2017 at 2:53 PM, karan alang  wrote:
>>
>> Thanks, Cody .. yes, was able to fix that.
>>
>> On Sat, Jun 17, 2017 at 1:18 PM, Cody Koeninger 
>> wrote:
>>>
>>> There are different projects for different versions of kafka,
>>> spark-streaming-kafka-0-8 and spark-streaming-kafka-0-10
>>>
>>> See
>>>
>>> http://spark.apache.org/docs/latest/streaming-kafka-integration.html
>>>
>>> On Fri, Jun 16, 2017 at 6:51 PM, karan alang 
>>> wrote:
>>> > I'm trying to compile kafka & Spark Streaming integration code i.e.
>>> > reading
>>> > from Kafka using Spark Streaming,
>>> >   and the sbt build is failing with error -
>>> >
>>> >   [error] (*:update) sbt.ResolveException: unresolved dependency:
>>> > org.apache.spark#spark-streaming-kafka_2.11;2.1.0: not found
>>> >
>>> >   Scala version -> 2.10.7
>>> >   Spark Version -> 2.1.0
>>> >   Kafka version -> 0.9
>>> >   sbt version -> 0.13
>>> >
>>> > Contents of sbt files is as shown below ->
>>> >
>>> > 1)
>>> >   vi spark_kafka_code/project/plugins.sbt
>>> >
>>> >   addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.11.2")
>>> >
>>> >  2)
>>> >   vi spark_kafka_code/sparkkafka.sbt
>>> >
>>> > import AssemblyKeys._
>>> > assemblySettings
>>> >
>>> > name := "SparkKafka Project"
>>> >
>>> > version := "1.0"
>>> > scalaVersion := "2.11.7"
>>> >
>>> > val sparkVers = "2.1.0"
>>> >
>>> > // Base Spark-provided dependencies
>>> > libraryDependencies ++= Seq(
>>> >   "org.apache.spark" %% "spark-core" % sparkVers % "provided",
>>> >   "org.apache.spark" %% "spark-streaming" % sparkVers % "provided",
>>> >   "org.apache.spark" %% "spark-streaming-kafka" % sparkVers)
>>> >
>>> > mergeStrategy in assembly := {
>>> >   case m if m.toLowerCase.endsWith("manifest.mf") =>
>>> > MergeStrategy.discard
>>> >   case m if m.toLowerCase.startsWith("META-INF")  =>
>>> > MergeStrategy.discard
>>> >   case "reference.conf"   =>
>>> > MergeStrategy.concat
>>> >   case m if m.endsWith("UnusedStubClass.class")   =>
>>> > MergeStrategy.discard
>>> >   case _ => MergeStrategy.first
>>> > }
>>> >
>>> >   i launch sbt, and then try to create an eclipse project, complete
>>> > error is
>>> > as shown below -
>>> >
>>> >   -
>>> >
>>> >   sbt
>>> > [info] Loading global plugins from /Users/karanalang/.sbt/0.13/plugins
>>> > [info] Loading project definition from
>>> >
>>> > /Users/karanalang/Documents/Technology/Coursera_spark_scala/spark_kafka_code/project
>>> > [info] Set current project to SparkKafka Project (in build
>>> >
>>> > file:/Users/karanalang/Documents/Technology/Coursera_spark_scala/spark_kafka_code/)
>>> >> eclipse
>>> > [info] About to create Eclipse project files for your project(s).
>>> > [info] Updating
>>> >
>>> > {file:/Users/karanalang/Documents/Technology/Coursera_spark_scala/spark_kafka_code/}spark_kafka_code...
>>> > [info] Resolving org.apache.spark#spark-streaming-kafka_2.11;2.1.0 ...
>>> > [warn] module not found:
>>> > org.apache.spark#spark-streaming-kafka_2.11;2.1.0
>>> > [warn]  local: tried
>>> > [warn]
>>> >
>>> > /Users/karanalang/.ivy2/local/org.apache.spark/spark-streaming-kafka_2.11/2.1.0/ivys/ivy.xml
>>> > [warn]  activator-launcher-local: tried
>>> > [warn]
>>> >
>>> > /Users/karanalang/.activator/repository/org.apache.spark/spark-streaming-kafka_2.11/2.1.0/ivys/ivy.xml
>>> > [warn]  activator-local: tried
>>> > [warn]
>>> >
>>> > /Users/karanalang/Documents/Technology/SCALA/activator-dist-1.3.10/repository/org.apache.spark/spark-streaming-kafka_2.11/2.1.0/ivys/ivy.xml
>>> > [warn]  public: tried
>>> > [warn]
>>> >
>>> > https://repo1.maven.org/maven2/org/apache/spark/spark-streaming-kafka_2.11/2.1.0/spark-streaming-kafka_2.11-2.1.0.pom
>>> > [warn]  typesafe-releases: tried
>>> > [warn]
>>> >
>>> > http://repo.typesafe.com/typesafe/releases/org/apache/spark/spark-streaming-kafka_2.11/2.1.0/spark-streaming-kafka_2.11-2.1.0.pom
>>> > [warn]  typesafe-ivy-releasez: tried
>>> > [warn]
>>> >
>>> > http://repo.typesafe.com/typesafe/ivy-releases/org.apache.spark/spark-streaming-kafka_2.11/2.1.0/ivys/ivy.xml
>>> > [info] Resolving jline#jline;2.12.1 ...
>>> > [warn] ::
>>> > [warn] ::  UNRESOLVED 

Re: Spark-Kafka integration - build failing with sbt

2017-06-19 Thread karan alang
Hi Cody - I do have an additional basic question ..

When I tried to compile the code in Eclipse, I was not able to do that

eg.
import org.apache.spark.streaming.kafka.KafkaUtils

gave errors saying KafkaUtils was not part of the package.
However, when i used sbt to compile - the compilation went through fine

So, I assume additional libraries are being downloaded when I provide the
appropriate packages in libraryDependencies?
Which ones would have helped compile this?



On Sat, Jun 17, 2017 at 2:53 PM, karan alang  wrote:

> Thanks, Cody .. yes, was able to fix that.
>
> On Sat, Jun 17, 2017 at 1:18 PM, Cody Koeninger 
> wrote:
>
>> There are different projects for different versions of kafka,
>> spark-streaming-kafka-0-8 and spark-streaming-kafka-0-10
>>
>> See
>>
>> http://spark.apache.org/docs/latest/streaming-kafka-integration.html
>>
>> On Fri, Jun 16, 2017 at 6:51 PM, karan alang 
>> wrote:
>> > I'm trying to compile kafka & Spark Streaming integration code i.e.
>> reading
>> > from Kafka using Spark Streaming,
>> >   and the sbt build is failing with error -
>> >
>> >   [error] (*:update) sbt.ResolveException: unresolved dependency:
>> > org.apache.spark#spark-streaming-kafka_2.11;2.1.0: not found
>> >
>> >   Scala version -> 2.10.7
>> >   Spark Version -> 2.1.0
>> >   Kafka version -> 0.9
>> >   sbt version -> 0.13
>> >
>> > Contents of sbt files is as shown below ->
>> >
>> > 1)
>> >   vi spark_kafka_code/project/plugins.sbt
>> >
>> >   addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.11.2")
>> >
>> >  2)
>> >   vi spark_kafka_code/sparkkafka.sbt
>> >
>> > import AssemblyKeys._
>> > assemblySettings
>> >
>> > name := "SparkKafka Project"
>> >
>> > version := "1.0"
>> > scalaVersion := "2.11.7"
>> >
>> > val sparkVers = "2.1.0"
>> >
>> > // Base Spark-provided dependencies
>> > libraryDependencies ++= Seq(
>> >   "org.apache.spark" %% "spark-core" % sparkVers % "provided",
>> >   "org.apache.spark" %% "spark-streaming" % sparkVers % "provided",
>> >   "org.apache.spark" %% "spark-streaming-kafka" % sparkVers)
>> >
>> > mergeStrategy in assembly := {
>> >   case m if m.toLowerCase.endsWith("manifest.mf") =>
>> MergeStrategy.discard
>> >   case m if m.toLowerCase.startsWith("META-INF")  =>
>> MergeStrategy.discard
>> >   case "reference.conf"   =>
>> MergeStrategy.concat
>> >   case m if m.endsWith("UnusedStubClass.class")   =>
>> MergeStrategy.discard
>> >   case _ => MergeStrategy.first
>> > }
>> >
>> >   i launch sbt, and then try to create an eclipse project, complete
>> error is
>> > as shown below -
>> >
>> >   -
>> >
>> >   sbt
>> > [info] Loading global plugins from /Users/karanalang/.sbt/0.13/plugins
>> > [info] Loading project definition from
>> > /Users/karanalang/Documents/Technology/Coursera_spark_scala/
>> spark_kafka_code/project
>> > [info] Set current project to SparkKafka Project (in build
>> > file:/Users/karanalang/Documents/Technology/Coursera_spark_
>> scala/spark_kafka_code/)
>> >> eclipse
>> > [info] About to create Eclipse project files for your project(s).
>> > [info] Updating
>> > {file:/Users/karanalang/Documents/Technology/Coursera_spark_
>> scala/spark_kafka_code/}spark_kafka_code...
>> > [info] Resolving org.apache.spark#spark-streaming-kafka_2.11;2.1.0 ...
>> > [warn] module not found:
>> > org.apache.spark#spark-streaming-kafka_2.11;2.1.0
>> > [warn]  local: tried
>> > [warn]
>> > /Users/karanalang/.ivy2/local/org.apache.spark/spark-streami
>> ng-kafka_2.11/2.1.0/ivys/ivy.xml
>> > [warn]  activator-launcher-local: tried
>> > [warn]
>> > /Users/karanalang/.activator/repository/org.apache.spark/spa
>> rk-streaming-kafka_2.11/2.1.0/ivys/ivy.xml
>> > [warn]  activator-local: tried
>> > [warn]
>> > /Users/karanalang/Documents/Technology/SCALA/activator-dist-
>> 1.3.10/repository/org.apache.spark/spark-streaming-kafka_2.
>> 11/2.1.0/ivys/ivy.xml
>> > [warn]  public: tried
>> > [warn]
>> > https://repo1.maven.org/maven2/org/apache/spark/spark-stream
>> ing-kafka_2.11/2.1.0/spark-streaming-kafka_2.11-2.1.0.pom
>> > [warn]  typesafe-releases: tried
>> > [warn]
>> > http://repo.typesafe.com/typesafe/releases/org/apache/spark/
>> spark-streaming-kafka_2.11/2.1.0/spark-streaming-kafka_2.11-2.1.0.pom
>> > [warn]  typesafe-ivy-releasez: tried
>> > [warn]
>> > http://repo.typesafe.com/typesafe/ivy-releases/org.apache.
>> spark/spark-streaming-kafka_2.11/2.1.0/ivys/ivy.xml
>> > [info] Resolving jline#jline;2.12.1 ...
>> > [warn] ::
>> > [warn] ::  UNRESOLVED DEPENDENCIES ::
>> > [warn] ::
>> > [warn] :: org.apache.spark#spark-streaming-kafka_2.11;2.1.0: not
>> found
>> > [warn] ::
>> > [warn]
>> > [warn] Note: Unresolved dependencies path:
>> > [warn] 

Re: Spark-Kafka integration - build failing with sbt

2017-06-17 Thread karan alang
Thanks, Cody .. yes, was able to fix that.

On Sat, Jun 17, 2017 at 1:18 PM, Cody Koeninger  wrote:

> There are different projects for different versions of kafka,
> spark-streaming-kafka-0-8 and spark-streaming-kafka-0-10
>
> See
>
> http://spark.apache.org/docs/latest/streaming-kafka-integration.html
>
> On Fri, Jun 16, 2017 at 6:51 PM, karan alang 
> wrote:
> > I'm trying to compile kafka & Spark Streaming integration code i.e.
> reading
> > from Kafka using Spark Streaming,
> >   and the sbt build is failing with error -
> >
> >   [error] (*:update) sbt.ResolveException: unresolved dependency:
> > org.apache.spark#spark-streaming-kafka_2.11;2.1.0: not found
> >
> >   Scala version -> 2.10.7
> >   Spark Version -> 2.1.0
> >   Kafka version -> 0.9
> >   sbt version -> 0.13
> >
> > Contents of sbt files is as shown below ->
> >
> > 1)
> >   vi spark_kafka_code/project/plugins.sbt
> >
> >   addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.11.2")
> >
> >  2)
> >   vi spark_kafka_code/sparkkafka.sbt
> >
> > import AssemblyKeys._
> > assemblySettings
> >
> > name := "SparkKafka Project"
> >
> > version := "1.0"
> > scalaVersion := "2.11.7"
> >
> > val sparkVers = "2.1.0"
> >
> > // Base Spark-provided dependencies
> > libraryDependencies ++= Seq(
> >   "org.apache.spark" %% "spark-core" % sparkVers % "provided",
> >   "org.apache.spark" %% "spark-streaming" % sparkVers % "provided",
> >   "org.apache.spark" %% "spark-streaming-kafka" % sparkVers)
> >
> > mergeStrategy in assembly := {
> >   case m if m.toLowerCase.endsWith("manifest.mf") =>
> MergeStrategy.discard
> >   case m if m.toLowerCase.startsWith("META-INF")  =>
> MergeStrategy.discard
> >   case "reference.conf"   => MergeStrategy.concat
> >   case m if m.endsWith("UnusedStubClass.class")   =>
> MergeStrategy.discard
> >   case _ => MergeStrategy.first
> > }
> >
> >   i launch sbt, and then try to create an eclipse project, complete
> error is
> > as shown below -
> >
> >   -
> >
> >   sbt
> > [info] Loading global plugins from /Users/karanalang/.sbt/0.13/plugins
> > [info] Loading project definition from
> > /Users/karanalang/Documents/Technology/Coursera_spark_
> scala/spark_kafka_code/project
> > [info] Set current project to SparkKafka Project (in build
> > file:/Users/karanalang/Documents/Technology/Coursera_
> spark_scala/spark_kafka_code/)
> >> eclipse
> > [info] About to create Eclipse project files for your project(s).
> > [info] Updating
> > {file:/Users/karanalang/Documents/Technology/Coursera_
> spark_scala/spark_kafka_code/}spark_kafka_code...
> > [info] Resolving org.apache.spark#spark-streaming-kafka_2.11;2.1.0 ...
> > [warn] module not found:
> > org.apache.spark#spark-streaming-kafka_2.11;2.1.0
> > [warn]  local: tried
> > [warn]
> > /Users/karanalang/.ivy2/local/org.apache.spark/spark-
> streaming-kafka_2.11/2.1.0/ivys/ivy.xml
> > [warn]  activator-launcher-local: tried
> > [warn]
> > /Users/karanalang/.activator/repository/org.apache.spark/
> spark-streaming-kafka_2.11/2.1.0/ivys/ivy.xml
> > [warn]  activator-local: tried
> > [warn]
> > /Users/karanalang/Documents/Technology/SCALA/activator-
> dist-1.3.10/repository/org.apache.spark/spark-streaming-
> kafka_2.11/2.1.0/ivys/ivy.xml
> > [warn]  public: tried
> > [warn]
> > https://repo1.maven.org/maven2/org/apache/spark/spark-
> streaming-kafka_2.11/2.1.0/spark-streaming-kafka_2.11-2.1.0.pom
> > [warn]  typesafe-releases: tried
> > [warn]
> > http://repo.typesafe.com/typesafe/releases/org/apache/
> spark/spark-streaming-kafka_2.11/2.1.0/spark-streaming-
> kafka_2.11-2.1.0.pom
> > [warn]  typesafe-ivy-releasez: tried
> > [warn]
> > http://repo.typesafe.com/typesafe/ivy-releases/org.
> apache.spark/spark-streaming-kafka_2.11/2.1.0/ivys/ivy.xml
> > [info] Resolving jline#jline;2.12.1 ...
> > [warn] ::
> > [warn] ::  UNRESOLVED DEPENDENCIES ::
> > [warn] ::
> > [warn] :: org.apache.spark#spark-streaming-kafka_2.11;2.1.0: not
> found
> > [warn] ::
> > [warn]
> > [warn] Note: Unresolved dependencies path:
> > [warn] org.apache.spark:spark-streaming-kafka_2.11:2.1.0
> > (/Users/karanalang/Documents/Technology/Coursera_spark_
> scala/spark_kafka_code/sparkkafka.sbt#L12-16)
> > [warn]   +- sparkkafka-project:sparkkafka-project_2.11:1.0
> > [trace] Stack trace suppressed: run last *:update for the full output.
> > [error] (*:update) sbt.ResolveException: unresolved dependency:
> > org.apache.spark#spark-streaming-kafka_2.11;2.1.0: not found
> > [info] Updating
> > {file:/Users/karanalang/Documents/Technology/Coursera_
> spark_scala/spark_kafka_code/}spark_kafka_code...
> > [info] Resolving org.apache.spark#spark-streaming-kafka_2.11;2.1.0 ...
> > [warn] module not found:
> > 

Re: Spark-Kafka integration - build failing with sbt

2017-06-17 Thread Cody Koeninger
There are different projects for different versions of kafka,
spark-streaming-kafka-0-8 and spark-streaming-kafka-0-10

See

http://spark.apache.org/docs/latest/streaming-kafka-integration.html
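As a sketch, the concrete fix for the sbt file quoted below is to replace
the unversioned artifact with one of the two projects above; with 0.9
brokers the 0-8 connector is the matching choice, since the 0-10 connector
needs brokers on 0.10 or later:

  // in sparkkafka.sbt, change
  //   "org.apache.spark" %% "spark-streaming-kafka" % sparkVers
  // to
  "org.apache.spark" %% "spark-streaming-kafka-0-8" % sparkVers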

On Fri, Jun 16, 2017 at 6:51 PM, karan alang  wrote:
> I'm trying to compile kafka & Spark Streaming integration code i.e. reading
> from Kafka using Spark Streaming,
>   and the sbt build is failing with error -
>
>   [error] (*:update) sbt.ResolveException: unresolved dependency:
> org.apache.spark#spark-streaming-kafka_2.11;2.1.0: not found
>
>   Scala version -> 2.10.7
>   Spark Version -> 2.1.0
>   Kafka version -> 0.9
>   sbt version -> 0.13
>
> Contents of sbt files is as shown below ->
>
> 1)
>   vi spark_kafka_code/project/plugins.sbt
>
>   addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.11.2")
>
>  2)
>   vi spark_kafka_code/sparkkafka.sbt
>
> import AssemblyKeys._
> assemblySettings
>
> name := "SparkKafka Project"
>
> version := "1.0"
> scalaVersion := "2.11.7"
>
> val sparkVers = "2.1.0"
>
> // Base Spark-provided dependencies
> libraryDependencies ++= Seq(
>   "org.apache.spark" %% "spark-core" % sparkVers % "provided",
>   "org.apache.spark" %% "spark-streaming" % sparkVers % "provided",
>   "org.apache.spark" %% "spark-streaming-kafka" % sparkVers)
>
> mergeStrategy in assembly := {
>   case m if m.toLowerCase.endsWith("manifest.mf") => MergeStrategy.discard
>   case m if m.toLowerCase.startsWith("META-INF")  => MergeStrategy.discard
>   case "reference.conf"   => MergeStrategy.concat
>   case m if m.endsWith("UnusedStubClass.class")   => MergeStrategy.discard
>   case _ => MergeStrategy.first
> }
>
>   i launch sbt, and then try to create an eclipse project, complete error is
> as shown below -
>
>   -
>
>   sbt
> [info] Loading global plugins from /Users/karanalang/.sbt/0.13/plugins
> [info] Loading project definition from
> /Users/karanalang/Documents/Technology/Coursera_spark_scala/spark_kafka_code/project
> [info] Set current project to SparkKafka Project (in build
> file:/Users/karanalang/Documents/Technology/Coursera_spark_scala/spark_kafka_code/)
>> eclipse
> [info] About to create Eclipse project files for your project(s).
> [info] Updating
> {file:/Users/karanalang/Documents/Technology/Coursera_spark_scala/spark_kafka_code/}spark_kafka_code...
> [info] Resolving org.apache.spark#spark-streaming-kafka_2.11;2.1.0 ...
> [warn] module not found:
> org.apache.spark#spark-streaming-kafka_2.11;2.1.0
> [warn]  local: tried
> [warn]
> /Users/karanalang/.ivy2/local/org.apache.spark/spark-streaming-kafka_2.11/2.1.0/ivys/ivy.xml
> [warn]  activator-launcher-local: tried
> [warn]
> /Users/karanalang/.activator/repository/org.apache.spark/spark-streaming-kafka_2.11/2.1.0/ivys/ivy.xml
> [warn]  activator-local: tried
> [warn]
> /Users/karanalang/Documents/Technology/SCALA/activator-dist-1.3.10/repository/org.apache.spark/spark-streaming-kafka_2.11/2.1.0/ivys/ivy.xml
> [warn]  public: tried
> [warn]
> https://repo1.maven.org/maven2/org/apache/spark/spark-streaming-kafka_2.11/2.1.0/spark-streaming-kafka_2.11-2.1.0.pom
> [warn]  typesafe-releases: tried
> [warn]
> http://repo.typesafe.com/typesafe/releases/org/apache/spark/spark-streaming-kafka_2.11/2.1.0/spark-streaming-kafka_2.11-2.1.0.pom
> [warn]  typesafe-ivy-releasez: tried
> [warn]
> http://repo.typesafe.com/typesafe/ivy-releases/org.apache.spark/spark-streaming-kafka_2.11/2.1.0/ivys/ivy.xml
> [info] Resolving jline#jline;2.12.1 ...
> [warn] ::
> [warn] ::  UNRESOLVED DEPENDENCIES ::
> [warn] ::
> [warn] :: org.apache.spark#spark-streaming-kafka_2.11;2.1.0: not found
> [warn] ::
> [warn]
> [warn] Note: Unresolved dependencies path:
> [warn] org.apache.spark:spark-streaming-kafka_2.11:2.1.0
> (/Users/karanalang/Documents/Technology/Coursera_spark_scala/spark_kafka_code/sparkkafka.sbt#L12-16)
> [warn]   +- sparkkafka-project:sparkkafka-project_2.11:1.0
> [trace] Stack trace suppressed: run last *:update for the full output.
> [error] (*:update) sbt.ResolveException: unresolved dependency:
> org.apache.spark#spark-streaming-kafka_2.11;2.1.0: not found
> [info] Updating
> {file:/Users/karanalang/Documents/Technology/Coursera_spark_scala/spark_kafka_code/}spark_kafka_code...
> [info] Resolving org.apache.spark#spark-streaming-kafka_2.11;2.1.0 ...
> [warn] module not found:
> org.apache.spark#spark-streaming-kafka_2.11;2.1.0
> [warn]  local: tried
> [warn]
> /Users/karanalang/.ivy2/local/org.apache.spark/spark-streaming-kafka_2.11/2.1.0/ivys/ivy.xml
> [warn]  activator-launcher-local: tried
> [warn]
> /Users/karanalang/.activator/repository/org.apache.spark/spark-streaming-kafka_2.11/2.1.0/ivys/ivy.xml
> [warn]  activator-local: tried
> [warn]
> 

Re: Spark kafka integration issues

2016-09-14 Thread Cody Koeninger
Yeah, an updated version of that blog post is available at

https://github.com/koeninger/kafka-exactly-once
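The core idea of that approach, as a hedged sketch (dstream is assumed to be
an unmodified direct stream from KafkaUtils.createDirectStream; the cast is
only valid before any shuffle or repartition):

    import org.apache.spark.TaskContext
    import org.apache.spark.streaming.kafka.{HasOffsetRanges, OffsetRange}

    dstream.foreachRDD { rdd =>
      // per-partition offset ranges for this batch
      val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
      rdd.foreachPartition { iter =>
        val osr: OffsetRange = offsetRanges(TaskContext.get.partitionId)
        // write this partition's results and osr.untilOffset in one
        // transaction to your own store; restarting from the stored offsets
        // is what yields exactly-once output semantics
      }
    }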

On Wed, Sep 14, 2016 at 11:35 AM, Mukesh Jha  wrote:
> Thanks for the reply, Cody.
>
> I found the article below on the same topic very helpful. Thanks for the
> details, much appreciated.
>
> http://blog.cloudera.com/blog/2015/03/exactly-once-spark-streaming-from-apache-kafka/
>
> On Tue, Sep 13, 2016 at 8:14 PM, Cody Koeninger  wrote:
>>
>> 1. See
>> http://spark.apache.org/docs/latest/streaming-kafka-integration.html#approach-2-direct-approach-no-receivers
>> and look for HasOffsetRanges. If you really want the info per-message
>> rather than per-partition, createRDD has an overload that takes a
>> messageHandler from MessageAndMetadata to whatever you need.
>>
>> 2. createRDD takes type parameters for the key and value decoders, so
>> specify them there.
>>
>> 3. You can use spark-streaming-kafka-0-8 against 0.9 or 0.10 brokers.
>> There is a spark-streaming-kafka-0-10 package with additional features
>> that only works on brokers 0.10 or higher. A pull request documenting
>> it has been merged, but not yet deployed.
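As a concrete illustration of point 1, a sketch against
spark-streaming-kafka-0-8 on Spark 2.x; the broker address, topic name, and
offsets are placeholders:

    import kafka.common.TopicAndPartition
    import kafka.message.MessageAndMetadata
    import kafka.serializer.StringDecoder
    import org.apache.spark.SparkContext
    import org.apache.spark.streaming.kafka.{Broker, KafkaUtils, OffsetRange}

    def readWithMetadata(sc: SparkContext): Unit = {
      val kafkaParams = Map("metadata.broker.list" -> "localhost:9092")
      // topic, partition, fromOffset, untilOffset -- chosen up front
      val offsetRanges = Array(OffsetRange("mytopic", 0, 0L, 100L))

      // the messageHandler overload exposes per-message metadata; an empty
      // leaders map means partition leaders are looked up on the driver
      val rdd = KafkaUtils.createRDD[String, String, StringDecoder,
          StringDecoder, (String, Int, Long, String)](
        sc,
        kafkaParams,
        offsetRanges,
        Map.empty[TopicAndPartition, Broker],
        (mmd: MessageAndMetadata[String, String]) =>
          (mmd.topic, mmd.partition, mmd.offset, mmd.message()))

      rdd.take(5).foreach(println)
    }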
>>
>> On Tue, Sep 13, 2016 at 6:46 PM, Mukesh Jha 
>> wrote:
>> > Hello fellow sparkers,
>> >
>> > I'm using Spark to consume messages from Kafka in a non-streaming
>> > fashion, via spark-streaming-kafka-0-8_2.10 and Spark v2.0.
>> >
>> > I have a few queries; please get back if you have any clues.
>> >
>> > 1) Is there any way to get the topic, partition & offset information
>> > for each item from the KafkaRDD? I'm using
>> > KafkaUtils.createRDD[String, String, StringDecoder, StringDecoder] to
>> > create my Kafka RDD.
>> > 2) How can I pass my custom Decoder instead of using the String or
>> > Byte decoder? Are there any examples?
>> > 3) Is there a newer version to consume from kafka-0.10 & kafka-0.9
>> > clusters?
>> >
>> > --
>> > Thanks & Regards,
>> >
>> > Mukesh Jha
>
>
>
>
> --
>
>
> Thanks & Regards,
>
> Mukesh Jha




Re: Spark kafka integration issues

2016-09-14 Thread Mukesh Jha
Thanks for the reply, Cody.

I found the article below on the same topic very helpful. Thanks for the
details, much appreciated.

http://blog.cloudera.com/blog/2015/03/exactly-once-spark-streaming-from-apache-kafka/

On Tue, Sep 13, 2016 at 8:14 PM, Cody Koeninger  wrote:

> 1. See
> http://spark.apache.org/docs/latest/streaming-kafka-integration.html#approach-2-direct-approach-no-receivers
> and look for HasOffsetRanges. If you really want the info per-message
> rather than per-partition, createRDD has an overload that takes a
> messageHandler from MessageAndMetadata to whatever you need.
>
> 2. createRDD takes type parameters for the key and value decoders, so
> specify them there.
>
> 3. You can use spark-streaming-kafka-0-8 against 0.9 or 0.10 brokers.
> There is a spark-streaming-kafka-0-10 package with additional features
> that only works on brokers 0.10 or higher. A pull request documenting
> it has been merged, but not yet deployed.
>
> On Tue, Sep 13, 2016 at 6:46 PM, Mukesh Jha 
> wrote:
> > Hello fellow sparkers,
> >
> > I'm using Spark to consume messages from Kafka in a non-streaming
> > fashion, via spark-streaming-kafka-0-8_2.10 and Spark v2.0.
> >
> > I have a few queries; please get back if you have any clues.
> >
> > 1) Is there any way to get the topic, partition & offset information
> > for each item from the KafkaRDD? I'm using
> > KafkaUtils.createRDD[String, String, StringDecoder, StringDecoder] to
> > create my Kafka RDD.
> > 2) How can I pass my custom Decoder instead of using the String or
> > Byte decoder? Are there any examples?
> > 3) Is there a newer version to consume from kafka-0.10 & kafka-0.9
> > clusters?
> >
> > --
> > Thanks & Regards,
> >
> > Mukesh Jha
>



-- 


Thanks & Regards,

Mukesh Jha


Re: Spark kafka integration issues

2016-09-13 Thread Cody Koeninger
1. See
http://spark.apache.org/docs/latest/streaming-kafka-integration.html#approach-2-direct-approach-no-receivers
and look for HasOffsetRanges. If you really want the info per-message
rather than per-partition, createRDD has an overload that takes a
messageHandler from MessageAndMetadata to whatever you need.

2. createRDD takes type parameters for the key and value decoders, so
specify them there.

3. You can use spark-streaming-kafka-0-8 against 0.9 or 0.10 brokers.
There is a spark-streaming-kafka-0-10 package with additional features
that only works on brokers 0.10 or higher. A pull request documenting
it has been merged, but not yet deployed.
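For point 2, a sketch of a custom value decoder for the 0-8 API; MyRecord and
MyRecordDecoder are hypothetical names. Like StringDecoder, the class needs a
constructor taking VerifiableProperties, because the decoder is instantiated
reflectively where the partitions are read:

    import java.nio.charset.StandardCharsets

    import kafka.serializer.{Decoder, StringDecoder}
    import kafka.utils.VerifiableProperties

    // stand-in for whatever your messages actually contain
    case class MyRecord(raw: String)

    class MyRecordDecoder(props: VerifiableProperties = null)
        extends Decoder[MyRecord] {
      override def fromBytes(bytes: Array[Byte]): MyRecord =
        MyRecord(new String(bytes, StandardCharsets.UTF_8))
    }

    // usage: pass it as the value-decoder type parameter, e.g.
    //   KafkaUtils.createRDD[String, MyRecord, StringDecoder, MyRecordDecoder](
    //     sc, kafkaParams, offsetRanges)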

On Tue, Sep 13, 2016 at 6:46 PM, Mukesh Jha  wrote:
> Hello fellow sparkers,
>
> I'm using Spark to consume messages from Kafka in a non-streaming fashion,
> via spark-streaming-kafka-0-8_2.10 and Spark v2.0.
>
> I have a few queries; please get back if you have any clues.
>
> 1) Is there any way to get the topic, partition & offset information for
> each item from the KafkaRDD? I'm using
> KafkaUtils.createRDD[String, String, StringDecoder, StringDecoder] to create
> my Kafka RDD.
> 2) How can I pass my custom Decoder instead of using the String or Byte
> decoder? Are there any examples?
> 3) Is there a newer version to consume from kafka-0.10 & kafka-0.9 clusters?
>
> --
> Thanks & Regards,
>
> Mukesh Jha
