Hi Steve, 

You’re correct about the '--packages' option; it seems my memory does not serve 
me well :)
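Since the two options came up in this thread, here is a minimal sketch of the 
difference (the coordinates and main class are taken from the thread below; 
the jar name 'my-job.jar' is just a placeholder):

```shell
# --packages takes Maven coordinates; Spark resolves the artifact and its
# transitive dependencies via Ivy, so no fat/uber jar is needed:
spark-submit \
  --packages org.apache.spark:spark-avro_2.12:3.2.0 \
  --class xsys.fileformats.SparkSQLvsAvro \
  my-job.jar

# --jars takes explicit local/remote jar files; transitive dependencies
# are NOT resolved for you, so each required jar must be listed:
spark-submit \
  --jars spark-avro_2.12-3.2.0.jar \
  --class xsys.fileformats.SparkSQLvsAvro \
  my-job.jar
```

When running inside an IDE rather than through spark-submit, neither option 
applies, so the dependency has to be declared in the build file instead (e.g. 
org.apache.spark:spark-avro_2.12 in the pom.xml, matching the Spark/Scala 
version), which is consistent with what I observed below.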

On 2022/02/15 07:04:27 Stephen Coy wrote:
> Hi Morven,
> 
> We use --packages for all of our spark jobs. Spark downloads the specified jar 
> and all of its dependencies from a Maven repository.
> 
> This means we never have to build fat or uber jars.
> 
> It does mean that the Apache Ivy configuration has to be set up correctly 
> though.
> 
> Cheers,
> 
> Steve C
> 
> > On 15 Feb 2022, at 5:58 pm, Morven Huang <mo...@gmail.com> wrote:
> >
> > I wrote a toy spark job and ran it within my IDE, same error if I don’t add 
> > spark-avro to my pom.xml. After putting spark-avro dependency to my 
> > pom.xml, everything works fine.
> >
> > Another thing is, if my memory serves me right, the spark-submit options 
> > for extra jars is ‘--jars’ , not ‘--packages’.
> >
> > Regards,
> >
> > Morven Huang
> >
> >
> > On 2022/02/10 03:25:28 "Karanika, Anna" wrote:
> >> Hello,
> >>
> >> I have been trying to use spark SQL’s operations that are related to the 
> >> Avro file format,
> >> e.g., stored as, save, load, in a Java class but they keep failing with 
> >> the following stack trace:
> >>
> >> Exception in thread "main" org.apache.spark.sql.AnalysisException:  Failed 
> >> to find data source: avro. Avro is built-in but external data source 
> >> module since Spark 2.4. Please deploy the application as per the 
> >> deployment section of "Apache Avro Data Source Guide".
> >>        at 
> >> org.apache.spark.sql.errors.QueryCompilationErrors$.failedToFindAvroDataSourceError(QueryCompilationErrors.scala:1032)
> >>        at 
> >> org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:666)
> >>        at 
> >> org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSourceV2(DataSource.scala:720)
> >>        at 
> >> org.apache.spark.sql.DataFrameWriter.lookupV2Provider(DataFrameWriter.scala:852)
> >>        at 
> >> org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:256)
> >>        at 
> >> org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:239)
> >>        at xsys.fileformats.SparkSQLvsAvro.main(SparkSQLvsAvro.java:57)
> >>        at 
> >> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> >> Method)
> >>        at 
> >> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:64)
> >>        at 
> >> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>        at java.base/java.lang.reflect.Method.invoke(Method.java:564)
> >>        at 
> >> org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
> >>        at 
> >> org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:955)
> >>        at 
> >> org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
> >>        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
> >>        at 
> >> org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
> >>        at 
> >> org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1043)
> >>        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1052)
> >>        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> >>
> >> For context, I am invoking spark-submit and adding arguments --packages 
> >> org.apache.spark:spark-avro_2.12:3.2.0.
> >> Yet, Spark responds as if the dependency was not added.
> >> I am running spark-v3.2.0 (Scala 2.12).
> >>
> >> On the other hand, everything works great with spark-shell or spark-sql.
> >>
> >> I would appreciate any advice or feedback to get this running.
> >>
> >> Thank you,
> >> Anna
> >>
> >>
> > ---------------------------------------------------------------------
> > To unsubscribe e-mail: user-unsubscr...@spark.apache.org
> >
> 