Hi Kabeer,

I have added both dependencies and tried the build.
The only change is the version: I have used parquet-hadoop 1.8.1, since
parquet-avro is also 1.8.1.
The relevant part of the pom looks like this:

<parquet.version>1.8.1</parquet.version>

<dependency>
  <groupId>org.apache.parquet</groupId>
  <artifactId>parquet-avro</artifactId>
  <version>${parquet.version}</version>
  <!-- <scope>provided</scope> -->
</dependency>
<dependency>
  <groupId>org.apache.parquet</groupId>
  <artifactId>parquet-hadoop</artifactId>
  <version>${parquet.version}</version>
</dependency>
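
In case it is useful, the jar check suggested earlier in this thread can also be scripted. Below is a minimal Python sketch that lists the entries of a jar whose path contains a given class name, roughly what `jar -tvf shahida.jar | grep -i CompressionCodecName` does. The helper name find_class_entries is my own and is not part of Hudi or Parquet; it only relies on a jar being a zip archive.

```python
import zipfile


def find_class_entries(jar_path, name_fragment):
    """Return the jar entries whose path contains name_fragment (case-insensitive).

    Hypothetical helper for illustration: a jar is a zip archive, so the
    standard zipfile module can list its entries without a JDK installed.
    """
    needle = name_fragment.lower()
    with zipfile.ZipFile(jar_path) as jar:
        return [entry for entry in jar.namelist() if needle in entry.lower()]
```

Calling find_class_entries("shahida.jar", "CompressionCodecName") returns an empty list when the class was not packaged into the jar, which would match the NoClassDefFoundError reported below.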


Regards,
*Shahida R. Khan*


On Tue, Oct 15, 2019 at 7:12 PM Kabeer Ahmed <[email protected]> wrote:

> Thank you Shahida. Can you please confirm that you have included both the
> below dependencies and tried the build?
>
> If your build is missing parquet-hadoop, then the required class may not
> be found. If you have already included the below dependencies and it still
> doesn't work, I can upload a jar for you to try.
> <dependency>
> <groupId>org.apache.parquet</groupId>
> <artifactId>parquet-avro</artifactId>
> <version>${parquet.version}</version>
> <scope>provided</scope>
> </dependency>
>
> <dependency>
> <groupId>org.apache.parquet</groupId>
> <artifactId>parquet-hadoop</artifactId>
> <version>1.8.3</version>
> </dependency>
> On Oct 15 2019, at 2:28 pm, Shahida Khan <[email protected]>
> wrote:
> > Hi Kabeer,
> >
> > Thank you for the quick response!
> > Also, our project already includes the dependency below, which I believe
> > should include "org.apache.parquet.parquet-hadoop":
> >
> >
> > <dependency>
> > <groupId>org.apache.parquet</groupId>
> > <artifactId>parquet-avro</artifactId>
> > <version>${parquet.version}</version>
> > </dependency>
> >
> >
> > I have even checked with ```jar -tvf shahida.jar | grep -i
> > CompressionCodecName```: the class is not available in the jar even after
> > including the dependency in the build.
> >
> > Strangely, I have even provided the parquet-avro jar via spark-submit,
> > and it behaves differently for 1.7 and 1.8.
> > It seems some configuration is missing with respect to
> > HoodieStorageConfig.PARQUET_COMPRESSION_CODEC.
> >
> >
> >
> >
> > Regards,
> > Shahida R. Khan
> >
> > On Tue, Oct 15, 2019 at 4:24 PM Kabeer Ahmed <[email protected]
> (mailto:[email protected])> wrote:
> > > Shahida
> > >
> > > Welcome to Hudi. I am not an expert with DeltaStreamer as I do not use
> > > it. In general, I think this points to an issue with the build of the
> > > fat jar. It looks to me as if either you didn't build the fat jar to
> > > include all the dependencies or your classpath didn't include the jar
> > > needed.
> > > For some reason I didn't receive the full stack trace attachment.
> > > Either you forgot to attach it or the mail system blocked it.
> > > Can you please check:
> > > That your pom has the dependency shown below:
> > > <!-- https://mvnrepository.com/artifact/org.apache.parquet/parquet-hadoop -->
> > > <dependency>
> > > <groupId>org.apache.parquet</groupId>
> > > <artifactId>parquet-hadoop</artifactId>
> > > <version>1.8.3</version>
> > > </dependency>
> > >
> > > Can you also run ```jar -tvf shahida.jar | grep -i CompressionCodecName```
> > > and let us know the output that you see.
> > > Once we have the answers to the above, we can see what is missing and
> > > hopefully address it.
> > > Kabeer.
> > > On Oct 15 2019, at 10:39 am, Shahida Khan <[email protected]
> (mailto:[email protected])> wrote:
> > > > Hi All,
> > > >
> > > > Hope you are doing well.
> > > > I am currently trying to implement the Hudi Utilities using
> > > > DeltaStreamer. Below is the command line I am passing:
> > > >
> > > > spark2-submit --master yarn --deploy-mode cluster \
> > > >   --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
> > > >   /tmp/hudi-utilities-bundle-0.5.1-SNAPSHOT.jar \
> > > >   --props /user/oozie/dataops/hoodie/config.properties \
> > > >   --schemaprovider-class org.apache.hudi.utilities.schema.SchemaRegistryProvider \
> > > >   --source-class org.apache.hudi.utilities.sources.AvroKafkaSource \
> > > >   --source-ordering-field LastModified_dtmStamp \
> > > >   --target-base-path /tmp/hudi-deltastreamer-op_TEST \
> > > >   --target-table testTableHoodie --op UPSERT --enable-hive-sync \
> > > >   --storage-type MERGE_ON_READ
> > > >
> > > > I have also attached the config file.
> > > >
> > > > Unfortunately, while writing the files in parquet, it throws the
> > > > exception "java.lang.NoClassDefFoundError:
> > > > org/apache/parquet/hadoop/metadata/CompressionCodecName".
> > > > The full error trace is attached for your reference.
> > > >
> > > > There are a few warnings with respect to configuration, but I am not
> > > > sure if that's the problem.
> > > >
> > > > I have tried giving the classpath as well. I am not sure what I am
> > > > missing here.
> > > > It would be great if anybody could help me here.
> > > >
> > > > Hadoop version :- 2.6.0-cdh5.14.2
> > > > Spark version :- 2.3.0.cloudera2
> > > >
> > > >
> > > > Regards,
> > > > Shahida R. Khan
> > > > +91 9167538366
> > > >
> > >
> >
> > The information contained in this transmission may contain privileged
> and confidential information of Big Tree Entertainment Pvt Ltd, including
> information protected by privacy laws. It is intended only for the use of
> Big Tree Entertainment Pvt Ltd. If you are not the intended recipient, you
> are hereby notified that any review, dissemination, distribution, or
> duplication of this communication is strictly prohibited. If you are not
> the intended recipient, please contact the sender by reply email and
> destroy all copies of the original message. Although Big Tree Entertainment
> Pvt Ltd. has taken reasonable precautions to ensure no viruses are present
> in this email, Big Tree Entertainment Pvt Ltd. cannot accept responsibility
> for any loss or damage arising from the use of this email or attachments.
> Computer viruses can be transmitted via email. Recipient should check the
> email and any attachments for the presence of viruses before using them.
> Any views or opinions are solely those of the author and do not necessarily
> represent those of Big Tree Entertainment Pvt Ltd.
>
