Hi Vinoth,

Thank you for the quick response, but using the master branch would mean
building for Hive 2.X, and we are still on Hive 1.1.0 :(
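
If building master turns out to be the only route, one thing we might try is
overriding the Hive version at build time, along the lines of the sketch
below. Whether master's pom actually exposes a hive.version property, and
whether the source even compiles against 1.1.0, are assumptions I have not
verified:

```
# Sketch only: build the master branch with the Hive dependency version
# overridden. Assumes the root pom exposes a `hive.version` property and
# that the code still compiles against Hive 1.1.0; neither is verified.
git clone https://github.com/apache/incubator-hudi.git
cd incubator-hudi
mvn clean package -DskipTests -Dhive.version=1.1.0
```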


On Mon, Oct 14, 2019 at 7:57 PM Vinoth Chandar <vin...@apache.org> wrote:

> Hi Gurudatt,
>
> Thanks for reporting this. This seems like a class mismatch issue, going
> by that particular stack trace. master and the next org.apache.hudi
> release have tons of fixes around this. Could you give the master branch a
> shot by building it yourself?
>
> To achieve what you are trying to do, please see this old thread.
>
> http://mail-archives.apache.org/mod_mbox/hudi-dev/201906.mbox/%3CCADTZSaV9GO=3uymyzy6vidjo_va-_98tqalokmmoakbcc1g...@mail.gmail.com%3E
>
> You also need to set the following properties:
>
> hoodie.datasource.write.keygenerator.class=org.apache.hudi.NonpartitionedKeyGenerator
> hoodie.datasource.hive_sync.partition_extractor_class=org.apache.hudi.hive.NonPartitionedExtractor
>
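For anyone who finds this thread later: combining those two settings with
blank partition fields gives a non-partitioned setup roughly like the sketch
below. The values are placeholders, and on the com.uber.hoodie 0.4.x line the
equivalent key generator should live under the com.uber.hoodie package
instead, so the class names need to match the version actually being run:

```
# Sketch of a non-partitioned DeltaStreamer setup (placeholder values).
hoodie.datasource.write.recordkey.field=<primary_key>
# No partition path, paired with a key generator that expects none.
hoodie.datasource.write.partitionpath.field=
hoodie.datasource.write.keygenerator.class=org.apache.hudi.NonpartitionedKeyGenerator
# Hive sync with no partition fields, plus the matching extractor.
hoodie.datasource.hive_sync.partition_fields=
hoodie.datasource.hive_sync.partition_extractor_class=org.apache.hudi.hive.NonPartitionedExtractor
```
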
> Balaji, can we FAQ your answer on that thread, since this is an
> often-asked question?
>
>
> Thanks
>
> vinoth
>
>
>
> On Mon, Oct 14, 2019 at 4:58 AM Gurudatt Kulkarni <guruak...@gmail.com>
> wrote:
>
> > Hi All,
> >
> > I am using HoodieDeltaStreamer (hoodie-0.4.7) to migrate a small table.
> > The data is being written successfully in Parquet format, but the Hive
> > sync fails.
> >
> > Here's the stack trace:
> >
> > 19/10/14 17:02:12 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
> > 19/10/14 17:02:12 ERROR yarn.ApplicationMaster: User class threw exception: java.lang.ClassCastException: org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore cannot be cast to com.uber.hoodie.org.apache.hadoop_hive.metastore.PartitionExpressionProxy
> > java.lang.ClassCastException: org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore cannot be cast to com.uber.hoodie.org.apache.hadoop_hive.metastore.PartitionExpressionProxy
> >         at com.uber.hoodie.org.apache.hadoop_hive.metastore.ObjectStore.createExpressionProxy(ObjectStore.java:367)
> >         at com.uber.hoodie.org.apache.hadoop_hive.metastore.ObjectStore.initialize(ObjectStore.java:345)
> >         at com.uber.hoodie.org.apache.hadoop_hive.metastore.ObjectStore.setConf(ObjectStore.java:298)
> >         at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
> >         at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
> >         at com.uber.hoodie.org.apache.hadoop_hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:60)
> >         at com.uber.hoodie.org.apache.hadoop_hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:69)
> >         at com.uber.hoodie.org.apache.hadoop_hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:682)
> >         at com.uber.hoodie.org.apache.hadoop_hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:660)
> >         at com.uber.hoodie.org.apache.hadoop_hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:709)
> >         at com.uber.hoodie.org.apache.hadoop_hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:508)
> >         at com.uber.hoodie.org.apache.hadoop_hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:78)
> >         at com.uber.hoodie.org.apache.hadoop_hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:84)
> >         at com.uber.hoodie.org.apache.hadoop_hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:6481)
> >         at com.uber.hoodie.org.apache.hadoop_hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:207)
> >         at com.uber.hoodie.org.apache.hadoop_hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:187)
> >         at com.uber.hoodie.hive.HoodieHiveClient.<init>(HoodieHiveClient.java:102)
> >         at com.uber.hoodie.hive.HiveSyncTool.<init>(HiveSyncTool.java:61)
> >         at com.uber.hoodie.utilities.deltastreamer.HoodieDeltaStreamer.syncHive(HoodieDeltaStreamer.java:328)
> >         at com.uber.hoodie.utilities.deltastreamer.HoodieDeltaStreamer.sync(HoodieDeltaStreamer.java:298)
> >         at com.uber.hoodie.utilities.deltastreamer.HoodieDeltaStreamer.main(HoodieDeltaStreamer.java:469)
> >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >         at java.lang.reflect.Method.invoke(Method.java:498)
> >         at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:688)
> >
> >
> > Here are the properties I am using:
> >
> > ```
> > hoodie.upsert.shuffle.parallelism=2
> > hoodie.insert.shuffle.parallelism=2
> > hoodie.bulkinsert.shuffle.parallelism=2
> >
> > # Key fields, for kafka example
> > hoodie.datasource.write.recordkey.field=<primary_key>
> > hoodie.datasource.write.partitionpath.field=
> > # Schema provider configs
> > hoodie.deltastreamer.schemaprovider.registry.url=http://localhost:8081/subjects/schema_name/versions/latest
> > # Hive sync configs
> > hoodie.datasource.hive_sync.database=default
> > hoodie.datasource.hive_sync.table=table_name
> > hoodie.datasource.hive_sync.jdbcurl=jdbc:hive2://localhost:10000
> > hoodie.datasource.hive_sync.partition_fields=
> > # Kafka source
> > hoodie.deltastreamer.source.kafka.topic=topic_name
> > # Kafka props
> > metadata.broker.list=localhost:9092
> > auto.offset.reset=smallest
> > schema.registry.url=http://localhost:8081
> > ```
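
For completeness, a DeltaStreamer invocation that pairs with a props file
like the one above would look roughly like this sketch. The main class
matches the stack trace; the jar name, source class, schema provider, and
flags are recalled from the 0.4.x utilities and should be double-checked
against the actual build:

```
# Sketch of launching HoodieDeltaStreamer on the 0.4.x line. The jar name,
# source class, schema provider, and flags are assumptions to verify.
spark-submit \
  --class com.uber.hoodie.utilities.deltastreamer.HoodieDeltaStreamer \
  hoodie-utilities-0.4.7.jar \
  --props file:///path/to/kafka-source.properties \
  --schemaprovider-class com.uber.hoodie.utilities.schema.SchemaRegistryProvider \
  --source-class com.uber.hoodie.utilities.sources.AvroKafkaSource \
  --source-ordering-field <ordering_field> \
  --target-base-path hdfs:///path/to/table_name \
  --target-table table_name \
  --storage-type COPY_ON_WRITE \
  --enable-hive-sync
```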
> >
> >
> > The table does not have partitions, hence I have kept
> > hoodie.datasource.write.partitionpath.field blank, so it is writing to
> > the `default` directory. The
> > hoodie.datasource.hive_sync.partition_fields property is also left blank
> > for the same reason.
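
With a blank partition path, all records land in that single pseudo-partition,
so the dataset layout ends up along these lines (file names are invented for
illustration):

```
/path/to/table_name/
  .hoodie/     <- commit timeline and table metadata
  default/     <- the single pseudo-partition holding every record
    <file_id>_<commit_time>.parquet
```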
> >
> >
> > Regards,
> >
> > Gurudatt
> >
>
