Ouch. We dropped support for Hive 1.x recently, but Hive 1.2.x might still
work. Is there a possibility of going one minor version up?

Balaji and Nishith know the gory details, and possibly deal with Hive 1.x
now and then.
Folks, is there any chance to make master work with Hive 1.x with some
custom changes?

On Mon, Oct 14, 2019 at 10:11 PM Gurudatt Kulkarni <guruak...@gmail.com>
wrote:

> Hi Vinoth,
>
> Thank you for the quick response, but using the master branch would mean
> building for Hive 2.x, and we are still working on Hive 1.1.0 :(
>
>
> On Mon, Oct 14, 2019 at 7:57 PM Vinoth Chandar <vin...@apache.org> wrote:
>
> > Hi Gurudatt,
> >
> > Thanks for reporting this. This seems like a class mismatch issue (going
> > by the particular stack trace). The master branch and the next
> > org.apache.hudi release have tons of fixes around this. Could you give
> > the master branch a shot by building it yourself?
> > <http://mail-archives.apache.org/mod_mbox/hudi-dev/201906.mbox/%3CCADTZSaV9GO=3uymyzy6vidjo_va-_98tqalokmmoakbcc1g...@mail.gmail.com%3E>
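> >
> > A minimal build sketch (assuming a standard Maven setup and that the repo
> > lives at the Apache GitHub mirror; adjust for your environment):
> >
> > ```
> > git clone https://github.com/apache/incubator-hudi.git
> > cd incubator-hudi
> > mvn clean package -DskipTests
> > ```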
> >
> > To achieve what you are trying to do, please see this old thread:
> > http://mail-archives.apache.org/mod_mbox/hudi-dev/201906.mbox/%3CCADTZSaV9GO=3uymyzy6vidjo_va-_98tqalokmmoakbcc1g...@mail.gmail.com%3E
> >
> > You also need to set the following properties:
> >
> > hoodie.datasource.write.keygenerator.class=org.apache.hudi.NonpartitionedKeyGenerator
> > hoodie.datasource.hive_sync.partition_extractor_class=org.apache.hudi.hive.NonPartitionedExtractor
> >
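> > For example, these would go into the same .properties file that is passed
> > to DeltaStreamer via --props; a quick sketch (the file path here is just a
> > placeholder):
> >
> > ```
> > cat >> /path/to/kafka-source.properties <<'EOF'
> > hoodie.datasource.write.keygenerator.class=org.apache.hudi.NonpartitionedKeyGenerator
> > hoodie.datasource.hive_sync.partition_extractor_class=org.apache.hudi.hive.NonPartitionedExtractor
> > EOF
> > ```
> >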
> > Balaji, can we FAQ your answer on that thread, since this is an often
> > asked question?
> >
> >
> > Thanks
> >
> > vinoth
> >
> >
> >
> > On Mon, Oct 14, 2019 at 4:58 AM Gurudatt Kulkarni <guruak...@gmail.com>
> > wrote:
> >
> > > Hi All,
> > >
> > > I am using HoodieDeltaStreamer (hoodie-0.4.7) to migrate a small table.
> > > The data is being written successfully in Parquet format, but the Hive
> > > sync fails.
> > >
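> > > For context, the job is launched roughly like the sketch below (the jar
> > > path, ordering field and target path are placeholders, and the hive sync
> > > flag is assumed; this is not the exact command):
> > >
> > > ```
> > > spark-submit \
> > >   --class com.uber.hoodie.utilities.deltastreamer.HoodieDeltaStreamer \
> > >   /path/to/hoodie-utilities.jar \
> > >   --storage-type COPY_ON_WRITE \
> > >   --source-class com.uber.hoodie.utilities.sources.AvroKafkaSource \
> > >   --schemaprovider-class com.uber.hoodie.utilities.schema.SchemaRegistryProvider \
> > >   --source-ordering-field <ordering_field> \
> > >   --target-base-path hdfs:///tmp/hoodie/table_name \
> > >   --target-table table_name \
> > >   --props /path/to/kafka-source.properties \
> > >   --enable-hive-sync
> > > ```
> > >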
> > > Here's the stack trace:
> > >
> > > 19/10/14 17:02:12 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
> > > 19/10/14 17:02:12 ERROR yarn.ApplicationMaster: User class threw exception: java.lang.ClassCastException: org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore cannot be cast to com.uber.hoodie.org.apache.hadoop_hive.metastore.PartitionExpressionProxy
> > > java.lang.ClassCastException: org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore cannot be cast to com.uber.hoodie.org.apache.hadoop_hive.metastore.PartitionExpressionProxy
> > >         at com.uber.hoodie.org.apache.hadoop_hive.metastore.ObjectStore.createExpressionProxy(ObjectStore.java:367)
> > >         at com.uber.hoodie.org.apache.hadoop_hive.metastore.ObjectStore.initialize(ObjectStore.java:345)
> > >         at com.uber.hoodie.org.apache.hadoop_hive.metastore.ObjectStore.setConf(ObjectStore.java:298)
> > >         at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
> > >         at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
> > >         at com.uber.hoodie.org.apache.hadoop_hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:60)
> > >         at com.uber.hoodie.org.apache.hadoop_hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:69)
> > >         at com.uber.hoodie.org.apache.hadoop_hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:682)
> > >         at com.uber.hoodie.org.apache.hadoop_hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:660)
> > >         at com.uber.hoodie.org.apache.hadoop_hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:709)
> > >         at com.uber.hoodie.org.apache.hadoop_hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:508)
> > >         at com.uber.hoodie.org.apache.hadoop_hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:78)
> > >         at com.uber.hoodie.org.apache.hadoop_hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:84)
> > >         at com.uber.hoodie.org.apache.hadoop_hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:6481)
> > >         at com.uber.hoodie.org.apache.hadoop_hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:207)
> > >         at com.uber.hoodie.org.apache.hadoop_hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:187)
> > >         at com.uber.hoodie.hive.HoodieHiveClient.<init>(HoodieHiveClient.java:102)
> > >         at com.uber.hoodie.hive.HiveSyncTool.<init>(HiveSyncTool.java:61)
> > >         at com.uber.hoodie.utilities.deltastreamer.HoodieDeltaStreamer.syncHive(HoodieDeltaStreamer.java:328)
> > >         at com.uber.hoodie.utilities.deltastreamer.HoodieDeltaStreamer.sync(HoodieDeltaStreamer.java:298)
> > >         at com.uber.hoodie.utilities.deltastreamer.HoodieDeltaStreamer.main(HoodieDeltaStreamer.java:469)
> > >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > >         at java.lang.reflect.Method.invoke(Method.java:498)
> > >         at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:688)
> > >
> > >
> > > Here are the properties that I am using:
> > >
> > >
> > > ```
> > >
> > > hoodie.upsert.shuffle.parallelism=2
> > > hoodie.insert.shuffle.parallelism=2
> > > hoodie.bulkinsert.shuffle.parallelism=2
> > >
> > > # Key fields, for kafka example
> > > hoodie.datasource.write.recordkey.field=<primary_key>
> > > hoodie.datasource.write.partitionpath.field=
> > > # schema provider configs
> > > hoodie.deltastreamer.schemaprovider.registry.url=http://localhost:8081/subjects/schema_name/versions/latest
> > > # Kafka Source
> > > hoodie.datasource.hive_sync.database=default
> > > hoodie.datasource.hive_sync.table=table_name
> > > hoodie.datasource.hive_sync.jdbcurl=jdbc:hive2://localhost:10000
> > > hoodie.datasource.hive_sync.partition_fields=
> > >
> > > hoodie.deltastreamer.source.kafka.topic=topic_name
> > > #Kafka props
> > > metadata.broker.list=localhost:9092
> > > auto.offset.reset=smallest
> > > schema.registry.url=http://localhost:8081
> > >
> > > ```
> > >
> > >
> > > The table does not have partitions, hence I have kept
> > > hoodie.datasource.write.partitionpath.field blank,
> > > so it is writing to the `default` directory.
> > >
> > > Also, the hoodie.datasource.hive_sync.partition_fields property is left
> > > blank for the same reason.
> > >
> > >
> > > Regards,
> > >
> > > Gurudatt
> > >
> >
>
