Re: multi-partitioned hudi table | partitions not created

SATISH SIDNAKOPPA Mon, 29 Apr 2019 23:07:18 -0700

Hi Vinoth,

I created the multi-level partition as below, by deriving a single column
that holds the full partition path.

In the dataset:
concat('part1=',SUBSTR(emp_name,1,1),'/part2=','2018') as part_col

In the spark.write to the Hudi dataset:
.option(DataSourceWriteOptions.PARTITIONPATH_FIELD_OPT_KEY,"part_col")

files in hdfs


alter table hudi.emp_multi_partkey add partition(part1='A',part2='2018') ;
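
For reference, the value carried by the derived part_col column can be
sketched in plain Java. This is only an illustration of the path-building
logic under my own assumptions: the class PartitionPathSketch and its methods
buildPath and partCol are hypothetical names, not part of Hudi, and this is
not the KeyGenerator from the PR. It also assumes emp_name is never empty.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not a Hudi class: shows the path string that the
// derived part_col column holds for each record.
class PartitionPathSketch {

    // Joins field=value pairs with '/', in insertion order,
    // e.g. {dept=HR, region=AP} -> "dept=HR/region=AP".
    static String buildPath(Map<String, String> fields) {
        StringJoiner path = new StringJoiner("/");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            path.add(e.getKey() + "=" + e.getValue());
        }
        return path.toString();
    }

    // Mirrors concat('part1=', SUBSTR(emp_name,1,1), '/part2=', '2018').
    // Assumes empName is non-empty (substring would throw otherwise).
    static String partCol(String empName, String year) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("part1", empName.substring(0, 1));
        fields.put("part2", year);
        return buildPath(fields);
    }

    public static void main(String[] args) {
        // Prints the partition path for an employee named "Alice".
        System.out.println(partCol("Alice", "2018"));
    }
}
```

The same buildPath call with dept and region as the two fields gives paths
of the form dept=HR/region=AP from my original question.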




On Mon, Apr 29, 2019 at 8:30 PM Vinoth Chandar <[email protected]> wrote:

> Hi Satish,
>
> That's because the default KeyGenerator class reads only a single field
> to partition on. What you are expecting is a composite key.
>
> Nishith has one in the test suite PR:
>
> https://github.com/apache/incubator-hudi/pull/623/files#diff-8814d5eb596f19bc9a87e419453fd7c8
>
> We plan to add this to the main code. For now, you can copy the class and
> see if it solves your need. KeyGenerator is pluggable anyway.
>
> Thanks
> Vinoth
>
> On Mon, Apr 29, 2019 at 7:20 AM SATISH SIDNAKOPPA <
> [email protected]> wrote:
>
> > Hi Team,
> >
> >
> > I have to store data by department and region.
> > /dept=HR/region=AP
> > /dept=OPS/region=AP
> > /dept=HR/region=SA
> > /dept=OPS/region=SA
> >
> > So the partitioned table will have multiple partition keys.
> >
> >
> > I tried passing the value comma-separated (dept,region):
> > .option(DataSourceWriteOptions.PARTITIONPATH_FIELD_OPT_KEY,"dept,region")
> >
> > and dot-separated:
> > .option(DataSourceWriteOptions.PARTITIONPATH_FIELD_OPT_KEY,"dept.region")
> >
> > but the partitions were not created in HDFS. All the data was added to
> > the default partition.
> >
> >
> > Could you advise on the format for passing multiple partition fields
> > when writing a Hudi dataset with Spark?
> >
> > regards
> > Satish S
> >
>
