Thanks  Okumin.

I am new to Hive and Tez  and I have struggled to deploy a high-performance
Dockerized Hive setup. I followed the documentation for setting up a remote
Metastore. I have a single node with 32 GB of RAM and 8 cores, but I have a
dataset of about 2 GB (Iceberg table partitioned on one column). However,
when I run select queries, the performance has not been as fast as
expected. Could someone share some insights, especially regarding
hive-site.xml and Tez custom configuration?

Any help would be appreciated.

On Sun, Aug 4, 2024 at 4:46 PM Okumin <m...@okumin.com> wrote:

> Hi Clinton,
>
> I tested MERGE INTO with minimal reproduction. I saw the same error.
>
> ```
> CREATE TABLE src (col1 INT, col2 INT);
> CREATE TABLE dst (id BIGINT DEFAULT SURROGATE_KEY(), col1 INT, col2
> INT, PRIMARY KEY (id) DISABLE NOVALIDATE) STORED BY ICEBERG;
>
> MERGE INTO dst d USING src s ON s.col1 = d.col1
> WHEN MATCHED THEN UPDATE SET col2 = s.col2
> WHEN NOT MATCHED THEN INSERT (col1, col2) VALUES (s.col1, s.col2);
> ```
>
> The following query, which explicitly inserts `id`, succeeded on my
> machine. The default keyword is unlikely to work on INSERT on MERGE
> INTO. I've yet to investigate whether ANSI allows us to omit it.
>
> ```
> MERGE INTO dst d USING src s ON s.col1 = d.col1
> WHEN MATCHED THEN UPDATE SET col2 = s.col2
> WHEN NOT MATCHED THEN INSERT (id, col1, col2) VALUES (SURROGATE_KEY(),
> s.col1, s.col2);
> ```
>
> As another point, the SURROGATE_KEY might not work as you expected. It
> doesn't generate globally unique ids on my attempts.
>
> Regards,
> Okumin
>
> On Wed, Jul 31, 2024 at 4:54 PM clinton chikwata
> <clintonfun...@gmail.com> wrote:
> >
> > Dear Team,
> >
> > Any help will be much appreciated.
> >
> > Error SQL Error [40000] [42000]: Error while compiling statement:
> FAILED: SemanticException Schema of both sides of union should match.
> >
> > I have an ETL workload that stores data into temp_table with the schema
> as shown below.
> >
> > CREATE EXTERNAL TABLE IF NOT EXISTS temp_table (
> >     VC_ITEM_CODE STRING,
> >     VC_SUB_GROUP STRING,
> >     VC_PRODUCT_NAME STRING,
> >     VC_PRODUCT_UNIT STRING,
> >     VC_GROUP_CODE STRING,
> >     DT_VAT_START TIMESTAMP,
> >     VC_BAND_CODE STRING,
> >     VC_SEMI_BAND_CODE STRING,
> >     VC_DIVISIONS STRING,
> >     NU_UNIT_FACTOR DECIMAL(30, 0),
> >     VC_DIVISION_SEG_CODE STRING,
> >     VC_COLOR_COMB STRING,
> >     DT_MOD_DATE TIMESTAMP,
> >     VC_INACTIVE_PRODUCT STRING,
> >     RN DECIMAL(10, 0),
> >     country STRING
> >     )
> > STORED AS PARQUET
> > LOCATION 'S{path}'
> >
> > Then i want to load it to the final table
> >
> > CREATE TABLE product_dimension (
> >    `ID` BIGINT DEFAULT SURROGATE_KEY(),
> >     VC_ITEM_CODE STRING,
> >     VC_SUB_GROUP STRING,
> >     VC_PRODUCT_NAME STRING,
> >     VC_PRODUCT_UNIT STRING,
> >     VC_GROUP_CODE STRING,
> >     DT_VAT_START TIMESTAMP,
> >     VC_BAND_CODE STRING,
> >     VC_SEMI_BAND_CODE STRING,
> >     VC_DIVISIONS STRING,
> >     NU_UNIT_FACTOR DECIMAL(30, 0),
> >     VC_DIVISION_SEG_CODE STRING,
> >     VC_COLOR_COMB STRING,
> >     DT_MOD_DATE TIMESTAMP,
> >     VC_INACTIVE_PRODUCT STRING,
> >     RN DECIMAL(10, 0),
> >     country STRING,
> >     PRIMARY KEY (ID) DISABLE NOVALIDATE)
> > STORED BY ICEBERG;
> >
> > When I attempt to perform a merge operation on column  vc_item_code i
> get the error as shown above :
> >
> > MERGE
> > INTO
> > product_dimension AS c
> > USING (
> > SELECT
> > *
> > FROM
> > temp_table) AS s ON  s.vc_item_code = c.vc_item_code
> > AND s.country = c.country
> > WHEN MATCHED THEN
> > UPDATE
> > SET
> > vc_item_code = s.vc_item_code,
> > vc_sub_group = s.vc_sub_group,
> > vc_product_name = s.vc_product_name,
> > vc_product_unit = s.vc_product_unit,
> > vc_group_code = s.vc_group_code,
> > dt_vat_start = s.dt_vat_start,
> > vc_band_code = s.vc_band_code,
> > vc_semi_band_code = s.vc_semi_band_code,
> > vc_divisions = s.vc_divisions,
> > nu_unit_factor = s.nu_unit_factor,
> > vc_division_seg_code = s.vc_division_seg_code,
> > vc_color_comb = s.vc_color_comb,
> > dt_mod_date = s.dt_mod_date,
> > vc_inactive_product = s.vc_inactive_product,
> > rn = s.rn,
> > country = s.country
> > WHEN NOT MATCHED THEN
> > INSERT
> > (
> >     vc_item_code,
> > vc_sub_group,
> > vc_product_name,
> > vc_product_unit,
> > vc_group_code,
> > dt_vat_start,
> > vc_band_code,
> > vc_semi_band_code,
> > vc_divisions,
> > nu_unit_factor,
> > vc_division_seg_code,
> > vc_color_comb,
> > dt_mod_date,
> > vc_inactive_product,
> > rn,
> > country
> >     )
> > VALUES (
> > s.vc_item_code,
> > s.vc_sub_group,
> > s.vc_product_name,
> > s.vc_product_unit,
> > s.vc_group_code,
> > s.dt_vat_start,
> > s.vc_band_code,
> > s.vc_semi_band_code,
> > s.vc_divisions,
> > s.nu_unit_factor,
> > s.vc_division_seg_code,
> > s.vc_color_comb,
> > s.dt_mod_date,
> > s.vc_inactive_product,
> > s.rn,
> > s.country
> > );
> >
> > Warm Regards
>

Reply via email to