without commenting on any other part of this, note that it was in some hive
commit operations where a race condition in rename surfaced
https://issues.apache.org/jira/browse/HADOOP-16721

if you get odd errors about parent dirs not existing during renames,
that'll be it... upgrade to the Hadoop 3.3.1 binaries to fix it


On Thu, 11 Nov 2021 at 12:01, Bode, Meikel, NMA-CFD <
meikel.b...@bertelsmann.de> wrote:

> Hi all,
>
>
>
> I now have some more input related to the issues I face at the moment:
>
>
>
> When I try to UPDATE an external table via a JDBC connection to the
> HiveThrift2 server I get the following exception:
>
>
>
> java.lang.UnsupportedOperationException: UPDATE TABLE is not supported
> temporarily.
>
>
>
> When doing a DELETE I see:
>
>
>
> org.apache.spark.sql.AnalysisException: DELETE is only supported with v2
> tables.
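>
> For comparison, DELETE does seem to be supported against v2 sources;
> a sketch against a Delta table (table name is just an example, we
> have not verified this on our setup yet):
>
>             CREATE TABLE deltatab (id string, val string)
>
>                         USING DELTA LOCATION '/data/deltatab';
>
>             DELETE FROM deltatab WHERE id = '1';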
>
>
>
> INSERT is working as expected.
>
>
>
> We are using Spark 3.1.2 with Hadoop 3.2.0 and an external Hive 3.0.0
> metastore on K8S.
>
> Warehouse dir is located at AWS s3 attached using protocol s3a.
>
>
>
> What I have learned so far is that we need to use an ACID-compatible
> file format for external tables, such as ORC or Delta.
>
> In addition to that we would need to set some ACID related properties
> either as first commands after session creation or via appropriate
> configuration files:
>
>
>
> SET hive.support.concurrency=true;
>
> SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
>
> SET hive.enforce.sorting=true;
>
> SET hive.enforce.bucketing=true;
>
> SET hive.exec.dynamic.partition.mode=nonstrict;
>
> SET hive.compactor.initiator.on=true;
>
> SET hive.compactor.worker.threads=1;
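>
> One way to make these settings permanent, instead of issuing SET per
> session, might be a hive-site.xml fragment along these lines
> (untested sketch, only two of the properties shown):
>
>             <property>
>               <name>hive.support.concurrency</name>
>               <value>true</value>
>             </property>
>
>             <property>
>               <name>hive.txn.manager</name>
>               <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
>             </property>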
>
>
>
> Now, when I try to create the following table:
>
>
>
> create external table acidtab (id string, val string)
>
>             stored as ORC location '/data/acidtab.orc'
>
>             tblproperties ('transactional'='true');
>
>
>
> I see the following exception:
>
>
>
> org.apache.spark.sql.AnalysisException:
> org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:The
> table must be stored using an ACID compliant format (such as ORC):
> default.acidtab)
>
>
>
> Even though I create the table in ORC format, the exception still
> suggests using ORC as the format required for ACID compliance.
>
>
>
> Another point is that external tables are not deleted by the DROP
> TABLE command. They are only removed from the metastore but remain
> physically present in their S3 bucket.
>
>
>
> I tried with:
>
>
>
> SET `hive.metastore.thrift.delete-files-on-drop`=true;
>
>
>
> And also by setting:
>
>
>
> TBLPROPERTIES ('external.table.purge'='true')
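>
> Combined, the drop sequence I tried looks like this (using the
> acidtab table from above):
>
>             ALTER TABLE acidtab SET TBLPROPERTIES ('external.table.purge'='true');
>
>             DROP TABLE acidtab;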
>
>
>
>
>
> Any help on these issues would be very appreciated!
>
>
>
> Many thanks,
>
> Meikel Bode
>
>
>
> *From:* Bode, Meikel, NMA-CFD <meikel.b...@bertelsmann.de>
> *Sent:* Mittwoch, 10. November 2021 08:23
> *To:* user <u...@spark.apache.org>; dev <dev@spark.apache.org>
> *Subject:* HiveThrift2 ACID Transactions?
>
>
>
> Hi all,
>
>
>
> We want to apply INSERT, UPDATE, and DELETE operations on tables
> based on Parquet or ORC files served by thrift2.
>
> It is currently unclear to us whether and where we can enable them.
>
>
>
> At the moment, UPDATE and DELETE operations are blocked when we
> execute them.
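>
> A minimal example of the kind of statement that gets blocked (table
> and column names are placeholders):
>
>             UPDATE mytab SET val = 'new' WHERE id = '1';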
>
>
>
> Is anyone out there using ACID transactions in combination with
> thrift2?
>
>
>
> Best,
>
> Meikel
>
