[ 
https://issues.apache.org/jira/browse/HDDS-6585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17523283#comment-17523283
 ] 

Ayush Saxena commented on HDDS-6585:
------------------------------------

The rename is failing here:
{noformat}
org.apache.hadoop.hive.ql.exec.MoveTask. Unable to move source 
[ofs://ozone1/ozone/hivetest/tmp/malus/data/Complex_data_type/data1.txt]
 to destination 
[ofs://ozone1/warehouse/tablespace/external/hive/malus1.db/table_stgejf3jlknmp/filename=1/data1.txt]
{noformat}
Most likely this is because rename isn’t supported across buckets. {*}Keep your 
source file in the same bucket as the Hive warehouse{*}. Alternatively, rather 
than creating an empty external table and then doing a load overwrite, try 
setting the location of the external table to the original file location when 
creating it.

If none of the above workarounds is feasible and you want to run the same 
query, I am not sure you can do much in the Ozone code. The most I can think of 
is a client config that introduces a fallback in rename: if rename is called 
across buckets, rather than throwing an exception, do a copy and then delete 
the source file after the copy succeeds. This might sound good, but it might 
not be feasible if a directory contains lots of files or the files are very 
large.
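
To make the fallback idea concrete, here is a rough sketch of the control flow only. Everything in it is an assumption for illustration: the Ops interface, renameOrCopy, and the fallbackEnabled flag are hypothetical names and are not part of the Ozone or Hadoop APIs.

```java
import java.io.IOException;

final class RenameFallbackSketch {

    /** Hypothetical stand-in for the file system operations involved. */
    interface Ops {
        boolean isSameBucket(String src, String dst);
        void rename(String src, String dst) throws IOException;
        void copy(String src, String dst) throws IOException;
        void delete(String path) throws IOException;
    }

    /**
     * Renames src to dst. If the two paths are in different buckets and the
     * (hypothetical) client config enables the fallback, copies the data and
     * deletes the source only after the copy has succeeded.
     *
     * @return true if the copy-and-delete fallback was used, false for a plain rename
     */
    static boolean renameOrCopy(Ops fs, String src, String dst, boolean fallbackEnabled)
            throws IOException {
        if (fs.isSameBucket(src, dst)) {
            fs.rename(src, dst);
            return false;
        }
        if (!fallbackEnabled) {
            // Current behaviour: cross-bucket rename is rejected.
            throw new IOException("Rename across buckets is not supported: "
                    + src + " -> " + dst);
        }
        fs.copy(src, dst);  // if this throws, the source is left untouched
        fs.delete(src);     // reached only after a successful copy
        return true;
    }
}
```

Note that, unlike a rename, the copy costs time proportional to the data size and is not atomic, which is why this may not work well for large directories or very large files.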

Otherwise, try something in the Hive code. AFAIK the needToCopy method does 
exactly this kind of fallback: rather than renaming, it does an actual copy. Here:
[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L4847-L4848]


I suppose this is already being called in this case as well (I didn’t actually 
check); if not, we can add a check like that. But it only checks whether the 
FileSystem is the same or not.
 
{code:java}
/**
 * If moving across different FileSystems or different encryption zones, need
 * to do a File copy instead of rename.
 * TODO- consider if need to do this for different file authority.
 * @throws HiveException
 */
static private boolean needToCopy(final HiveConf conf, Path srcf, Path destf,
    FileSystem srcFs, FileSystem destFs, String configuredOwner,
    boolean isManaged) throws HiveException {
  // Check if different FileSystems
  if (!FileUtils.equalsFileSystem(srcFs, destFs)) {
    return true;
  }
  ....
{code}
Ref: 
[https://github.com/apache/hive/blob/335749a33c02ec0dc7ebadb9eda101d174ad5faf/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L4972]
So things can be handled explicitly for Ozone here: rather than checking only 
the filesystem URI, check whether the paths are in different buckets if the 
filesystem is Ozone. I guess OfsPath has a util method, something like 
isInSameBucket, that can be used. But as of now Hive doesn’t have a dependency 
on Ozone, so I am not sure how much consensus or effort it would take to get 
it added there.
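
For illustration, a bucket comparison can also be sketched without any Ozone dependency by parsing the ofs URI directly, assuming the usual ofs layout of ofs://<om-service-id>/<volume>/<bucket>/<key>. The class and method names below are hypothetical and this is not the OfsPath utility mentioned above; it also ignores special cases such as mount points.

```java
import java.net.URI;
import java.util.Arrays;

final class OfsBucketCheck {

    /** Returns {authority, volume, bucket}; missing parts become empty strings. */
    static String[] bucketOf(URI uri) {
        String authority = uri.getAuthority() == null ? "" : uri.getAuthority();
        String path = uri.getPath() == null ? "" : uri.getPath();
        // "/volume/bucket/key..." splits into "", volume, bucket, rest-of-key
        String[] parts = path.split("/", 4);
        String volume = parts.length > 1 ? parts[1] : "";
        String bucket = parts.length > 2 ? parts[2] : "";
        return new String[] {authority, volume, bucket};
    }

    /** True if both URIs point into the same OM service, volume and bucket. */
    static boolean isInSameBucket(URI src, URI dst) {
        return Arrays.equals(bucketOf(src), bucketOf(dst));
    }

    public static void main(String[] args) {
        URI src = URI.create(
            "ofs://ozone1/ozone/hivetest/tmp/malus/data/Complex_data_type/data1.txt");
        URI dst = URI.create(
            "ofs://ozone1/warehouse/tablespace/external/hive/malus1.db/"
            + "table_stgejf3jlknmp/filename=1/data1.txt");
        // Source is in volume "ozone", bucket "hivetest"; destination is in
        // volume "warehouse", bucket "tablespace".
        System.out.println(isInSameBucket(src, dst)); // prints "false"
    }
}
```

With the paths from this issue, the source and destination end up in different volumes and buckets, which matches the rename failure above.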

> [Hive] [OFS] Execution error org.apache.hadoop.hive.ql.exec.MoveTask with 
> partition
> -----------------------------------------------------------------------------------
>
>                 Key: HDDS-6585
>                 URL: https://issues.apache.org/jira/browse/HDDS-6585
>             Project: Apache Ozone
>          Issue Type: Bug
>          Components: Ozone Manager
>    Affects Versions: 1.3.0
>            Reporter: Soumitra Sulav
>            Priority: Major
>
> Hive runtime error observed with load in the table with partition
> {code:java}
> jdbc:hive2://quasar-fcurus-7.quasar-fcurus> LOAD DATA INPATH 
> 'ofs://ozone1/ozone/hivetest/tmp/malus/data/Complex_data_type/data1.txt' 
> OVERWRITE INTO TABLE table_stgejf3jlknmp PARTITION (filename = '1');
> going to print operations logs
> printed operations logs
> Getting log thread is interrupted, since query is done!
> INFO  : Compiling 
> command(queryId=hive_20220223153616_d866979f-bc94-4a71-9866-0e423cfef9a1): 
> LOAD DATA INPATH 
> 'ofs://ozone1/ozone/hivetest/tmp/malus/data/Complex_data_type/data1.txt' 
> OVERWRITE INTO TABLE table_stgejf3jlknmp PARTITION (filename = '1')
> INFO  : Semantic Analysis Completed (retrial = false)
> INFO  : Created Hive schema: Schema(fieldSchemas:null, properties:null)
> INFO  : Completed compiling 
> command(queryId=hive_20220223153616_d866979f-bc94-4a71-9866-0e423cfef9a1); 
> Time taken: 0.155 seconds
> INFO  : Executing 
> command(queryId=hive_20220223153616_d866979f-bc94-4a71-9866-0e423cfef9a1): 
> LOAD DATA INPATH 
> 'ofs://ozone1/ozone/hivetest/tmp/malus/data/Complex_data_type/data1.txt' 
> OVERWRITE INTO TABLE table_stgejf3jlknmp PARTITION (filename = '1')
> INFO  : Starting task [Stage-0:MOVE] in serial mode
> INFO  : Loading data to table malus1.table_stgejf3jlknmp partition 
> (filename=1) from 
> ofs://ozone1/ozone/hivetest/tmp/malus/data/Complex_data_type/data1.txt
> ERROR : FAILED: Execution Error, return code 40000 from 
> org.apache.hadoop.hive.ql.exec.MoveTask. Unable to move source 
> ofs://ozone1/ozone/hivetest/tmp/malus/data/Complex_data_type/data1.txt to 
> destination 
> ofs://ozone1/warehouse/tablespace/external/hive/malus1.db/table_stgejf3jlknmp/filename=1/data1.txt
> INFO  : Completed executing 
> command(queryId=hive_20220223153616_d866979f-bc94-4a71-9866-0e423cfef9a1); 
> Time taken: 0.179 seconds
> INFO  : OK
> Error: Error while compiling statement: FAILED: Execution Error, return code 
> 40000 from org.apache.hadoop.hive.ql.exec.MoveTask. Unable to move source 
> ofs://ozone1/ozone/hivetest/tmp/malus/data/Complex_data_type/data1.txt to 
> destination 
> ofs://ozone1/warehouse/tablespace/external/hive/malus1.db/table_stgejf3jlknmp/filename=1/data1.txt
>  (state=08S01,code=40000)
> java.sql.SQLException: Error while compiling statement: FAILED: Execution 
> Error, return code 40000 from org.apache.hadoop.hive.ql.exec.MoveTask. Unable 
> to move source 
> ofs://ozone1/ozone/hivetest/tmp/malus/data/Complex_data_type/data1.txt to 
> destination 
> ofs://ozone1/warehouse/tablespace/external/hive/malus1.db/table_stgejf3jlknmp/filename=1/data1.txt
>     at 
> org.apache.hive.jdbc.HiveStatement.waitForOperationToComplete(HiveStatement.java:401)
>     at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:266)
>     at org.apache.hive.beeline.Commands.executeInternal(Commands.java:1007)
>     at org.apache.hive.beeline.Commands.execute(Commands.java:1217)
>     at org.apache.hive.beeline.Commands.sql(Commands.java:1146)
>     at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1499)
>     at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:1357)
>     at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:1136)
>     at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:1084)
>     at 
> org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:546)
>     at org.apache.hive.beeline.BeeLine.main(BeeLine.java:528)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
>     at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>     at org.apache.hadoop.util.RunJar.run(RunJar.java:318)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:232) {code}
> Steps to reproduce :
> {code:java}
> create external table table_stgejf3jlknmp (   k1 STRING,   f1 STRING,   f2 
> STRING,   sequence_num BIGINT,   create_bsk BIGINT,   opcode STRING,   
> create_ts STRING,   part1 STRING ) PARTITIONED BY (filename STRING) ROW 
> FORMAT DELIMITED FIELDS TERMINATED BY '|';
> LOAD DATA INPATH 
> 'ofs://ozone1/ozone/hivetest/tmp/malus/data/Complex_data_type/data1.txt' 
> OVERWRITE INTO TABLE table_stgejf3jlknmp PARTITION (filename = '1'){code}
> Keep file 
> '{{ofs://ozone1/ozone/hivetest/tmp/malus/data/Complex_data_type/data1.txt}}'
>  ready with the content below:
> {code:java}
> I1|A1|B1|1|100| |2014-09-01 00:00:00|P1
> I2|A2|B2|1|100| |2014-09-01 00:00:00|P1
> I3|A3|B3|1|100| |2014-09-01 00:00:00|P1
> I4|A4|B4|1|100| |2014-09-01 00:00:00|P1
> J1|A1|B1|1|100| |2014-09-01 00:00:00|P2
> J2|A2|B2|1|100| |2014-09-01 00:00:00|P2
> J3|A3|B3|1|100| |2014-09-01 00:00:00|P2
> J4|A4|B4|1|100| |2014-09-01 00:00:00|P2{code}



