[jira] [Updated] (HIVE-24128) transactions cannot recognize bucket file

richt richt (Jira) Mon, 07 Sep 2020 19:54:09 -0700


     [ https://issues.apache.org/jira/browse/HIVE-24128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


richt richt updated HIVE-24128:
-------------------------------
    Description: 
 
 * Error while compiling statement: FAILED: SemanticException [Error 10141]: 
Bucketed table metadata is not correct. Fix the metadata or don't use bucketed 
mapjoin, by setting hive.enforce.bucketmapjoin to false. The number of buckets 
for table dcp partition ods_load_date=20200701 is 2, whereas the number of 
files is 1
 * The transactional table lays out its files as shown below:
{code:java}
-rw-r--r--   3 hadoop supergroup          1 2020-09-08 10:20 /usr/hive/warehouse/test_etl_dwd.db/dcp/ods_load_date=20200701/_orc_acid_version
drwxr-xr-x   - hadoop supergroup          0 2020-09-07 19:28 /usr/hive/warehouse/test_etl_dwd.db/dcp/ods_load_date=20200701/base_0000021
-rw-r--r--   3 hadoop supergroup   15401449 2020-09-08 10:20 /usr/hive/warehouse/test_etl_dwd.db/dcp/ods_load_date=20200701/bucket_00000
-rw-r--r--   3 hadoop supergroup   15408471 2020-09-08 10:20 /usr/hive/warehouse/test_etl_dwd.db/dcp/ods_load_date=20200701/bucket_00001
{code}

 * Hive puts the bucket files into the base** directory,
 * but when I run a merge on the table, Hive checks the number of bucket files 
with the following code:
{code:java}
public static List<String> getBucketFilePathsOfPartition(
      Path location, ParseContext pGraphContext) throws SemanticException {
    List<String> fileNames = new ArrayList<String>();
    try {
      FileSystem fs = location.getFileSystem(pGraphContext.getConf());
      FileStatus[] files = fs.listStatus(new Path(location.toString()), 
FileUtils.HIDDEN_FILES_PATH_FILTER);
      if (files != null) {
        for (FileStatus file : files) {
          fileNames.add(file.getPath().toString());
        }
      }
    } catch (IOException e) {
      throw new SemanticException(e);
    }
    return fileNames;
  }
{code}
It only checks the files directly under the partition directory; it does not 
look inside the base** directory.
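
For illustration, here is a minimal sketch of a listing that also descends into base_* directories. It is not Hive's code and not a proposed patch: it uses java.nio.file on a local file system instead of Hadoop's FileSystem, and the class and method names (AcidBucketLister, listBucketFiles, isHidden) are hypothetical. It also ignores delta directories and does not pick the valid base for a transaction snapshot, which a real fix would have to do.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, for illustration only (not part of Hive).
public class AcidBucketLister {

  // Mirrors the idea of a hidden-file filter: skip names starting with '_' or '.'.
  static boolean isHidden(Path p) {
    String name = p.getFileName().toString();
    return name.startsWith("_") || name.startsWith(".");
  }

  // Lists bucket files directly under the partition and inside any base_* directory.
  static List<String> listBucketFiles(Path partition) throws IOException {
    List<String> fileNames = new ArrayList<>();
    try (DirectoryStream<Path> entries = Files.newDirectoryStream(partition)) {
      for (Path entry : entries) {
        if (isHidden(entry)) {
          continue; // e.g. _orc_acid_version
        }
        String name = entry.getFileName().toString();
        if (Files.isDirectory(entry) && name.startsWith("base_")) {
          // Assumption: the bucket files of a compacted partition live one level down.
          try (DirectoryStream<Path> inner = Files.newDirectoryStream(entry)) {
            for (Path f : inner) {
              if (!isHidden(f) && Files.isRegularFile(f)) {
                fileNames.add(f.toString());
              }
            }
          }
        } else if (Files.isRegularFile(entry)) {
          fileNames.add(entry.toString());
        }
      }
    }
    Collections.sort(fileNames); // stable order so bucket_00000 comes first
    return fileNames;
  }
}
```

With a partition that contains only _orc_acid_version and base_0000021/bucket_0000{0,1}, this sketch returns the two bucket files, whereas a flat listing of the partition sees a single entry (the base directory), which matches the "number of files is 1" in the error above.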



> transactions cannot recognize bucket file 
> ------------------------------------------
>
>                 Key: HIVE-24128
>                 URL: https://issues.apache.org/jira/browse/HIVE-24128
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 3.1.1
>            Reporter: richt richt
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
