[ 
https://issues.apache.org/jira/browse/IMPALA-10042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17170393#comment-17170393
 ] 

kishorekumar kuruguntla edited comment on IMPALA-10042 at 8/7/20, 5:53 PM:
---------------------------------------------------------------------------

 

Thanks [~tarmstrong] for your reply. I have added my responses in blue font.

does "invalidate metadata <dbname.table_name>" help?

{color:#0747a6}I tried running Invalidate metadata and Refresh <table_name> 
multiple times; it doesn't help (I still get the stale metadata error).{color}

It would also be helpful to understand how that file was ingested. If files are 
overwritten in place, more caching-related issues are possible 
(we've fixed all the known ones in later versions, though). We had some related 
issues due to IMPALA-8561.

{color:#0747a6}Files are ingested from Hive Beeline.{color}
 {color:#0747a6}As part of our ETL, we create a staging table and copy data 
from the staging table to the final table using the LOAD INPATH command (files 
are overwritten in place). All of these commands are executed in Hive Beeline.{color}
 {color:#0747a6}We use Hive for our ETL process and Impala for dashboard 
queries.{color}
 {color:#0747a6}Once our ETL process has finished, we run the following commands:{color}
 {color:#0747a6}1. Invalidate metadata{color}
 {color:#0747a6}2. Refresh{color}
 {color:#0747a6}3. Compute stats{color}
 {color:#0747a6}Both Invalidate metadata and Refresh execute successfully, but 
Compute stats fails with the error (This could be due to stale metadata).{color}

{color:#0747a6}We are loading 134 tables, and this issue occurs for 2 or 3 random 
tables on random days.{color}
 {color:#0747a6}Interestingly, we recently upgraded to Impala 3.2.0, and we are 
now facing this issue.{color}

{color:#0747a6}*To fix this issue, I have to run the following commands in 
impala-shell:*{color}
 {color:#0747a6}ALTER TABLE <db_name>.<table_name> RENAME TO 
<db_name>.<table_name>_stage;{color}
 {color:#0747a6}CREATE TABLE <db_name>.<table_name> STORED AS PARQUET AS SELECT 
* FROM <db_name>.<table_name>_stage;{color}
 {color:#0747a6}DROP TABLE <db_name>.<table_name>_stage;{color}
 {color:#0747a6}COMPUTE STATS <db_name>.<table_name>;{color}
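
Because the workaround has to be repeated for whichever tables fail on a given day, the four statements above could be generated per table with a small helper. This is only a sketch: the function name `rebuild_statements` and the example names `sales_db` and `orders` are hypothetical, and the generated text would still need to be run through impala-shell by hand or by a wrapper of your choice.

```python
from typing import List


def rebuild_statements(db: str, table: str) -> List[str]:
    """Build the four workaround statements for one table.

    The statements mirror the manual steps listed above: rename the
    table to a *_stage copy, recreate it as Parquet from that copy,
    drop the copy, then recompute statistics.
    """
    full_name = f"{db}.{table}"
    stage_name = f"{full_name}_stage"
    return [
        f"ALTER TABLE {full_name} RENAME TO {stage_name}",
        f"CREATE TABLE {full_name} STORED AS PARQUET AS SELECT * FROM {stage_name}",
        f"DROP TABLE {stage_name}",
        f"COMPUTE STATS {full_name}",
    ]


if __name__ == "__main__":
    # Hypothetical database and table names, for illustration only.
    for statement in rebuild_statements("sales_db", "orders"):
        print(statement + ";")
```

The statements are returned in the order they must run; the DROP should only be issued once the CREATE TABLE AS SELECT has succeeded.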

{color:#0747a6}Thanks for sharing the information about IMPALA-8561. I think we 
are facing the same issue (the one mentioned in its workaround section).{color}

 

You could also help narrow down the issue by looking at the file size in "show 
files in <tablename>" in Impala and comparing it to the file size in HDFS - it 
will likely be different, which would explain why the version number - the last 
few bytes in the file - doesn't match.
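
For reference, a Parquet file with a plaintext footer ends with a 4-byte little-endian footer length followed by the 4-byte magic `PAR1`. Assuming the "version number" in the error refers to those trailing bytes, a local copy of the affected file (for example, one fetched with `hdfs dfs -get`) could be checked with a sketch like the one below. The function names `read_parquet_tail` and `has_valid_magic` are hypothetical, and this is not Impala's own validation code.

```python
import struct

PARQUET_MAGIC = b"PAR1"  # trailing magic of a Parquet file with a plaintext footer


def read_parquet_tail(path):
    """Return (footer_length, magic) read from the last 8 bytes of a local file."""
    with open(path, "rb") as f:
        f.seek(0, 2)  # whence=2 seeks relative to the end of the file
        size = f.tell()
        if size < 8:
            raise ValueError(f"{path} is too small to be a Parquet file ({size} bytes)")
        f.seek(size - 8)
        tail = f.read(8)
    # First 4 bytes: footer length (little-endian uint32); last 4 bytes: magic.
    footer_length = struct.unpack("<I", tail[:4])[0]
    return footer_length, tail[4:]


def has_valid_magic(path):
    """True if the file ends with the expected Parquet magic bytes."""
    return read_parquet_tail(path)[1] == PARQUET_MAGIC
```

If `has_valid_magic` returns True for the file as it currently sits in HDFS, the file itself is probably intact, which would point toward the reader using an outdated file length rather than toward a corrupt file.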

{color:#0747a6}As of now, I have overwritten the table files.{color}

{color:#0747a6}I expect this issue to occur again within 1 or 2 days, so next 
time I will compare the file sizes in Impala and HDFS.{color}
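
The size comparison could be scripted once both listings have been collected. The sketch below assumes that the sizes from "show files in" are rounded, human-readable strings such as `2.25MB` (so only an approximate comparison is possible) and that the HDFS sizes are exact byte counts, as printed by `hdfs dfs -ls`. The names `parse_size` and `find_size_mismatches` are hypothetical, and the caller is expected to normalize the paths so that both dictionaries use the same keys.

```python
# Assumed unit suffixes for the human-readable sizes (binary multiples).
UNITS = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3, "TB": 1024 ** 4}


def parse_size(text):
    """Convert a string such as '2.25MB' to an approximate byte count."""
    text = text.strip().upper()
    # Check longer suffixes first, since 'KB' also ends with 'B'.
    for unit in ("TB", "GB", "MB", "KB", "B"):
        if text.endswith(unit):
            return float(text[: -len(unit)]) * UNITS[unit]
    return float(text)  # no suffix: treat the value as plain bytes


def find_size_mismatches(impala_sizes, hdfs_sizes, rel_tol=0.01):
    """Return (path, impala_size, hdfs_bytes) for files whose sizes disagree.

    impala_sizes maps path -> human-readable size string.
    hdfs_sizes maps path -> size in bytes.
    A file missing from hdfs_sizes is reported with hdfs_bytes=None.
    """
    mismatches = []
    for path, shown in impala_sizes.items():
        actual = hdfs_sizes.get(path)
        if actual is None:
            mismatches.append((path, shown, None))
            continue
        # Allow for rounding in the displayed size via a relative tolerance.
        if abs(parse_size(shown) - actual) > rel_tol * max(actual, 1):
            mismatches.append((path, shown, actual))
    return mismatches
```

Any path returned by `find_size_mismatches` is a candidate for a file that was replaced in place after Impala last loaded its metadata.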



> ERROR: File has an invalid version number:  This could be due to stale 
> metadata. Try running "refresh <Table_name>".
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-10042
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10042
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend, Catalog
>    Affects Versions: Impala 3.2.0
>            Reporter: kishorekumar kuruguntla
>            Priority: Major
>         Attachments: image-2020-08-03-10-31-37-405.png
>
>
> Hi,
> I am performing ETL and loading a Parquet table through Hive.
> I ran invalidate metadata <db_name.table_name> in Impala.
> When I then run "compute stats <db_name.table_name>" or "Select 
> count ( * ) from <db_name.table_name>", I get a "File has an invalid version 
> number" error.
> *Full Error :*
> ERROR: File '<full_hdfs_path>' has an invalid version number: <Random_number>
>  This could be due to stale metadata. Try running "refresh 
> <db_name.table_name>".
>  
> !image-2020-08-03-10-31-37-405.png!
>  
> I tried running "refresh <db_name.table_name>" multiple times, but "Select 
> count ( * )" and "Compute stats" still do not work.
>  I am able to run "Select count ( * ) from <db_name.table_name>" in Hive 
> Beeline successfully, but in Impala it does not work.
>  
> It seems this issue was fixed in 
> https://issues.apache.org/jira/browse/IMPALA-2477 (Impala 2.3.0), 
>  but we are still facing it in *version 3.2.0-cdh6.2.1*.
>  
>  


