Matthias Wies created IMPALA-11469:
--------------------------------------
Summary: Ignore _spark_metadata folder in table location
Key: IMPALA-11469
URL: https://issues.apache.org/jira/browse/IMPALA-11469
Project: IMPALA
Issue Type: Improvement
Components: Backend
Reporter: Matthias Wies
When Spark Streaming is used to write Parquet files to an external table, a
folder named _spark_metadata is created within the table's directory. Hive is
capable of dealing with this directory, but Impala trips on it.
As a result, REFRESH TABLE does not work, because Impala sees a directory with
data it cannot cope with. A SELECT also does not work, because it trips on the
_spark_metadata folder.
The issue was found in CDP 7.1.7 SP1, but I suspect it is present in all versions.
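To illustrate the requested behaviour, here is a minimal sketch of a file listing that skips the _spark_metadata folder while collecting data files under a table location. It is written in Python for brevity and is not Impala's actual file-loading code; the names IGNORED_DIR_NAMES and list_data_files are hypothetical, and a local directory stands in for the table location.

```python
import os

# Hypothetical ignore list for illustration only; not Impala's actual configuration.
IGNORED_DIR_NAMES = {"_spark_metadata"}


def list_data_files(table_location):
    """Return the sorted paths of all files under table_location,
    skipping any directory whose name is in IGNORED_DIR_NAMES."""
    data_files = []
    for dirpath, dirnames, filenames in os.walk(table_location):
        # Prune ignored directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if d not in IGNORED_DIR_NAMES]
        for name in filenames:
            data_files.append(os.path.join(dirpath, name))
    return sorted(data_files)
```

With a filter like this, the streaming sink's bookkeeping files would never be treated as table data, while Parquet files in the table directory and its partition subdirectories would still be picked up.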
Regards, Matthias