[
https://issues.apache.org/jira/browse/SENTRY-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sravya Tirukkovalur updated SENTRY-1295:
----------------------------------------
Description:
Paths in HMS are expected to be in one of these forms:
* hdfs://hostname:port/path
* hdfs:///path
* /path, in which case the scheme is constructed from FileSystem.getDefaultUri
* URIs with a non-hdfs scheme are simply ignored
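As a rough illustration of the rules above, here is a minimal Python sketch that sorts a location string into one of the listed forms. The function name and the returned labels are hypothetical, not Sentry or HMS code, and treating an hdfs URI with an empty path (such as hdfs://nameservice1) as malformed is an assumption based on the example reported in this issue.

```python
from urllib.parse import urlsplit


def classify_hms_path(location):
    """Classify a location string against the expected forms.

    Hypothetical helper for illustration only. Returns one of
    "hdfs-with-authority", "hdfs-no-authority", "bare-path",
    "ignored-scheme" or "malformed".
    """
    parts = urlsplit(location)
    if parts.scheme == "":
        # No scheme: only an absolute path is acceptable; the scheme would
        # then be filled in from the default file system.
        return "bare-path" if location.startswith("/") else "malformed"
    if parts.scheme != "hdfs":
        # Non-hdfs URIs are ignored rather than rejected.
        return "ignored-scheme"
    if not parts.path.startswith("/"):
        # Assumption: an hdfs URI without any path component is malformed.
        return "malformed"
    return "hdfs-with-authority" if parts.netloc else "hdfs-no-authority"


if __name__ == "__main__":
    for loc in ("hdfs://hostname:8020/path", "hdfs:///path", "/path",
                "file:///tmp/x", "hdfs://nameservice1"):
        print(loc, "->", classify_hms_path(loc))
```

The actual validation in Sentry may differ in detail; this only mirrors the four bullet points.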
I came across at least 2 Sentry users whose HMS had paths that do not
comply with the above rules, and hence HMS plugin initialization for path
updates failed. See SENTRY-1260 and SENTRY-1270 for details on how these
errors surface.
With SENTRY-1260 and SENTRY-1270 we should have more information on what these
malformed paths were. But we should continue to investigate and fix the root
cause; it would most likely be in the HMS code base. Until then, here is how
you can diagnose and fix it manually:
*Look for malformed paths in HMS*: Look in the DBS as well as the SDS tables.
{code}
SELECT "NAME", "DB_LOCATION_URI" FROM "DBS" WHERE NOT "DB_LOCATION_URI" LIKE
'hdfs://%/%';
   NAME    |   DB_LOCATION_URI
-----------+---------------------
 db_name   | hdfs://nameservice1
(1 row)
{code}
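The same check can be run against both tables. The sketch below is only an illustration: it uses an in-memory SQLite database as a stand-in for the real HMS backing database, the rows are made up, and the SDS column names (SD_ID, LOCATION) are assumed to match the standard HMS schema. The helper name find_suspect_paths is hypothetical.

```python
import sqlite3

# Stand-in for the HMS backing database; a real deployment would connect to
# its actual RDBMS instead. All rows below are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "DBS" ("DB_ID" INTEGER, "NAME" TEXT, "DB_LOCATION_URI" TEXT);
    CREATE TABLE "SDS" ("SD_ID" INTEGER, "LOCATION" TEXT);
    INSERT INTO "DBS" VALUES
        (1, 'good_db', 'hdfs://nameservice1/user/hive/warehouse/good_db.db');
    INSERT INTO "DBS" VALUES (2, 'db_name', 'hdfs://nameservice1');
    INSERT INTO "SDS" VALUES
        (10, 'hdfs://nameservice1/user/hive/warehouse/good_db.db/t1');
    INSERT INTO "SDS" VALUES (11, 'hdfs://nameservice1');
""")

# Same LIKE filter as the DBS query above, applied to each table.
CHECKS = {
    "DBS": 'SELECT "NAME", "DB_LOCATION_URI" FROM "DBS" '
           "WHERE NOT \"DB_LOCATION_URI\" LIKE 'hdfs://%/%'",
    "SDS": 'SELECT "SD_ID", "LOCATION" FROM "SDS" '
           "WHERE NOT \"LOCATION\" LIKE 'hdfs://%/%'",
}


def find_suspect_paths(connection):
    """Return rows whose location does not match hdfs://<authority>/<path>."""
    return {table: connection.execute(sql).fetchall()
            for table, sql in CHECKS.items()}


suspects = find_suspect_paths(conn)
for table, rows in suspects.items():
    print(table, rows)
```

Note that this filter also flags bare /path locations and non-hdfs URIs, which are acceptable per the list of expected forms, so the results still need a manual review.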
*Fix it manually by updating the HMS location*
{code}
-- Identifiers are quoted to match the SELECT above.
UPDATE "DBS"
SET "DB_LOCATION_URI"='hdfs://nameservice1/user/hive/warehouse/db_name.db'
WHERE "DB_ID"=12345;
{code}
Let's track occurrences of these malformed paths here:
* hdfs://nameservice1 : Not sure why anyone would create a db/table in the
root directory. Should we accept this in Sentry?
What does SKEWED_COL_VALUE_LOC_MAP.location in HMS correspond to? Double-check
whether there are any malformed paths here.
> Investigate malformed paths in HMS db
> -------------------------------------
>
> Key: SENTRY-1295
> URL: https://issues.apache.org/jira/browse/SENTRY-1295
> Project: Sentry
> Issue Type: Bug
> Reporter: Sravya Tirukkovalur