[ https://issues.apache.org/jira/browse/SPARK-22528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-22528.
-------------------------------
    Resolution: Invalid

Move to the mailing list for now; not obviously something due to Spark.

> History service and non-HDFS filesystems
> ----------------------------------------
>
>                 Key: SPARK-22528
>                 URL: https://issues.apache.org/jira/browse/SPARK-22528
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.2.0
>            Reporter: paul mackles
>            Priority: Minor
>
> We are using Azure Data Lake (ADL) to store our event logs. This worked fine
> in 2.1.x, but in 2.2.0 the event logs are no longer visible to the history
> server. I tracked it down to the call to:
> {code}
> SparkHadoopUtil.get.checkAccessPermission()
> {code}
> which was added to "FSHistoryProvider" in 2.2.0.
> I was able to work around it by either:
> * setting the files on ADL to world readable
> * setting HADOOP_PROXY to the Azure objectId of the service principal that
> owns the file
> Neither of those workarounds is particularly desirable in our environment.
> That said, I am not sure how this should be addressed:
> * Is this an issue with the Azure/Hadoop bindings not setting up the user
> context correctly, so that the "checkAccessPermission()" call cannot succeed
> without using the username under which the process is running?
> * Is this an issue with "checkAccessPermission()" not really accounting for
> all of the possible FileSystem implementations? If so, I would imagine that
> there are similar issues when using S3.
> In spite of this check, I know the files are accessible through the
> underlying FileSystem object, so it feels like the latter, but I don't think
> that the FileSystem object alone could be used to implement this check.
> Any thoughts [~jerryshao]?
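
Both workarounds described in the report are consistent with a check that compares the name of the process user against the owner and group reported for the file, and then tests the matching permission bits. The following is a minimal, dependency-free Scala sketch of that kind of check. It is an illustration only and is not taken from Spark's source: `AccessCheckSketch`, `FileMeta`, `ProcessUser`, `canRead`, and the placeholder owner and group ids are all hypothetical names, and the assumption that the store reports owners as objectIds comes from the second workaround in the report.

```scala
// Simplified model of a POSIX-style read check. All names are hypothetical;
// this is NOT Spark's implementation of checkAccessPermission().
object AccessCheckSketch {

  // Stand-in for the owner, group, and permission string a FileSystem
  // reports for a file, e.g. perms = "rwxrwx---".
  final case class FileMeta(owner: String, group: String, perms: String)

  // Stand-in for the user the process runs as, plus its groups.
  final case class ProcessUser(name: String, groups: Set[String])

  // Pick the owner, group, or other triplet by comparing names,
  // then test that triplet's read bit.
  def canRead(user: ProcessUser, file: FileMeta): Boolean = {
    require(file.perms.length == 9, "perms must look like rw-r-----")
    if (user.name == file.owner) file.perms(0) == 'r'
    else if (user.groups.contains(file.group)) file.perms(3) == 'r'
    else file.perms(6) == 'r'
  }

  def main(args: Array[String]): Unit = {
    // Assumption: the store reports owner and group as opaque ids,
    // while the history server runs under an ordinary OS user name.
    val historyServer = ProcessUser("spark", Set("hadoop"))
    val eventLog = FileMeta("owner-object-id", "group-object-id", "rwxrwx---")

    // Names do not match, so only the "other" bits are consulted.
    println(canRead(historyServer, eventLog))                            // false

    // Workaround 1: make the file world readable.
    println(canRead(historyServer, eventLog.copy(perms = "rwxrwxr--")))  // true

    // Workaround 2: run as a user whose name equals the reported owner.
    println(canRead(ProcessUser("owner-object-id", Set.empty), eventLog)) // true
  }
}
```

In this model the file can be perfectly readable through the storage client's own credentials while a name-based comparison still rejects it, which matches the behaviour the reporter describes.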