[
https://issues.apache.org/jira/browse/OOZIE-1829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993827#comment-13993827
]
Ben Roling commented on OOZIE-1829:
-----------------------------------
It appears the service also doesn't fully support URI schemes where there is no
authority. For example, you might have a Kite URI like this:
repo:hive?dataset-name=Person&partition-key=[201405091300]
When getAuthorityWithScheme() is called it would return "/" since it fails to
find an authority in the URI. It looks to me like this will cause Oozie to
fall back on the default URIHandler. It seems the usage of the
getAuthorityWithScheme() method is currently limited to enabling a
determination of PUSH vs PULL from CoordCommandUtils.materializeDataEvents()
and as such I would expect the impact would just be that the PULL model would
always be used even though the Kite URIHandler might wish to specify a PUSH
model.
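
To illustrate the no-authority case, here is a minimal sketch using only
java.net.URI. It is not Oozie's code: the class and helper names are
hypothetical, and the hdfs URI is a made-up contrast case. It shows that the
JDK treats such a Kite URI as opaque, so there is no authority component for
a caller to pick up:

```java
import java.net.URI;

public class OpaqueUriDemo {
    // Hypothetical helper: the authority as java.net.URI sees it, or null if there is none.
    public static String authorityOf(String uri) {
        return URI.create(uri).getAuthority();
    }

    // Hypothetical helper: true when the scheme-specific part does not start with '/'.
    public static boolean isOpaque(String uri) {
        return URI.create(uri).isOpaque();
    }

    public static void main(String[] args) {
        String kiteUri = "repo:hive?dataset-name=Person&partition-key=[201405091300]";
        String hdfsUri = "hdfs://namenode:8020/data/person"; // assumed example for contrast

        System.out.println(kiteUri + " opaque=" + isOpaque(kiteUri)
                + " authority=" + authorityOf(kiteUri));
        System.out.println(hdfsUri + " opaque=" + isOpaque(hdfsUri)
                + " authority=" + authorityOf(hdfsUri));
    }
}
```

Any lookup keyed on scheme plus authority has nothing to match for the Kite
URI, which is consistent with the fallback behavior described above.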
> URIHandlerService doesn't support URI schemes with query strings but no path
> segment
> ------------------------------------------------------------------------------------
>
> Key: OOZIE-1829
> URL: https://issues.apache.org/jira/browse/OOZIE-1829
> Project: Oozie
> Issue Type: Bug
> Components: core
> Affects Versions: 4.0.1
> Reporter: Ben Roling
>
> While working on a prototype of integration between Oozie and the Kite SDK
> (see https://issues.cloudera.org/browse/CDK-385), I came to find that
> URIHandlerService.getAuthorityWithScheme(String uri) doesn't support URI
> schemes where there is a query string, but no path segment.
> I am currently prototyping Kite Dataset URIs and in my prototype, a Dataset
> URI for a dataset in a Hive/HCatalog DatasetRepository with managed
> Hive/HCatalog tables could look like this:
> repo:hive://localhost:9043?dataset-name=Person&partition-key=[201405091300]
> I am attempting to create an Oozie dataset around this Kite dataset and to
> make that happen I have implemented Oozie's URIHandler API for the Kite
> "repo" URI scheme. When I attempted to run my first coordinator, it failed.
> The coordinator has the following dataset definition:
> {code}
> <dataset name="Person" frequency="${coord:minutes(5)}"
> initial-instance="2014-04-24T00:00Z" timezone="UTC">
>
> <uri-template>repo:hive://localhost:9083?dataset-name=Person&partition-key=[${YEAR}${MONTH}${DAY}${HOUR}${MINUTE}]
> </uri-template>
> </dataset>
> {code}
> This dataset is used as an output of the coordinator.
> When the coordinator is submitted it fails with the following exception:
> {code}
> 2014-05-09 10:57:34,991 ERROR
> org.apache.oozie.command.coord.CoordMaterializeTransitionXCommand:
> SERVER[localhost.localdomain] USER[cloudera] GROUP[-] TOKEN[] APP[Person-c]
> JOB[0000013-140508121805317-oozie-oozi-C] ACTION[-] Exception occurred:E0906:
> URI parsing error :
> repo:hive://localhost:9083?dataset-name=Person&partition-key=[${YEAR}${MONTH}${DAY}${HOUR}${MINUTE}]
> Making the job failed
> org.apache.oozie.dependency.URIHandlerException: E0906: URI parsing error :
> repo:hive://localhost:9083?dataset-name=Person&partition-key=[${YEAR}${MONTH}${DAY}${HOUR}${MINUTE}]
> at
> org.apache.oozie.service.URIHandlerService.getAuthorityWithScheme(URIHandlerService.java:216)
> at
> org.apache.oozie.command.coord.CoordCommandUtils.materializeDataEvents(CoordCommandUtils.java:582)
> at
> org.apache.oozie.command.coord.CoordCommandUtils.materializeOneInstance(CoordCommandUtils.java:451)
> at
> org.apache.oozie.command.coord.CoordMaterializeTransitionXCommand.materializeActions(CoordMaterializeTransitionXCommand.java:386)
> at
> org.apache.oozie.command.coord.CoordMaterializeTransitionXCommand.materialize(CoordMaterializeTransitionXCommand.java:267)
> at
> org.apache.oozie.command.MaterializeTransitionXCommand.execute(MaterializeTransitionXCommand.java:72)
> at
> org.apache.oozie.command.MaterializeTransitionXCommand.execute(MaterializeTransitionXCommand.java:28)
> at org.apache.oozie.command.XCommand.call(XCommand.java:280)
> at
> org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:174)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: java.net.URISyntaxException: Illegal character in opaque part at
> index 63:
> repo:hive://localhost:9083?dataset-name=Person&partition-key=[${YEAR}${MONTH}${DAY}${HOUR}${MINUTE}]
> at java.net.URI$Parser.fail(URI.java:2829)
> at java.net.URI$Parser.checkChars(URI.java:3002)
> at java.net.URI$Parser.parse(URI.java:3039)
> at java.net.URI.<init>(URI.java:595)
> at
> org.apache.oozie.service.URIHandlerService.getAuthorityWithScheme(URIHandlerService.java:209)
> ... 11 more
> {code}
> The problem is that URIHandlerService.getAuthorityWithScheme(String uri)
> doesn't consider the possibility that the URI might have a query string and
> no path segment. As a result, it ends up trying to create a URI from the
> entire URI template, which blows up on the ${...} template parameters (the
> URISyntaxException at index 63 points at the '{' that follows the first '$').
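
The parsing failure in the quoted stack trace can be reproduced with
java.net.URI alone. A minimal sketch, assuming only the JDK (the class and
helper names are hypothetical and not part of Oozie): it reports the index at
which the parser rejects a string, or -1 if the string parses.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class TemplateParseDemo {
    // Hypothetical helper: index of the character java.net.URI rejects, or -1 if parsing succeeds.
    public static int firstIllegalIndex(String uri) {
        try {
            new URI(uri);
            return -1;
        } catch (URISyntaxException e) {
            return e.getIndex();
        }
    }

    public static void main(String[] args) {
        // The unresolved template from the coordinator definition.
        String template = "repo:hive://localhost:9083?dataset-name=Person"
                + "&partition-key=[${YEAR}${MONTH}${DAY}${HOUR}${MINUTE}]";
        // The same URI with the template parameters already substituted.
        String resolved = "repo:hive://localhost:9083?dataset-name=Person"
                + "&partition-key=[201405091300]";

        int index = firstIllegalIndex(template);
        System.out.println("template rejected at index " + index
                + " (character '" + template.charAt(index) + "')");
        System.out.println("resolved URI result: " + firstIllegalIndex(resolved));
    }
}
```

The resolved form parses, so the failure comes from handing the unresolved
template to the URI constructor rather than from the Kite URI shape itself.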