[
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13888661#comment-13888661
]
Eric Newton commented on ACCUMULO-118:
--------------------------------------
bq. I think this feature was merged in before it was complete
Probably. But it was a pretty massive change, and maintaining it as a patch
set, even with git's help, would have been very hard.
bq. I did not realize all of the problems absolute paths could cause
Nor would we have if it had not been merged in.
bq. should have started with administrative use cases
I think we are getting better at this. For example, I can think of lots of
ways that the initial WAL implementation caused a lot of grief for unsuspecting
administrators. We addressed those problems after it was released into the wild,
based on feedback from the administrators. Ultimately they were fixed by moving
the WAL to HDFS, and then ferreting out all the settings needed to make HDFS an
appropriate store for the WAL.
I think the use case of "what if administrators change the URL of a NN?" is a
reasonable one, but was certainly not anything I was thinking about when I was
changing thousands of lines of code to use full paths. The more subtle issues
of determining aliases for namespaces (hdfs://example:9000 vs
hdfs://example.com:9000), and recognizing real namespaces under viewfs are the
sort of thing that we will only find through actual use.
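
To make the alias problem concrete, here is a minimal sketch of comparing two
full paths by namespace. It is not Accumulo code: the class name, the alias
table, and the file paths are all hypothetical, and it assumes an administrator
supplies the alias mapping by hand rather than relying on DNS. It does not
attempt to handle viewfs mount points.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, for illustration only; not part of Accumulo or Hadoop.
final class NamespaceAliases {
    // Assumed admin-supplied table: alternate "scheme://authority" -> canonical form.
    private final Map<String, String> aliases;

    NamespaceAliases(Map<String, String> aliases) {
        this.aliases = aliases;
    }

    /** Returns the canonical "scheme://authority" for a full path. */
    String canonicalNamespace(String fullPath) {
        URI uri = URI.create(fullPath);
        if (uri.getScheme() == null || uri.getAuthority() == null) {
            throw new IllegalArgumentException("not a full path: " + fullPath);
        }
        String ns = uri.getScheme().toLowerCase(Locale.ROOT)
                + "://" + uri.getAuthority().toLowerCase(Locale.ROOT);
        // Fall back to the namespace itself when no alias is configured.
        return aliases.getOrDefault(ns, ns);
    }

    /** True when both paths live in the same namespace after alias mapping. */
    boolean sameNamespace(String a, String b) {
        return canonicalNamespace(a).equals(canonicalNamespace(b));
    }
}
```

With the single alias "hdfs://example:9000" -> "hdfs://example.com:9000", paths
under either authority compare as the same namespace, while a path on an
unlisted host does not.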
My initial goal of using concrete paths to simplify debugging might have been
the wrong choice. Using some kind of indirect configuration that points to a
real namespace (like viewfs) may have been better. But, that requires that you
value "administrators should be able to easily move a NN to a new URL." The
ability to do this with the old relative paths was not so much a design goal
as a useful side effect of using the shortest name possible for each file.
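
The trade-off can be sketched as follows. This is an illustration under stated
assumptions, not the actual implementation: the class name, the base namespace
setting, and the example paths are hypothetical. A stored relative path is
resolved against a single configured base, so moving the NN means changing one
value, while a stored full path stays pinned to the old URL.

```java
import java.net.URI;

// Hypothetical resolver, for illustration only; not part of Accumulo.
final class StoredPathResolver {
    // Assumed single configuration value, e.g. "hdfs://nn1:9000/accumulo".
    private final String baseNamespace;

    StoredPathResolver(String baseNamespace) {
        // Drop one trailing slash so joined paths do not contain "//".
        this.baseNamespace = baseNamespace.endsWith("/")
                ? baseNamespace.substring(0, baseNamespace.length() - 1)
                : baseNamespace;
    }

    /** Resolves a stored file reference to the location that would be opened. */
    String resolve(String stored) {
        if (URI.create(stored).getScheme() != null) {
            // Full path: used as recorded, regardless of the current base.
            return stored;
        }
        // Relative path: follows whatever base is configured today.
        String rel = stored.startsWith("/") ? stored : "/" + stored;
        return baseNamespace + rel;
    }
}
```

Swapping the base from a hypothetical "hdfs://nn1:9000/accumulo" to
"hdfs://nn2:9000/accumulo" changes where every relative reference resolves, but
any reference already recorded as a full path still points at nn1.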
bq. These really seem to be the long pole in the tent for the 1.6 release
Seems to me to not be so far behind namespaces. Constructive criticism includes
suggestions on how to make things better. Working code is even more
constructive.
> accumulo could work across HDFS instances, which would help it to scale past
> a single namenode
> ----------------------------------------------------------------------------------------------
>
> Key: ACCUMULO-118
> URL: https://issues.apache.org/jira/browse/ACCUMULO-118
> Project: Accumulo
> Issue Type: Improvement
> Components: master, tserver
> Reporter: Eric Newton
> Assignee: Eric Newton
> Priority: Blocker
> Fix For: 1.6.0
>
> Attachments: ACCUMULO-118-01.txt, ACCUMULO-118-02.txt
>
> Original Estimate: 2,016h
> Remaining Estimate: 2,016h
>
> Consider using full path names to files, which would allow the servers to
> access the files on any HDFS file system.
> Work may exist elsewhere to run HDFS using a number of NameNode instances to
> break up the namespace.
> We may need a pluggable strategy to determine namespace for new files.