[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13888661#comment-13888661
 ] 

Eric Newton commented on ACCUMULO-118:
--------------------------------------

bq.  I think this feature was merged in before it was complete

Probably.  But it was a pretty massive change, and maintaining it as a patch 
set, even with git's help, would have been very hard.

bq. I did not realize all of the problems absolute paths could cause

Nor would we have if it had not been merged in.

bq. should have started with administrative use cases

I think we are getting better at this.  For example, I can think of lots of 
ways that the initial WAL implementation caused grief for unsuspecting 
administrators.  We only addressed those problems after the release, based on 
feedback from the administrators.  Ultimately they were fixed by moving the WAL 
to HDFS, and then ferreting out all the settings needed to make HDFS an 
appropriate store for the WAL.

I think the use case of "what if administrators change the URL of a NN?" is a 
reasonable one, but it was certainly not something I was thinking about when I 
was changing thousands of lines of code to use full paths.  The more subtle 
issues of determining aliases for namespaces (hdfs://example:9000 vs 
hdfs://example.com:9000), and recognizing real namespaces under viewfs, are the 
sort of thing that we will only find through actual use.
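
As a rough illustration of the alias problem, the sketch below treats two 
full paths as belonging to the same namespace when their scheme and authority 
match after an alias lookup.  The class name NamespaceAliases, its methods, 
and the alias table are all hypothetical; this is not how Accumulo resolves 
namespaces, only a minimal example of why hdfs://example:9000 and 
hdfs://example.com:9000 need explicit handling.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Accumulo: maps alias authorities
// (host:port) to one canonical authority so that paths can be compared.
public class NamespaceAliases {
    private final Map<String, String> aliases = new HashMap<>();

    // Register an assumed alias, e.g. "example:9000" -> "example.com:9000".
    public void addAlias(String aliasAuthority, String canonicalAuthority) {
        aliases.put(aliasAuthority.toLowerCase(Locale.ROOT),
                    canonicalAuthority.toLowerCase(Locale.ROOT));
    }

    // Returns "scheme://authority" for a full path, with any known alias
    // replaced by its canonical authority.
    public String canonicalNamespace(String fullPath) {
        URI uri = URI.create(fullPath);
        String scheme = uri.getScheme();
        String authority = uri.getAuthority();
        if (scheme == null || authority == null) {
            throw new IllegalArgumentException("not a full path: " + fullPath);
        }
        String key = authority.toLowerCase(Locale.ROOT);
        return scheme.toLowerCase(Locale.ROOT) + "://"
                + aliases.getOrDefault(key, key);
    }

    // Two paths share a namespace if their canonical namespaces are equal.
    public boolean sameNamespace(String a, String b) {
        return canonicalNamespace(a).equals(canonicalNamespace(b));
    }
}
```

Without the alias entry, a plain string comparison would treat the two URLs 
above as different file systems, which is exactly the kind of surprise that 
only shows up in a real deployment.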

My initial goal of using concrete paths to simplify debugging might have been 
the wrong choice.  Using some kind of indirect configuration that points to a 
real namespace (like viewfs) may have been better.  But, that requires that you 
value "administrators should be able to easily move a NN to a new URL."  The 
ability to do this with the old relative paths was not so much a design goal 
as a useful side effect of using the shortest name possible for each file.
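
To make the trade-off concrete, here is a minimal sketch of the difference 
between the two styles of stored file reference.  FileRefResolver and its 
baseNamespace setting are hypothetical names used only for illustration, not 
Accumulo code: a relative reference is resolved against one configured base, 
so changing that one value moves every such file, while a full path is 
returned untouched and keeps pointing at the old NN.

```java
import java.net.URI;

// Hypothetical resolver, not part of Accumulo.
public class FileRefResolver {
    // Assumed configuration value, e.g. "hdfs://nn1:9000/accumulo".
    private final String baseNamespace;

    public FileRefResolver(String baseNamespace) {
        // Drop a trailing slash so joined paths contain exactly one separator.
        this.baseNamespace = baseNamespace.endsWith("/")
                ? baseNamespace.substring(0, baseNamespace.length() - 1)
                : baseNamespace;
    }

    // Full paths (those with a scheme) are returned as stored; relative
    // references are resolved against the configured base namespace.
    public String resolve(String storedRef) {
        if (URI.create(storedRef).getScheme() != null) {
            return storedRef;
        }
        String rel = storedRef.startsWith("/") ? storedRef : "/" + storedRef;
        return baseNamespace + rel;
    }
}
```

With relative references, pointing the resolver at a new base is the whole 
migration; with full paths, every stored reference has to be found and 
rewritten, or mapped through some extra layer of indirection.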

bq. These really seem to be the long pole in the tent for the 1.6 release

They do not seem to me to be far behind namespaces.  Constructive criticism includes 
suggestions on how to make things better.  Working code is even more 
constructive.

> accumulo could work across HDFS instances, which would help it to scale past 
> a single namenode
> ----------------------------------------------------------------------------------------------
>
>                 Key: ACCUMULO-118
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-118
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: master, tserver
>            Reporter: Eric Newton
>            Assignee: Eric Newton
>            Priority: Blocker
>             Fix For: 1.6.0
>
>         Attachments: ACCUMULO-118-01.txt, ACCUMULO-118-02.txt
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> Consider using full path names to files, which would allow the servers to 
> access the files on any HDFS file system.
> Work may exist elsewhere to run HDFS using a number of NameNode instances to 
> break up the namespace.
> We may need a pluggable strategy to determine namespace for new files.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
