[
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13668603#comment-13668603
]
Keith Turner edited comment on ACCUMULO-118 at 5/28/13 8:21 PM:
----------------------------------------------------------------
I was looking at some docs on viewfs. If possible, I am thinking we should not
do anything that would preclude using viewfs. It seems that supporting URIs for
tablet dirs and files (along with a way to choose a tablet dir) would almost be
enough to support viewfs.
{noformat}
1;m srv:dir viewfs://clusterX/accumulo1/tables/abc
1;m file:viewfs://clusterX/accumulo1/tables/abc/F0000002.rf [] 196,1
1< srv:dir viewfs://clusterX/accumulo2/tables/abc
1< file:viewfs://clusterX/accumulo2/tables/abc/F0000003.rf [] 196,1
{noformat}
If we want to further develop our own indirection layer, then maybe we should
define our own URI prefix. Something like ans://. How independent should
this URI be? Something like ans://<namespace name>/<path> would assume that
you know where to look <namespace name> up. If the URI were like
ans://<zookeepers>+<instance id>+<namespace name>/<path> then it would be more
self-contained. I do not think it's necessary to make it self-contained; it's
for internal use and would be translated as needed.
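To make the two candidate forms concrete, here is a minimal sketch of a parser that accepts both. It is only an illustration, not existing Accumulo code: the names AnsUri and parse_ans_uri are hypothetical, the "+" separators come from the form above, and the comma-separated zookeeper list is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

ANS_PREFIX = "ans://"  # proposed prefix from the comment; not an existing scheme


@dataclass(frozen=True)
class AnsUri:
    """Hypothetical parsed form of an ans:// URI."""
    namespace: str
    path: str
    # The two fields below are only set for the self-contained form.
    zookeepers: Optional[Tuple[str, ...]] = None
    instance_id: Optional[str] = None


def parse_ans_uri(uri: str) -> AnsUri:
    """Parse ans://<namespace name>/<path> or
    ans://<zookeepers>+<instance id>+<namespace name>/<path>."""
    if not uri.startswith(ANS_PREFIX):
        raise ValueError(f"not an ans:// URI: {uri}")
    # Everything before the first "/" is the authority, the rest is the path.
    authority, slash, rest = uri[len(ANS_PREFIX):].partition("/")
    path = slash + rest
    parts = authority.split("+")
    if len(parts) == 1 and parts[0]:
        # Short form: the reader must already know where to look the namespace up.
        return AnsUri(namespace=parts[0], path=path)
    if len(parts) == 3 and all(parts):
        # Self-contained form. Assumption: zookeepers are comma separated.
        zookeepers = tuple(parts[0].split(","))
        return AnsUri(namespace=parts[2], path=path,
                      zookeepers=zookeepers, instance_id=parts[1])
    raise ValueError(f"unrecognized ans:// authority: {authority!r}")
```

With this sketch, parse_ans_uri("ans://ns1/accumulo/tables/abc") yields namespace "ns1" with no zookeepers or instance id, so the caller has to supply that context itself, which is the trade-off described above.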
I was thinking about how bulk import will work in this federated world. Below
is one way this could work.
* Client calls import dir w/ /foo1
* Accumulo client code uses local config to convert /foo1 to URI hdfs://nn1/foo1
* hdfs://nn1/foo1 is passed to Accumulo server code via thrift
* Accumulo server code looks at the URI to determine where to move the files,
and determines that it has Accumulo dir hdfs://nn1/accumulo
* Moves files in hdfs://nn1/foo1 to hdfs://nn1/accumulo/tables/abc
* Replaces hdfs://nn1/accumulo/tables/abc with ans://ns1/accumulo/tables/abc
* Does bulk import of files in ans://ns1/accumulo/tables/abc
Is this how this should work? The scenario above implies that Accumulo needs a
dir on each namenode and a way of mapping URIs to the appropriate Accumulo dir.
Need to work through this scenario w/ viewfs also.
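The URI mapping in the bulk import scenario could be sketched roughly as below. This is an illustration under stated assumptions, not a proposed implementation: DEFAULT_FS, ACCUMULO_DIRS, NAMESPACES, and plan_bulk_import are hypothetical names, the nn2/ns2 entries are made up to show a second namenode, and the sketch only computes URIs without moving any files.

```python
from urllib.parse import urlsplit

# Hypothetical configuration, for illustration only.
DEFAULT_FS = "hdfs://nn1"  # client-side default filesystem
ACCUMULO_DIRS = {          # Accumulo dir on each namenode
    "hdfs://nn1": "/accumulo",
    "hdfs://nn2": "/accumulo",
}
NAMESPACES = {             # namenode -> ans:// namespace name
    "hdfs://nn1": "ns1",
    "hdfs://nn2": "ns2",
}


def qualify(path: str, default_fs: str = DEFAULT_FS) -> str:
    """Client side: turn a bare path like /foo1 into a full URI."""
    return path if "://" in path else default_fs + path


def filesystem_of(uri: str) -> str:
    """Return the scheme://authority part of a URI."""
    parts = urlsplit(uri)
    return f"{parts.scheme}://{parts.netloc}"


def plan_bulk_import(client_path: str, table_id: str,
                     default_fs: str = DEFAULT_FS) -> dict:
    """Compute the URIs used at each step of the scenario above."""
    source = qualify(client_path, default_fs)
    fs = filesystem_of(source)
    if fs not in ACCUMULO_DIRS:
        raise ValueError(f"no Accumulo dir configured on {fs}")
    table_path = f"{ACCUMULO_DIRS[fs]}/tables/{table_id}"
    return {
        "source": source,                # passed to the server via thrift
        "move_to": fs + table_path,      # files are moved within the same namenode
        "import_from": f"ans://{NAMESPACES[fs]}{table_path}",  # URI that gets stored
    }
```

Calling plan_bulk_import("/foo1", "abc") with this configuration reproduces the URIs in the list above. The two lookup tables are exactly the per-namenode dir and URI mapping that the scenario implies Accumulo would need.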
was (Author: kturner):
I was looking at some docs on viewfs. If possible, I am thinking we should
not do anything that would preclude using viewfs. It seems like if URIs were
supported for tablet dirs and files (along with a way to choose a tablet dir)
that this would almost be enough to support viewfs.
{noformat}
1;m srv:dir viewfs://clusterX/accumulo1/tables/abc
1;m file:viewfs://ns1/accumulo1/tables/abc/F0000002.rf [] 196,1
1< srv:dir viewfs://clusterX/accumulo2/tables/abc
1< file:viewfs://ns1/accumulo2/tables/abc/F0000003.rf [] 196,1
{noformat}
If we want to further develop our own indirection layer, then maybe we should
define our own URI prefix. Something like ans://. How independent should
this URI be? Something like ans://<namespace name>/<path> would assume that
you know where to look <namespace name> up. If the URI were like
ans://<zookeepers>+<instance id>+<namespace name>/<path> then it would be more
self-contained. I do not think it's necessary to make it self-contained; it's
for internal use and would be translated as needed.
I was thinking about how bulk import will work in this federated world. Below
is one way this could work.
* Client calls import dir w/ /foo1
* Accumulo client code uses local config to convert /foo1 to URI hdfs://nn1/foo1
* hdfs://nn1/foo1 is passed to Accumulo server code via thrift
* Accumulo server code looks at URI to determine where to move to, determines
it has accumulo dir hdfs://nn1/accumulo.
* moves files in hdfs://nn1/foo1 to hdfs://nn1/accumulo/tables/abc
* Replaces hdfs://nn1/accumulo/tables/abc with ans://ns1/accumulo/tables/abc
* Does bulk import of files in ans://ns1/accumulo/tables/abc
Is this how this should work? The scenario above implies that Accumulo needs a
dir on each namenode and a way of mapping URIs to the appropriate Accumulo dir.
Need to work through this scenario w/ viewfs also.
> accumulo could work across HDFS instances, which would help it to scale past
> a single namenode
> ----------------------------------------------------------------------------------------------
>
> Key: ACCUMULO-118
> URL: https://issues.apache.org/jira/browse/ACCUMULO-118
> Project: Accumulo
> Issue Type: Improvement
> Components: master, tserver
> Reporter: Eric Newton
> Assignee: Eric Newton
> Priority: Blocker
> Fix For: 1.6.0
>
> Attachments: ACCUMULO-118-01.txt
>
> Original Estimate: 2,016h
> Remaining Estimate: 2,016h
>
> Consider using full path names to files, which would allow the servers to
> access the files on any HDFS file system.
> Work may exist elsewhere to run HDFS using a number of NameNode instances to
> break up the namespace.
> We may need a pluggable strategy to determine namespace for new files.