[jira] [Commented] (ACCUMULO-2234) Cannot run offline mapreduce over non-default instance.dfs.dir value

Josh Elser (JIRA) Thu, 23 Jan 2014 13:50:47 -0800

    [ 
https://issues.apache.org/jira/browse/ACCUMULO-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13880403#comment-13880403
 ]


Josh Elser commented on ACCUMULO-2234:
--------------------------------------

bq. Implementation should not add a dependency on server configuration files 
which cannot be assumed to be known by the launching process. It should use 
conn.instanceOperations().getSiteConfiguration() to get the configuration via 
thrift, without additional classpath dependencies on server configuration files.

They *can* be assumed to be known by the launching process, as ACCUMULO_CONF_DIR 
is expected to be set for the other continuous ingest tests. Can instance.dfs.dir 
be pulled from a Connector/Instance without having it specified by the 
accumulo-site.xml? If so, I was unaware of this.
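
For reference, a minimal sketch of what the quoted suggestion might look like 
once the site configuration is in hand. It assumes that 
conn.instanceOperations().getSiteConfiguration() returns a map of property 
names to values; the map below is built by hand instead of being fetched from 
a live instance. The class DfsDirLookup, its method, and the "/accumulo" 
default are assumptions made for illustration, not Accumulo code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Accumulo.
class DfsDirLookup {
    static final String DFS_DIR_KEY = "instance.dfs.dir";
    // Assumed default for instance.dfs.dir when the site config does not set it.
    static final String DEFAULT_DFS_DIR = "/accumulo";

    // Returns the configured instance.dfs.dir, or the assumed default if absent.
    static String resolveDfsDir(Map<String, String> siteConfig) {
        String value = siteConfig.get(DFS_DIR_KEY);
        return (value == null || value.isEmpty()) ? DEFAULT_DFS_DIR : value;
    }

    public static void main(String[] args) {
        // Stand-in for the map a client would obtain via
        // conn.instanceOperations().getSiteConfiguration().
        Map<String, String> site = new HashMap<>();
        site.put(DFS_DIR_KEY, "/accumulo-custom");
        System.out.println(resolveDfsDir(site));
    }
}
```

If this holds, the launching process would need only a Connector, not 
accumulo-site.xml on its classpath.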

However, IMO, there is still no downside to this, given that this is a system 
test and accumulo-site.xml is already expected to be present. This works as 
is. I would suggest opening a separate ticket for that change, as these 
changes do supply the missing functionality in a way that I see no issue 
with.

bq. Also, is this really a blocker?

I could not run a test required for the release as I should have been able to. 
So, yes, this is a blocker to me. If we say you should be able to do something 
that affects a release, but it is impossible to do so, that's a blocker. If you 
don't agree, you have the ability to change the priority of this ticket.

> Cannot run offline mapreduce over non-default instance.dfs.dir value
> --------------------------------------------------------------------
>
>                 Key: ACCUMULO-2234
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2234
>             Project: Accumulo
>          Issue Type: Bug
>    Affects Versions: 1.4.4, 1.5.0
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>            Priority: Blocker
>             Fix For: 1.4.5, 1.5.1, 1.6.0
>
>
> The javadoc for setting up offline scans over RFiles 
> (InputFormatBase.setScanOffline in 1.4 or InputFormatBase.setOfflineTableScan 
> in 1.5) includes a nice little comment to the effect that if a "non-standard" 
> directory is used for Accumulo in HDFS (read as, if the default value for 
> instance.dfs.dir is not used), accumulo-site.xml may need to be on the 
> classpath for the mappers.
> Best as I can tell, even if accumulo-site.xml is on the classpath, it makes 
> no difference as InputFormatBase is creating a new ZooKeeperInstance which, 
> in turn, will only ever make a DefaultConfiguration and never try to check if 
> an accumulo-site.xml file is available. This would make it impossible for a 
> non-default value for instance.dfs.dir to ever be used.
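
The failure mode described above can be modelled in a few lines. This is a 
simplified sketch, not the InputFormatBase code: the class 
OfflineScanPathSketch and its methods are hypothetical, and it assumes that 
table files live under <instance.dfs.dir>/tables and that the default for 
instance.dfs.dir is "/accumulo". A configuration built from defaults alone 
points the mappers at the wrong directory, while one that layers the site 
values on top of the defaults does not.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the two configuration paths; not Accumulo code.
class OfflineScanPathSketch {
    // Defaults only, standing in for what a DefaultConfiguration would supply.
    static Map<String, String> defaults() {
        Map<String, String> conf = new HashMap<>();
        conf.put("instance.dfs.dir", "/accumulo"); // assumed default value
        return conf;
    }

    // Defaults overlaid with whatever the site configuration sets.
    static Map<String, String> withSite(Map<String, String> site) {
        Map<String, String> conf = defaults();
        conf.putAll(site);
        return conf;
    }

    // Assumed layout: table files sit under <instance.dfs.dir>/tables.
    static String tablesDir(Map<String, String> conf) {
        return conf.get("instance.dfs.dir") + "/tables";
    }

    public static void main(String[] args) {
        Map<String, String> site = new HashMap<>();
        site.put("instance.dfs.dir", "/accumulo-custom");
        // Defaults only: the site value is never consulted.
        System.out.println(tablesDir(defaults()));
        // Site-aware: the non-default directory is honoured.
        System.out.println(tablesDir(withSite(site)));
    }
}
```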



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
