Calling makeQualified on the local file system should help when the cache
file is not found, because it adds the missing file: scheme to the path:

LocalFileSystem localFs = FileSystem.getLocal(conf);
Path localCacheFile = localFs.makeQualified(localFiles[0]);

If you run in local mode (i.e. not on a cluster), you may have to fall
back to loading the file directly, as is done in
org.apache.mahout.cf.taste.hadoop.als.ALS#readMatrixByRowsFromDistributedCache
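
To illustrate what qualifying a path means here, the following is a rough
sketch that uses only java.net.URI and java.nio.file, not the Hadoop API.
The class QualifySketch and the helper qualifyAsLocal are made-up names for
this example, and the logic only approximates what makeQualified does on a
local file system: a path that has no scheme is turned into a file: URI,
while a path that already carries a scheme is left alone.

```java
import java.net.URI;
import java.nio.file.Paths;

// Illustration only: mimics the effect of qualifying a scheme-less path
// against the local file system. This is NOT Hadoop's implementation.
class QualifySketch {

    // Hypothetical helper. Assumes rawPath contains no characters that are
    // illegal in a URI (e.g. spaces), otherwise URI.create will throw.
    static URI qualifyAsLocal(String rawPath) {
        URI uri = URI.create(rawPath);
        if (uri.getScheme() != null) {
            // Already qualified (e.g. hdfs://...), keep it unchanged.
            return uri;
        }
        // No scheme: treat it as a local path and build a file: URI.
        return Paths.get(rawPath).toAbsolutePath().toUri();
    }

    public static void main(String[] args) {
        // A scheme-less path like the one in the log output.
        System.out.println(qualifyAsLocal("/tmp/dictionary.file-0"));
        // A path that already names its file system.
        System.out.println(qualifyAsLocal("hdfs://namenode/tmp/dictionary.file-0"));
    }
}
```

The point is only that a consumer expecting a fully qualified path can
tell a local file from an HDFS file once the scheme is present.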

Best,
Sebastian

On 09.06.2013 17:48, Grant Ingersoll (JIRA) wrote:
> 
>     [ 
> https://issues.apache.org/jira/browse/MAHOUT-1247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13679090#comment-13679090
>  ] 
> 
> Grant Ingersoll edited comment on MAHOUT-1247 at 6/9/13 3:47 PM:
> -----------------------------------------------------------------
> 
> I think I see the issue.  The cache file is "local", the Iterator, however, 
> has a Hadoop conf that is expecting an HDFS file, hence it can't find it.
> 
> For instance, the logs show:
> {quote}11:38:49,638 INFO 
> org.apache.mahout.vectorizer.term.TFPartialVectorReducer: Cache Files: 
> [/tmp/hadoop-grantingersoll/mapred/local/taskTracker/distcache/2677051046998143225_1262960862_697707077/localhostdicVec/dictionary.file-0]
> 2013{quote}
> 
> Notice it is missing the scheme.  Going to try explicitly setting the scheme 
> to file://
>                 
>       was (Author: gsingers):
>     I think I see the issue.  The cache file is "local", the Iterator, 
> however, has a Hadoop conf that is expecting an HDFS file, hence it can't 
> find it.
>                   
>> cluster-reuters doesn't work on Hadoop
>> --------------------------------------
>>
>>                 Key: MAHOUT-1247
>>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1247
>>             Project: Mahout
>>          Issue Type: Bug
>>            Reporter: Grant Ingersoll
>>            Assignee: Grant Ingersoll
>>             Fix For: 0.8
>>
>>
>> At least two issues:
>> 1. MAHOUT-992 messed up the Distributed Cache stuff somehow
>> 2. The ExtractReuters data is not being moved to HDFS.
> 
> --
> This message is automatically generated by JIRA.
> If you think it was sent incorrectly, please contact your JIRA administrators
> For more information on JIRA, see: http://www.atlassian.com/software/jira
> 
