[
https://issues.apache.org/jira/browse/MAHOUT-1487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Schelter resolved MAHOUT-1487.
----------------------------------------
Resolution: Won't Fix
no activity in four weeks
> More understandable error message when attempt to use wrong FileSystem
> ----------------------------------------------------------------------
>
> Key: MAHOUT-1487
> URL: https://issues.apache.org/jira/browse/MAHOUT-1487
> Project: Mahout
> Issue Type: Improvement
> Components: Clustering
> Affects Versions: 0.9
> Environment: Amazon S3, Amazon EMR, Local file system
> Reporter: Konstantin
> Priority: Trivial
> Fix For: 1.0
>
>
> RandomSeedGenerator has the following code:
> FileSystem fs = FileSystem.get(output.toUri(), conf);
> ...
> fs.getFileStatus(input).isDir()
> If the output path is specified correctly but the input path is not, Mahout
> throws an error message that is hard to understand: "Exception in thread "main"
> java.lang.IllegalArgumentException: This file system object
> (hdfs://172.31.41.65:9000) does not support access to the request path
> 's3://by.kslisenko.bigdata/stackovweflow-small/out_new/sparse/tfidf-vectors'
> You possibly called FileSystem.get(conf) when you should have called
> FileSystem.get(uri, conf) to obtain a file system supporting your path"
> This happens because the FileSystem object was created from the output path,
> while getFileStatus is called with the input path. As a result, the message is
> confusing when you try to work out what went wrong.
> To prevent this confusion, I propose improving the error message by adding the
> following details:
> 1. Specify which file system type is used (DistributedFileSystem,
> NativeS3FileSystem, etc., using fs.getClass().getName()).
> 2. Specify which path cannot be processed correctly.
> This can be done by a validation utility that can be applied in many places in
> Mahout. Using Mahout requires specifying many paths, and several types of file
> systems may be involved: local for debugging, distributed on Hadoop, and
> S3 on Amazon. In this case, better error messages can save a lot of time.
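
The validation utility proposed above might look roughly like the sketch below. It is only an illustration under stated assumptions: FsPathCheck and describeMismatch are hypothetical names, not part of Mahout or Hadoop, and the check uses plain java.net.URI values so that it runs without Hadoop on the classpath. In Mahout the arguments would presumably come from fs.getClass().getName(), fs.getUri() and path.toUri(). The comparison of scheme and authority is a simplification and does not reproduce Hadoop's own path checking.

```java
import java.net.URI;

/** Hypothetical helper for illustration only; not part of Mahout or Hadoop. */
final class FsPathCheck {

    private FsPathCheck() {
    }

    /**
     * Returns null when the path looks compatible with the file system,
     * otherwise a message naming the file system class and the offending path.
     *
     * @param fsClassName e.g. the result of fs.getClass().getName()
     * @param fsUri       URI of the file system, e.g. hdfs://host:9000
     * @param path        URI of the path about to be accessed
     * @param role        human-readable role of the path, e.g. "input"
     */
    static String describeMismatch(String fsClassName, URI fsUri, URI path, String role) {
        String pathScheme = path.getScheme();
        if (pathScheme == null) {
            // A scheme-less path is assumed to resolve against the file system itself.
            return null;
        }
        boolean schemeDiffers = !pathScheme.equalsIgnoreCase(fsUri.getScheme());
        String pathAuthority = path.getAuthority();
        boolean authorityDiffers = pathAuthority != null
                && !pathAuthority.equalsIgnoreCase(fsUri.getAuthority());
        if (!schemeDiffers && !authorityDiffers) {
            return null;
        }
        return String.format(
                "The %s path '%s' cannot be accessed through %s (%s). "
                        + "The file system was probably obtained from a different path.",
                role, path, fsClassName, fsUri);
    }
}
```

With the values from the report, describeMismatch("org.apache.hadoop.hdfs.DistributedFileSystem", URI.create("hdfs://172.31.41.65:9000"), inputUri, "input") would return a message that names both the file system class and the S3 input path. Independently of the message, obtaining the file system from the path being accessed, for example with Hadoop's Path.getFileSystem(conf) on the input path, would avoid this particular mismatch.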