[ 
https://issues.apache.org/jira/browse/MAHOUT-1487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebastian Schelter resolved MAHOUT-1487.
----------------------------------------

    Resolution: Won't Fix

No activity in four weeks.

> More understandable error message when attempt to use wrong FileSystem
> ----------------------------------------------------------------------
>
>                 Key: MAHOUT-1487
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1487
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Clustering
>    Affects Versions: 0.9
>         Environment: Amazon S3, Amazon EMR, Local file system
>            Reporter: Konstantin
>            Priority: Trivial
>             Fix For: 1.0
>
>
> RandomSeedGenerator has the following code:
> FileSystem fs = FileSystem.get(output.toUri(), conf);
> ...
> fs.getFileStatus(input).isDir()
> If the output path is specified correctly but the input path is not (for
> example, it points to a different file system), Mahout throws an error
> message that is hard to understand: "Exception in thread "main"
> java.lang.IllegalArgumentException: This file system object 
> (hdfs://172.31.41.65:9000) does not support access to the request path 
> 's3://by.kslisenko.bigdata/stackovweflow-small/out_new/sparse/tfidf-vectors' 
> You possibly called FileSystem.get(conf) when you should have called 
> FileSystem.get(uri, conf) to obtain a file system supporting your path"
> This happens because the FileSystem object was created from the output
> path, while getFileStatus is called with the input path. The mismatch makes
> the error message confusing to interpret.
> To prevent this confusion, I propose improving the error message by adding
> the following details:
> 1. The type of file system in use (DistributedFileSystem,
> NativeS3FileSystem, etc., obtained via fs.getClass().getName()).
> 2. The path that cannot be processed.
> This could be done by a validation utility that can be applied in many
> places in Mahout. Mahout requires many paths to be specified, and they can
> live on several types of file system: local for debugging, distributed on
> Hadoop, and S3 on Amazon. In this setting, better error messages can save a
> lot of time.
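
The proposed validation utility could look roughly like the sketch below. It
is only an illustration and is not code from Mahout or Hadoop: the class
FsPathValidator and its methods are hypothetical names, and the check is a
simplified scheme/authority comparison on plain java.net.URI values so that
the example runs without Hadoop on the classpath. In Mahout, the arguments
would presumably come from fs.getClass().getName(), fs.getUri() and
path.toUri().

```java
import java.net.URI;

/** Hypothetical helper; not part of Mahout or Hadoop. */
final class FsPathValidator {

    private FsPathValidator() {}

    /**
     * Simplified compatibility check (an assumption, not Hadoop's exact rule):
     * a path without a scheme is accepted; otherwise the scheme must match,
     * and the authority must match when the path specifies one.
     */
    static boolean supports(URI fsUri, URI path) {
        String scheme = path.getScheme();
        if (scheme == null) {
            return true; // scheme-less paths resolve against the file system itself
        }
        if (!scheme.equalsIgnoreCase(fsUri.getScheme())) {
            return false;
        }
        String authority = path.getAuthority();
        return authority == null || authority.equalsIgnoreCase(fsUri.getAuthority());
    }

    /**
     * Throws an exception whose message names the file system class, the file
     * system URI, and the role ("input" or "output") of the offending path.
     */
    static void check(String fsClassName, URI fsUri, String role, URI path) {
        if (!supports(fsUri, path)) {
            throw new IllegalArgumentException(String.format(
                "The %s path '%s' cannot be accessed through %s (%s). "
                    + "Obtain a file system from the %s path itself.",
                role, path, fsClassName, fsUri, role));
        }
    }
}
```

With a file system created from an HDFS output path and an input path on S3,
check(...) would report that the input path is the one that does not fit,
together with the file system class, instead of the generic message quoted
in the issue.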



--
This message was sent by Atlassian JIRA
(v6.2#6252)
