[ 
https://issues.apache.org/jira/browse/CRUNCH-429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261560#comment-14261560
 ] 

Micah Whitacre commented on CRUNCH-429:
---------------------------------------

[~unluckyboy], interesting. I don't typically use S3. My suggestion was to cut
down on retrieving the FileSystem object, because for a given Source it
typically would not change. In your S3 use case, do you interact with multiple
instances, so that you would need to vary the config with each path? Or do you
mix reading CSV files from HDFS and S3 inside a single Source? The reason I ask
is that you should still be able to use the current CSVFileSource by
configuring the S3 connection information through the Source's inputConf(...)
method [1].
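
As a rough sketch of the inputConf(...) approach: the snippet below attaches S3 credentials to a single CSVFileSource instead of the pipeline-wide configuration. The bucket path, class name, and command-line arguments are made up for illustration, and the fs.s3n.* property names assume the s3n filesystem connector (other connectors use different keys). Whether the FileSystem lookup inside the source picks up these per-source settings may depend on the Crunch version.

```java
import org.apache.crunch.Source;
import org.apache.crunch.io.text.csv.CSVFileSource;
import org.apache.hadoop.fs.Path;

public class S3CsvSourceSketch {

  // Builds a CSV source whose S3 settings travel with the source itself,
  // rather than being set on the pipeline-wide Configuration.
  static Source<String> s3CsvSource(String accessKey, String secretKey) {
    // Hypothetical bucket and object path, for illustration only.
    Path input = new Path("s3n://example-bucket/input/records.csv");
    return new CSVFileSource(input)
        // Assumed property names for the s3n connector.
        .inputConf("fs.s3n.awsAccessKeyId", accessKey)
        .inputConf("fs.s3n.awsSecretAccessKey", secretKey);
  }

  public static void main(String[] args) {
    // Credentials are passed as arguments here only to keep the sketch short.
    Source<String> csv = s3CsvSource(args[0], args[1]);
    System.out.println(csv);
  }
}
```

A second Source built the same way could carry different credentials or a different filesystem, which is the case the questions above are trying to pin down.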

If that is prohibitive, feel free to open another issue and we can enhance
the Source code.

[1] - http://crunch.apache.org/apidocs/0.8.4/org/apache/crunch/Source.html#inputConf(java.lang.String, java.lang.String)

> The CSVFileSource does not always function properly
> ---------------------------------------------------
>
>                 Key: CRUNCH-429
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-429
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.8.3
>            Reporter: mac champion
>            Assignee: mac champion
>            Priority: Minor
>              Labels: csv, csvparser
>             Fix For: 0.8.4, 0.11.0
>
>         Attachments: 0001-CRUNCH-429-Fix-CSVInputFormat.patch, 
> CRUNCH-429_a.patch
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> The "configure" method of CSVInputFormat does not have any effect on its 
> configuration and is never called. Instead, the class needs to implement 
> Configurable and set its configuration options in an overridden setConf 
> method.
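
To illustrate the pattern the issue describes, here is a minimal, self-contained sketch. It does not use the real Hadoop or Crunch classes: Conf, Configurable, BrokenFormat, FixedFormat, newInstance, and the key "example.csv.buffer.size" are all hypothetical stand-ins. The assumption being modeled is that the framework creates the input format reflectively and hands it the configuration only through setConf when the class implements Configurable, so a plain configure method is never invoked.

```java
import java.util.HashMap;
import java.util.Map;

public class SetConfSketch {

  /** Stand-in for a Hadoop-style Configuration: a simple string map. */
  static class Conf {
    private final Map<String, String> values = new HashMap<>();

    void set(String key, String value) {
      values.put(key, value);
    }

    int getInt(String key, int defaultValue) {
      String v = values.get(key);
      return v == null ? defaultValue : Integer.parseInt(v);
    }
  }

  /** Stand-in for a Configurable-style callback interface. */
  interface Configurable {
    void setConf(Conf conf);

    Conf getConf();
  }

  /** Reads its option in configure(), which the framework never calls. */
  static class BrokenFormat {
    int bufferSize = 64;

    void configure(Conf conf) {
      bufferSize = conf.getInt("example.csv.buffer.size", 64);
    }
  }

  /** Reads its option in setConf(), which the framework does call. */
  static class FixedFormat implements Configurable {
    private Conf conf;
    int bufferSize = 64;

    @Override
    public void setConf(Conf conf) {
      this.conf = conf;
      bufferSize = conf.getInt("example.csv.buffer.size", 64);
    }

    @Override
    public Conf getConf() {
      return conf;
    }
  }

  /** Mimics a framework: instantiate reflectively, configure only Configurables. */
  static <T> T newInstance(Class<T> cls, Conf conf) {
    try {
      T instance = cls.getDeclaredConstructor().newInstance();
      if (instance instanceof Configurable) {
        ((Configurable) instance).setConf(conf);
      }
      return instance;
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Could not instantiate " + cls, e);
    }
  }

  public static void main(String[] args) {
    Conf conf = new Conf();
    conf.set("example.csv.buffer.size", "1024");
    // configure() is never invoked, so the default survives: prints 64
    System.out.println(newInstance(BrokenFormat.class, conf).bufferSize);
    // setConf() is invoked by newInstance, so the option applies: prints 1024
    System.out.println(newInstance(FixedFormat.class, conf).bufferSize);
  }
}
```

The point of the sketch is only the control flow: options read in a method the framework does not know about are silently ignored, while options read in setConf take effect at instantiation time.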



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
