[
https://issues.apache.org/jira/browse/CRUNCH-220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13686899#comment-13686899
]
Deepak Subhramanian commented on CRUNCH-220:
--------------------------------------------
[~joshwills] cc [~davebeech]
Hi Josh, when I pass fs.default.name with version 0.6, it works fine locally
but not on the cluster, since it looks for job.jar in the wrong place.
I tried to get the latest code from master in git and compile it. For some
reason the build fails with the error below, and I cannot find the test classes
in the project repository.
[ERROR] Failed to execute goal
org.apache.maven.plugins:maven-compiler-plugin:2.5.1:testCompile
(default-testCompile) on project crunch-core: Compilation failure
[ERROR]
/Users/deepak/github/crunch/crunch-core/src/it/java/org/apache/crunch/io/avro/AvroMemPipelineIT.java:[74,11]
cannot find symbol
[ERROR] symbol : constructor
Person(java.lang.String,int,java.util.List<java.lang.CharSequence>)
[ERROR] location: class org.apache.crunch.test.Person
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please
read the following articles:
[ERROR] [Help 1]
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn <goals> -rf :crunch-core
> Crunch not working with S3
> --------------------------
>
> Key: CRUNCH-220
> URL: https://issues.apache.org/jira/browse/CRUNCH-220
> Project: Crunch
> Issue Type: Bug
> Components: IO
> Affects Versions: 0.6.0
> Environment: Cloudera Hadoop with Amazon S3
> Reporter: Deepak Subhramanian
> Assignee: Josh Wills
> Priority: Minor
> Fix For: 0.7.0
>
> Attachments: CRUNCH-220.patch
>
>
> I am trying to use Crunch to read a file from S3 and write to S3. I am able to
> read the file, but I get an error while writing to S3. I am not sure whether it
> is a bug or whether I am missing a Hadoop configuration. I am able to read from
> S3 and write to a local file or HDFS directly. Here are the code and the error.
> I am passing the S3 key and secret as parameters.
> PCollection<String> lines = pipeline.read(From.sequenceFile(inputdir,
>     Writables.strings()));
>
> PCollection<String> textline = lines.parallelDo(new DoFn<String, String>() {
>   public void process(String line, Emitter<String> emitter) {
>     if (headerNotWritten) {
>       // emitter.emit("Writing Header");
>       emitter.emit(table_header.getTable_header());
>       emitter.emit(line);
>       headerNotWritten = false;
>     } else {
>       emitter.emit(line);
>     }
>   }
> }, Writables.strings()); // Indicates the serialization format
>
> pipeline.writeTextFile(textline, outputdir);
> Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS:
> s3n://bktname/testcsv, expected: hdfs://ip-address.compute.internal
> [ip-addresscompute.amazonaws.com] out: at
> org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:410)
> [ip-address-82.eu-west-1.compute.amazonaws.com] out: at
> org.apache.hadoop.hdfs.DistributedFileSystem.checkPath(DistributedFileSystem.java:106)
> [ip-address-82.eu-west-1.compute.amazonaws.com] out: at
> org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:162)
> [ip-address-82.eu-west-1.compute.amazonaws.com] out: at
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:558)
> [ip-address-82.eu-west-1.compute.amazonaws.com] out: at
> org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:797)
> [ip-address-82.eu-west-1.compute.amazonaws.com] out: at
> org.apache.crunch.io.impl.FileTargetImpl.handleExisting(FileTargetImpl.java:133)
> [ip-address-82.eu-west-1.compute.amazonaws.com] out: at
> org.apache.crunch.impl.mr.MRPipeline.write(MRPipeline.java:212)
> [ip-address-82.eu-west-1.compute.amazonaws.com] out: at
> org.apache.crunch.impl.mr.MRPipeline.write(MRPipeline.java:200)
> [ip-address-82.eu-west-1.compute.amazonaws.com] out: at
> org.apache.crunch.impl.mr.collect.PCollectionImpl.write(PCollectionImpl.java:132)
> [ec2-79-125-102-82.eu-west-1.compute.amazonaws.com] out: at
> org.apache.crunch.impl.mr.MRPipeline.writeTextFile(MRPipeline.java:356)
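
For readers of the trace above: the "Wrong FS" message comes from a filesystem object being asked about a path that lives on a different filesystem (here the cluster's default HDFS instance is handed an s3n:// path). The following is only a simplified, plain-Java model of that scheme/authority check, not Hadoop's actual implementation and not the contents of the attached patch. The class name WrongFsSketch, its methods, and the host namenode.example are hypothetical and exist purely for illustration.

```java
import java.net.URI;
import java.util.Objects;

// Simplified model of the check behind "Wrong FS"; NOT Hadoop's real code.
final class WrongFsSketch {

    /** True when the path has no scheme, or its scheme and authority match the filesystem's. */
    static boolean belongsTo(URI fsUri, URI path) {
        if (path.getScheme() == null) {
            return true; // scheme-less paths are resolved against the filesystem itself
        }
        return path.getScheme().equalsIgnoreCase(fsUri.getScheme())
                && Objects.equals(path.getAuthority(), fsUri.getAuthority());
    }

    /** Mimics the failure mode: rejects a path that belongs to another filesystem. */
    static void checkPath(URI fsUri, URI path) {
        if (!belongsTo(fsUri, path)) {
            throw new IllegalArgumentException("Wrong FS: " + path + ", expected: " + fsUri);
        }
    }

    /** Derives the filesystem URI from the path itself, falling back to the default. */
    static URI fileSystemUriFor(URI path, URI defaultFs) {
        if (path.getScheme() == null) {
            return defaultFs;
        }
        String authority = path.getAuthority() == null ? "" : path.getAuthority();
        return URI.create(path.getScheme() + "://" + authority + "/");
    }

    public static void main(String[] args) {
        URI defaultFs = URI.create("hdfs://namenode.example"); // hypothetical default filesystem
        URI output = URI.create("s3n://bktname/testcsv");      // output path from the report

        // Choosing the filesystem from the path passes the check.
        checkPath(fileSystemUriFor(output, defaultFs), output);

        // Checking the same path against the default filesystem fails.
        try {
            checkPath(defaultFs, output);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In Hadoop terms, the distinction modelled here is roughly the one between FileSystem.get(conf), which returns the default filesystem, and path.getFileSystem(conf), which resolves the filesystem from the path's own URI. Whether that is exactly what needs to change in FileTargetImpl is an assumption on my part.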