Hi,

On trying something like this:

    From.formattedFile(inputPath, (Class<? extends FileInputFormat<?, ?>>) FourMcTextInputFormat.class, Writables.strings(), Writables.strings());

where com.hadoop.mapreduce.FourMcTextInputFormat extends com.hadoop.mapreduce.FourMcInputFormat (which extends org.apache.hadoop.mapreduce.lib.input.FileInputFormat), I get the compile-time error:

    java: incompatible types: java.lang.Class<com.hadoop.mapreduce.FourMcTextInputFormat> cannot be converted to java.lang.Class<? extends org.apache.hadoop.mapreduce.lib.input.FileInputFormat<?,?>>

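For reference, one general situation in which javac reports this kind of "incompatible types" error on a class-literal cast is when the class extends the generic base class as a raw type. Whether that is what happens with the 4mc classes is not established in this thread. The sketch below uses hypothetical stand-in classes (BaseFormat, RawTextFormat, ClassLiteralCast) rather than the Hadoop or 4mc types, and shows the common idiom of widening to Class<?> first, which turns the cast into an unchecked one that compiles with a warning.

```java
public class ClassLiteralCast {
    // Hypothetical stand-in for a generic base class such as FileInputFormat<K, V>.
    static abstract class BaseFormat<K, V> {
        abstract String name();
    }

    // Hypothetical subclass that extends the base class as a raw type (an assumption
    // made for illustration; not taken from the 4mc sources).
    @SuppressWarnings("rawtypes")
    static class RawTextFormat extends BaseFormat {
        @Override
        String name() {
            return "raw-text";
        }
    }

    // In this raw case a direct cast is expected to be rejected by javac:
    //   (Class<? extends BaseFormat<?, ?>>) RawTextFormat.class
    // Widening to Class<?> first makes it an unchecked cast, which compiles.
    @SuppressWarnings("unchecked")
    static Class<? extends BaseFormat<?, ?>> formatClass() {
        return (Class<? extends BaseFormat<?, ?>>) (Class<?>) RawTextFormat.class;
    }

    public static void main(String[] args) throws Exception {
        Class<? extends BaseFormat<?, ?>> cls = formatClass();
        // Instantiate reflectively, as a framework receiving the class object might.
        BaseFormat<?, ?> format = cls.getDeclaredConstructor().newInstance();
        System.out.println(format.name());
    }
}
```

The double cast only silences the compiler; it does not check anything at runtime, so it is safe only when the class really is a subclass of the target type.
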
On Mon, May 21, 2018 at 10:25 PM, Josh Wills <[email protected]> wrote:

> That looks like the right solution to me, though I wouldn't mind seeing
> the stack trace for the ClassCastException for the formattedFile() call if
> you have it handy!
>
> On Mon, May 21, 2018 at 1:49 AM, Suyash Agarwal <[email protected]>
> wrote:
>
>> Hi Gabriel,
>>
>> Ya, I was able to run mapreduce with 4mc compressed input.
>> I had to use a different input format class:
>> https://github.com/carlomedas/4mc/blob/master/java/hadoop-4mc/src/main/java/com/hadoop/mapreduce/FourMcTextInputFormat.java
>>
>> I am able to make it work in crunch by creating a different source
>> implementation.
>>
>> public class FourMCInputSource<T> extends FileSourceImpl<T> implements
>> ReadableSource<T> {
>>   public FourMCInputSource(Path path, PType<T> ptype) {
>>     super(path, ptype, FourMcTextInputFormat.class);
>>   }
>> }
>>
>> Not sure if this is the right way.
>>
>> Thanks.
>>
>>
>> On Fri, May 18, 2018 at 8:04 PM, Gabriel Reid <[email protected]>
>> wrote:
>>
>>> Hi Suyash,
>>>
>>> Could you post a bit more of your stack trace and information about
>>> which Hadoop version you're running on?
>>>
>>> Also, have you tried running a simple MapReduce job (e.g. word count)
>>> that operates on this file to ensure that 4mc-compression is working
>>> correctly on your cluster?
>>>
>>> - Gabriel
>>>
>>>
>>> On Wed, May 16, 2018 at 1:55 PM, Suyash Agarwal <[email protected]>
>>> wrote:
>>> > Hi,
>>> >
>>> > `new TextFileSource<>(<4mc compressed input>, strings())` fails with the
>>> > error:
>>> > java.lang.NullPointerException: null
>>> > at com.hadoop.compression.fourmc.Lz4Decompressor.reset(Lz4Decompressor.java:234).
>>> >
>>> > And trying `(Class<? extends FileInputFormat<?, ?>>)
>>> > FourMcTextInputFormat.class` in From.formattedFile() as the format class
>>> > doesn't work with class cast exception.
>>> >
>>> > So, how can I read the 4mc compressed input file in Crunch?
>>> >
>>> > Thanks.
