Ah, okay -- yeah, I could see where it would be a problem if it's not extending FileInputFormat.
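To illustrate the compile-time error discussed below: the cast to `Class<? extends FileInputFormat<?, ?>>` is only accepted when the input format class is actually a subtype of that particular `FileInputFormat`. A minimal, self-contained sketch (all class names here are illustrative mocks, not the real Hadoop/4mc classes -- e.g. this can happen if the jar on the classpath extends the old-API `org.apache.hadoop.mapred.FileInputFormat` instead of the new-API one):

```java
// Mock stand-ins for the two unrelated FileInputFormat hierarchies.
class NewApiFileInputFormat<K, V> {}   // stands in for o.a.h.mapreduce.lib.input.FileInputFormat
class OldApiFileInputFormat<K, V> {}   // stands in for o.a.h.mapred.FileInputFormat

class NewStyleTextInputFormat extends NewApiFileInputFormat<Long, String> {}
class OldStyleTextInputFormat extends OldApiFileInputFormat<Long, String> {}

class CastDemo {
    public static void main(String[] args) {
        // Compiles fine -- no cast needed, since NewStyleTextInputFormat
        // is a subtype of NewApiFileInputFormat<?, ?>:
        Class<? extends NewApiFileInputFormat<?, ?>> ok = NewStyleTextInputFormat.class;

        // The analogue of the reported "incompatible types" error: the line
        // below is rejected at compile time, because OldStyleTextInputFormat
        // has no subtype relationship with NewApiFileInputFormat at all:
        //
        // Class<? extends NewApiFileInputFormat<?, ?>> bad =
        //     (Class<? extends NewApiFileInputFormat<?, ?>>) OldStyleTextInputFormat.class;

        // The same relationship, checked reflectively at runtime:
        System.out.println(
            NewApiFileInputFormat.class.isAssignableFrom(NewStyleTextInputFormat.class)); // true
        System.out.println(
            NewApiFileInputFormat.class.isAssignableFrom(OldStyleTextInputFormat.class)); // false
    }
}
```

So if the compiler rejects the cast outright (rather than failing at runtime), it is worth checking which `FileInputFormat` the class on the classpath actually extends.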
On Thu, May 24, 2018 at 12:52 PM, Suyash Agarwal <[email protected]> wrote:

> Hi,
>
> On trying something like this:
>
>     From.formattedFile(inputPath,
>         (Class<? extends FileInputFormat<?, ?>>) FourMcTextInputFormat.class,
>         Writables.strings(), Writables.strings());
>
> where com.hadoop.mapreduce.FourMcTextInputFormat extends
> com.hadoop.mapreduce.FourMcInputFormat (which extends
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat),
> I get the compile-time error:
>
>     java: incompatible types:
>     java.lang.Class<com.hadoop.mapreduce.FourMcTextInputFormat>
>     cannot be converted to
>     java.lang.Class<? extends org.apache.hadoop.mapreduce.lib.input.FileInputFormat<?,?>>
>
> On Mon, May 21, 2018 at 10:25 PM, Josh Wills <[email protected]> wrote:
>
>> That looks like the right solution to me, though I wouldn't mind seeing
>> the stack trace for the ClassCastException from the formattedFile() call
>> if you have it handy!
>>
>> On Mon, May 21, 2018 at 1:49 AM, Suyash Agarwal <[email protected]> wrote:
>>
>>> Hi Gabriel,
>>>
>>> Yes, I was able to run a MapReduce job with 4mc-compressed input.
>>> I had to use a different input format class:
>>> https://github.com/carlomedas/4mc/blob/master/java/hadoop-4mc/src/main/java/com/hadoop/mapreduce/FourMcTextInputFormat.java
>>>
>>> I was able to make it work in Crunch by creating a different source
>>> implementation:
>>>
>>>     public class FourMCInputSource<T> extends FileSourceImpl<T> implements ReadableSource<T> {
>>>         public FourMCInputSource(Path path, PType<T> ptype) {
>>>             super(path, ptype, FourMcTextInputFormat.class);
>>>         }
>>>     }
>>>
>>> Not sure if this is the right way.
>>>
>>> Thanks.
>>>
>>> On Fri, May 18, 2018 at 8:04 PM, Gabriel Reid <[email protected]> wrote:
>>>
>>>> Hi Suyash,
>>>>
>>>> Could you post a bit more of your stack trace, and some information
>>>> about which Hadoop version you're running on?
>>>>
>>>> Also, have you tried running a simple MapReduce job (e.g. word count)
>>>> that operates on this file, to ensure that 4mc compression is working
>>>> correctly on your cluster?
>>>>
>>>> - Gabriel
>>>>
>>>> On Wed, May 16, 2018 at 1:55 PM, Suyash Agarwal <[email protected]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> `new TextFileSource<>(<4mc compressed input>, strings())` fails with
>>>>> the error:
>>>>>
>>>>>     java.lang.NullPointerException: null
>>>>>         at com.hadoop.compression.fourmc.Lz4Decompressor.reset(Lz4Decompressor.java:234)
>>>>>
>>>>> And trying `(Class<? extends FileInputFormat<?, ?>>)
>>>>> FourMcTextInputFormat.class` in From.formattedFile() as the format
>>>>> class doesn't work either; it fails with a ClassCastException.
>>>>>
>>>>> So, how can I read a 4mc-compressed input file in Crunch?
>>>>>
>>>>> Thanks.
