That looks like the right solution to me, though I wouldn't mind seeing the
stack trace for the ClassCastException from the formattedFile() call if you
have it handy!
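
For anyone finding this thread in the archives later, a rough usage sketch
might look like the following. This is illustrative only: the pipeline setup
and input path are placeholders, not taken from Suyash's actual job.

```java
// Illustrative sketch: reading a 4mc-compressed text file through the
// custom FourMCInputSource shown further down in this thread.
import org.apache.crunch.PCollection;
import org.apache.crunch.Pipeline;
import org.apache.crunch.impl.mr.MRPipeline;
import org.apache.hadoop.fs.Path;

import static org.apache.crunch.types.writable.Writables.strings;

public class FourMcReadExample {
  public static void main(String[] args) {
    Pipeline pipeline = new MRPipeline(FourMcReadExample.class);
    // Placeholder path; point this at a real 4mc-compressed input file.
    PCollection<String> lines = pipeline.read(
        new FourMCInputSource<>(new Path("/data/input.4mc"), strings()));
    // ... downstream processing on `lines` goes here ...
    pipeline.done();
  }
}
```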

On Mon, May 21, 2018 at 1:49 AM, Suyash Agarwal <[email protected]>
wrote:

> Hi Gabriel,
>
> Ya, I was able to run mapreduce with 4mc compressed input.
> I had to use a different input format class:
> https://github.com/carlomedas/4mc/blob/master/java/hadoop-4mc/src/main/java/com/hadoop/mapreduce/FourMcTextInputFormat.java
>
> I am able to make it work in crunch by creating a different source
> implementation.
>
> public class FourMCInputSource<T> extends FileSourceImpl<T> implements
>     ReadableSource<T> {
>   public FourMCInputSource(Path path, PType<T> ptype) {
>     super(path, ptype, FourMcTextInputFormat.class);
>   }
> }
>
> Not sure if this is the right way.
>
> Thanks.
>
>
> On Fri, May 18, 2018 at 8:04 PM, Gabriel Reid <[email protected]>
> wrote:
>
>> Hi Suyash,
>>
>> Could you post a bit more of your stack trace and information about
>> which Hadoop version you're running on?
>>
>> Also, have you tried running a simple MapReduce job (e.g. word count)
>> that operates on this file to ensure that 4mc-compression is working
>> correctly on your cluster?
>>
>> - Gabriel
>>
>>
>> On Wed, May 16, 2018 at 1:55 PM, Suyash Agarwal <[email protected]>
>> wrote:
>> > Hi,
>> >
>> > `new TextFileSource<>(<4mc compressed input>, strings())` fails with the
>> > error:
>> > java.lang.NullPointerException: null
>> > at com.hadoop.compression.fourmc.Lz4Decompressor.reset(Lz4Decompressor.java:234).
>> >
>> > And passing `(Class<? extends FileInputFormat<?, ?>>)
>> > FourMcTextInputFormat.class` as the format class to From.formattedFile()
>> > fails with a ClassCastException.
>> >
>> > So, how can I read the 4mc compressed input file in Crunch?
>> >
>> > Thanks.
>> >
>>
>
>