Ah, okay -- yeah, I could see how it would be a problem if it's not
extending FileInputFormat.
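
As an aside: the incompatible-types error quoted below is what you'd get if
FourMcTextInputFormat (or one of its ancestors) ends up extending a
FileInputFormat other than the new-API
org.apache.hadoop.mapreduce.lib.input one -- e.g. the old mapred API, or the
same class loaded from a conflicting jar. A minimal sketch with stand-in
classes (not the real Hadoop or 4mc types) shows how the runtime analogue of
that cast behaves:

```java
// Stand-in classes only: the two Hadoop APIs define unrelated
// FileInputFormat types, so a format extending one is not a subtype
// of the other.
class OldApiFileInputFormat { }                // stand-in for the org.apache.hadoop.mapred side
class NewApiFileInputFormat<K, V> { }          // stand-in for the mapreduce.lib.input side
class WrongBaseFormat extends OldApiFileInputFormat { }                  // hierarchy the cast rejects
class RightBaseFormat extends NewApiFileInputFormat<String, String> { }  // hierarchy the cast accepts

public class CastDemo {
    // Runtime analogue of the compile-time cast in the thread:
    // asSubclass() succeeds only for subclasses of the expected base.
    static boolean isNewApiFormat(Class<?> candidate) {
        try {
            candidate.asSubclass(NewApiFileInputFormat.class);
            return true;
        } catch (ClassCastException e) {
            return false;  // not a subclass of the expected FileInputFormat
        }
    }

    public static void main(String[] args) {
        System.out.println("wrong base: " + isNewApiFormat(WrongBaseFormat.class));  // wrong base: false
        System.out.println("right base: " + isNewApiFormat(RightBaseFormat.class));  // right base: true
    }
}
```

Checking FourMcTextInputFormat.class against the FileInputFormat actually on
the job's classpath the same way would confirm whether this is the cause.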

On Thu, May 24, 2018 at 12:52 PM, Suyash Agarwal <[email protected]>
wrote:

> Hi,
>
> On trying something like this:
> From.formattedFile(inputPath, (Class<? extends FileInputFormat<?, ?>>)
>     FourMcTextInputFormat.class, Writables.strings(), Writables.strings());
> where com.hadoop.mapreduce.FourMcTextInputFormat extends
> com.hadoop.mapreduce.FourMcInputFormat (which extends
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat),
> I get the compile-time error:
> java: incompatible types:
>     java.lang.Class<com.hadoop.mapreduce.FourMcTextInputFormat>
>     cannot be converted to java.lang.Class<? extends
>     org.apache.hadoop.mapreduce.lib.input.FileInputFormat<?,?>>
>
>
> On Mon, May 21, 2018 at 10:25 PM, Josh Wills <[email protected]> wrote:
>
>> That looks like the right solution to me, though I wouldn't mind seeing
>> the stack trace for the ClassCastException from the formattedFile() call,
>> if you have it handy!
>>
>> On Mon, May 21, 2018 at 1:49 AM, Suyash Agarwal <[email protected]>
>> wrote:
>>
>>> Hi Gabriel,
>>>
>>> Ya, I was able to run mapreduce with 4mc compressed input.
>>> I had to use a different input format class:
>>> https://github.com/carlomedas/4mc/blob/master/java/hadoop-4mc/src/main/java/com/hadoop/mapreduce/FourMcTextInputFormat.java
>>>
>>> I am able to make it work in crunch by creating a different source
>>> implementation.
>>>
>>> public class FourMCInputSource<T> extends FileSourceImpl<T>
>>>     implements ReadableSource<T> {
>>>   public FourMCInputSource(Path path, PType<T> ptype) {
>>>     super(path, ptype, FourMcTextInputFormat.class);
>>>   }
>>> }
>>>
>>> Not sure if this is the right way.
>>>
>>> Thanks.
>>>
>>>
>>> On Fri, May 18, 2018 at 8:04 PM, Gabriel Reid <[email protected]>
>>> wrote:
>>>
>>>> Hi Suyash,
>>>>
>>>> Could you post a bit more of your stack trace and information about
>>>> which Hadoop version you're running on?
>>>>
>>>> Also, have you tried running a simple MapReduce job (e.g. word count)
>>>> that operates on this file to ensure that 4mc-compression is working
>>>> correctly on your cluster?
>>>>
>>>> - Gabriel
>>>>
>>>>
>>>> On Wed, May 16, 2018 at 1:55 PM, Suyash Agarwal <[email protected]>
>>>> wrote:
>>>> > Hi,
>>>> >
>>>> > `new TextFileSource<>(<4mc compressed input>, strings())` fails with
>>>> > the error:
>>>> > java.lang.NullPointerException: null
>>>> >     at com.hadoop.compression.fourmc.Lz4Decompressor.reset(Lz4Decompressor.java:234)
>>>> >
>>>> > And trying `(Class<? extends FileInputFormat<?, ?>>)
>>>> > FourMcTextInputFormat.class` as the format class in From.formattedFile()
>>>> > fails with a ClassCastException.
>>>> >
>>>> > So, how can I read the 4mc compressed input file in Crunch?
>>>> >
>>>> > Thanks.
>>>> >
>>>>
>>>
>>>
>>
>
