Hi Bobby,

Can you open a JIRA and attach a patch? We can put that in contrib.
Zheng

On 11/5/09, Bobby Rullo <[email protected]> wrote:
> Andrey,
>
> Here you go:
>
> http://pastebin.com/m5724ce8a
>
> Bobby
> On Nov 5, 2009, at 8:59 AM, Andrey Pankov wrote:
>
>> Thanks Bobby. Yeah, could be nice to take a look into your class, just
>> to get familiar with. Could you please post at pastebin.com ? Thanks a
>> lot!
>>
>> On Thu, Nov 5, 2009 at 18:56, Bobby Rullo <[email protected]> wrote:
>>> I had the exact same question, and Zheng told me I had to implement
>>> a new FileInputFormat, so I extended SequenceFileInputFormat, and it
>>> worked out pretty well.
>>>
>>> If you like, I can post the source code somewhere (here?), but it
>>> was pretty easy.
>>>
>>> Bobby
>>> On Nov 5, 2009, at 8:20 AM, Andrey Pankov wrote:
>>>
>>>> Hi guys,
>>>>
>>>> We have a lot of data stored inside compressed SEQ files. Since
>>>> SEQ is a sequence of (key,value) pairs we are storing set of
>>>> columns joined by tab in key part of SEQ, and the same for value
>>>> part for another set of columns. So our SEQ files are of type
>>>> (Text,Text).
>>>> Hive cannot understand such files correctly, i.e. I'm not
>>>> satisfied by its defaults. What it does - it ignores key part of
>>>> SEQ, and value part can deserialize into set of columns
>>>> successfully.
>>>> Can some please point me how to get Hive not ignore SEQ's key?
>>>> Thanks.
>>>>
>>>> --
>>>> Andrey Pankov
>>>
>>>
>>
>>
>>
>> --
>> Andrey Pankov
>
>

--
Sent from Gmail for mobile | mobile.google.com

Yours,
Zheng
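
For anyone reading the archive: the pastebin class is not reproduced here, so the following is only a rough sketch of the core idea behind the approach Bobby describes, not his actual code. Since Hive drops the key of each (Text,Text) pair, a custom input format would presumably have its record reader fold the key's tab-joined columns into the value before Hive sees the row. The class and method names below (`KeyValueRowJoiner`, `joinRow`, `columns`) are hypothetical, and the sketch deliberately uses plain Java strings instead of Hadoop or Hive classes so that it runs on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hive or Hadoop. It models only the
// merge step a custom record reader might perform on each (key, value) pair.
final class KeyValueRowJoiner {
    // Assumption: key and value columns are both tab-delimited, as in the thread.
    private static final String DELIMITER = "\t";

    private KeyValueRowJoiner() {}

    /** Prepends the key's columns to the value's columns as one tab-delimited row. */
    static String joinRow(String key, String value) {
        String k = (key == null) ? "" : key;
        String v = (value == null) ? "" : value;
        return k + DELIMITER + v;
    }

    /** Splits the merged row into columns, keeping empty trailing columns. */
    static List<String> columns(String key, String value) {
        String row = joinRow(key, value);
        return new ArrayList<>(Arrays.asList(row.split(DELIMITER, -1)));
    }
}
```

With a key of `"a\tb"` and a value of `"c\td"`, `joinRow` yields one row whose columns are a, b, c, d, which a delimited-text deserializer could then map onto the table's columns. In a real input format this logic would sit in the reader's `next` method, wrapping the stock sequence-file reader; that wiring is left out because it depends on the Hadoop version in use.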
