Not with any of today's APIs. "SELECT col1, col3 FROM t" is handled easily: you construct a schema that only has those columns, and col2 is skipped at read time.
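To make the projection case concrete, here is a minimal sketch in plain Python. It does not use Avro's API: the helpers are hand-rolled and only mirror how Avro encodes a string (a zigzag varint length followed by UTF-8 bytes), and it assumes every column is a string, as in table t. The names read_projected and WRITER_FIELDS are made up for illustration. With the real library you would hand the reader a schema containing only col1 and col3 and let schema resolution do the skipping.

```python
# Hand-rolled helpers that mimic Avro's string encoding (assumption: this is
# an illustration only, not the Avro library's API).

def encode_long(n):
    """Zigzag-encode n, then emit it as a little-endian base-128 varint."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)

def decode_long(buf, pos):
    """Return (value, new_pos) for the varint starting at pos."""
    shift = 0
    z = 0
    while True:
        b = buf[pos]
        pos += 1
        z |= (b & 0x7F) << shift
        if not (b & 0x80):
            break
        shift += 7
    return (z >> 1) ^ -(z & 1), pos

def encode_string(s):
    data = s.encode("utf-8")
    return encode_long(len(data)) + data

def read_string(buf, pos):
    n, pos = decode_long(buf, pos)
    return buf[pos:pos + n].decode("utf-8"), pos + n

def skip_string(buf, pos):
    """Advance past a string without decoding its bytes."""
    n, pos = decode_long(buf, pos)
    return pos + n

# Field order as written (hypothetical stand-in for the writer's schema).
WRITER_FIELDS = ["col1", "col2", "col3"]

def read_projected(buf, wanted):
    """Decode only the wanted columns; skip the others at read time."""
    pos = 0
    row = {}
    for name in WRITER_FIELDS:
        if name in wanted:
            row[name], pos = read_string(buf, pos)
        else:
            pos = skip_string(buf, pos)
    return row

record = b"".join(encode_string(v) for v in ["a", "bb", "ccc"])
row = read_projected(record, {"col1", "col3"})  # SELECT col1, col3 FROM t
```

The skip still has to read col2's length prefix, but it never materializes the value.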
Does Hive have a use case for this that you're interested in? If you don't mind paying the buffer copy, you could probably write a "DeferredFoo" class that doesn't de-serialize certain structures...

-- Philip

On Fri, Jan 22, 2010 at 6:20 PM, Zheng Shao <[email protected]> wrote:
> I noticed that avro has the "skip" functions which can help skip a
> field when deserializing data.
> This is good for column pruning in most cases, but we might be able to
> do better in the following case.
>
> Let's say we have a query like this:
>
> CREATE TABLE t (col1 STRING, col2 STRING, col3 STRING);
> SELECT col2 FROM t WHERE col3 = 'abcde';
>
> We want to get field col3 first, if that matches what we want, then we
> want to get to field col2.
>
> Is there anyway to "remember" the current location of deserialization,
> so that we can "resume" from that point?
>
> --
> Yours,
> Zheng
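For what it's worth, the "DeferredFoo" idea from the reply could look roughly like the sketch below. This is only an illustration under stated assumptions: DeferredRecord is a hypothetical name, nothing here is Avro's API, the encoding helpers are hand-rolled to mirror Avro's string layout (zigzag varint length, then UTF-8 bytes), and all fields are assumed to be strings. The record bytes are kept in a buffer (the "buffer copy"), field offsets are noted in one skipping pass, and a field is decoded only when asked for.

```python
# Hand-rolled stand-ins for Avro's string encoding (assumption: illustration
# only, not the Avro library's API).

def encode_long(n):
    z = (n << 1) ^ (n >> 63)          # zigzag
    out = bytearray()
    while z & ~0x7F:                  # base-128 varint, low bits first
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)

def decode_long(buf, pos):
    shift = 0
    z = 0
    while True:
        b = buf[pos]
        pos += 1
        z |= (b & 0x7F) << shift
        if not (b & 0x80):
            break
        shift += 7
    return (z >> 1) ^ -(z & 1), pos

def encode_row(values):
    parts = []
    for v in values:
        data = v.encode("utf-8")
        parts.append(encode_long(len(data)) + data)
    return b"".join(parts)

class DeferredRecord:
    """Hypothetical wrapper: remembers where each field starts in the
    buffered record and decodes a field only on first access."""

    def __init__(self, buf, field_names):
        self._buf = buf
        self._offsets = {}
        self._cache = {}
        pos = 0
        for name in field_names:
            self._offsets[name] = pos      # "remember" the location
            n, pos = decode_long(buf, pos)
            pos += n                       # skip the bytes, no decoding

    def get(self, name):
        if name not in self._cache:
            n, pos = decode_long(self._buf, self._offsets[name])
            self._cache[name] = self._buf[pos:pos + n].decode("utf-8")
        return self._cache[name]

    def decoded_fields(self):
        return set(self._cache)

def select_col2_where_col3(records, target):
    """SELECT col2 FROM t WHERE col3 = target, decoding col2 lazily."""
    out = []
    for buf in records:
        rec = DeferredRecord(buf, ["col1", "col2", "col3"])
        if rec.get("col3") == target:      # check the predicate first
            out.append(rec.get("col2"))    # then "resume" at col2
    return out

rows = [encode_row(["x", "keep", "abcde"]), encode_row(["y", "drop", "zzzzz"])]
result = select_col2_where_col3(rows, "abcde")
```

For rows that fail the predicate, col2 is never decoded. The price is holding the whole record in memory and making one skipping pass to find the offsets.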
