Since Hive 0.13 (we're running Hive 0.12), you can use a vectorized execution engine, which processes 1024 rows at a time, whereas the normal execution engine processes one row at a time.
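
To make the row-vs-batch difference concrete, here's a rough sketch in plain Java. It is not Hive code: the class and method names are made up for illustration, and it only counts how often a per-row or per-batch "operator" would be invoked, assuming the fixed batch size of 1024 mentioned above.

```java
public class VectorizedSketch {
    // Batch size taken from the message above; assumed fixed for this sketch.
    static final int BATCH_SIZE = 1024;

    /** Row-at-a-time: the operator runs once per row. Returns {sum, calls}. */
    static long[] sumRowAtATime(long[] column) {
        long sum = 0;
        long calls = 0;
        for (long value : column) {
            sum += value;   // one invocation handles a single row
            calls++;
        }
        return new long[] {sum, calls};
    }

    /** Batched: the operator runs once per batch of up to BATCH_SIZE rows. Returns {sum, calls}. */
    static long[] sumBatched(long[] column) {
        long sum = 0;
        long calls = 0;
        for (int start = 0; start < column.length; start += BATCH_SIZE) {
            int end = Math.min(start + BATCH_SIZE, column.length);
            // Tight inner loop over one batch of a single column.
            for (int i = start; i < end; i++) {
                sum += column[i];
            }
            calls++;
        }
        return new long[] {sum, calls};
    }

    public static void main(String[] args) {
        long[] column = new long[4096];
        for (int i = 0; i < column.length; i++) {
            column[i] = i;
        }
        long[] rowResult = sumRowAtATime(column);
        long[] batchResult = sumBatched(column);
        System.out.println("row-at-a-time: sum=" + rowResult[0] + " calls=" + rowResult[1]);
        System.out.println("batched:       sum=" + batchResult[0] + " calls=" + batchResult[1]);
    }
}
```

Both methods return the same sum; only the number of operator invocations differs, which is the overhead a batch-oriented engine is meant to reduce.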
Seems like you need an extra Orc-Vectorized-Format to make use of the vectorized engine.

On Wed, Jul 29, 2015 at 8:06 PM, David Rosenstrauch <[email protected]> wrote:

> On 07/23/2015 12:01 PM, David Rosenstrauch wrote:
>
>> Just wondering what's the difference between these 2 classes. Is there
>> a guideline as to when we should use one vs. the other?
>>
>> Thanks,
>>
>> DR
>>
>
> Had a follow-up question along the same lines:
>
> What's VectorizedOrcInputFormat?
>
>
> Also, a couple of other things I'm mulling over as we get a bit deeper
> into our work with ORC:
>
> * In the docs it states "Seek to row number is implemented to support
> secondary indexes". (See:
> http://hive.apache.org/javadocs/r0.13.1/api/ql/org/apache/hadoop/hive/ql/io/orc/package-summary.html)
> A colleague and I are working on this exact use case (secondary index).
> And we were under the impression that we had to create our own row
> numbering scheme to support the secondary index. Does ORC already write a
> row number on each record? If so, how is that accessed?
>
> * We're thinking over how to structure our secondary index. And although
> we can envision an ORC-based structure that would provide the functionality
> we need, it'd be a bit clunky/complex/verbose to query using Hive. I was
> thinking perhaps it might be an option for us to implement a layer in front
> of ORC that hides some of the complexity of how the secondary index is
> physically structured, and makes it possible to query it using simple HQL.
> I know that Hive allows developers to use a custom InputFormat to implement
> custom storage formats. So theoretically we could write a wrapper around
> OrcNewInputFormat and/or OrcSerDe to provide the functionality we're
> looking for. Any suggestions or pointers to someone looking to go this
> route? (I.e., specific code we might look at? Where we might want to
> insert our own code? Etc.)
>
> Thanks!
>
> DR
>
