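Just be aware that you should insert the data sorted at least on the most discriminating column of your WHERE clause. Otherwise the min/max statistics of most row groups will span the searched value and little can be skipped.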
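To illustrate why the sort order matters, here is a rough sketch in plain Python. It is not ORC or Hive code: it only mimics the per-row-group min/max/count/sum statistics described in the quoted reply below. The helper names (build_row_group_stats, groups_to_read) and the synthetic "age" data are assumptions made up for this example, and the predicate is the age > 100 case from the quote.

```python
import random

ROW_GROUP_SIZE = 10_000  # rows per row group, as in the quoted description


def build_row_group_stats(values, group_size=ROW_GROUP_SIZE):
    """Return (min, max, count, sum) for each consecutive group of rows."""
    stats = []
    for start in range(0, len(values), group_size):
        group = values[start:start + group_size]
        stats.append((min(group), max(group), len(group), sum(group)))
    return stats


def groups_to_read(stats, threshold):
    """Count row groups that might hold a row with value > threshold.

    A group can be skipped only when its max is <= threshold.
    """
    return sum(1 for _min, max_value, _count, _sum in stats
               if max_value > threshold)


# Hypothetical data: 100,000 ages in insertion order (unsorted).
random.seed(0)
ages = [random.randint(0, 110) for _ in range(100_000)]

unsorted_stats = build_row_group_stats(ages)
sorted_stats = build_row_group_stats(sorted(ages))

unsorted_reads = groups_to_read(unsorted_stats, 100)
sorted_reads = groups_to_read(sorted_stats, 100)

print("row groups read, unsorted insert:", unsorted_reads)
print("row groups read, sorted insert:  ", sorted_reads)
```

With unsorted data, nearly every row group contains at least one age above 100, so its max exceeds the threshold and the group has to be read. With sorted data the matching rows cluster at the end, and the earlier groups can be skipped on their statistics alone.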
> On 19 Jan 2016, at 17:27, Owen O'Malley <omal...@apache.org> wrote:
>
> It has both. Each index has statistics of min, max, count, and sum for each
> column in the row group of 10,000 rows. It also has the location of the start
> of each row group, so that the reader can jump straight to the beginning of
> the row group. The reader takes a SearchArgument (e.g. age > 100) that limits
> which rows are required for the query and can avoid reading an entire file,
> or at least sections of the file.
>
> .. Owen
>
>> On Tue, Jan 19, 2016 at 7:50 AM, Ashok Kumar <ashok34...@yahoo.com> wrote:
>> Hi,
>>
>> I have read some notes on ORC files in Hive and indexes.
>>
>> The document describes the indexes but makes reference to statistics:
>>
>> Indexes
>> ORC provides three level of indexes within each file: file level -
>> statistics about the values in each column across the entire file
>> (from orc.apache.org)
>>
>> I am confused as it is mixing up indexes with statistics. Can someone
>> clarify these?
>>
>> Thanks