Yes, we ran into this issue too. It typically shows up when the text Hive table exceeds 100 million rows at the point of converting the text table into an ORC table.
On Fri, Dec 9, 2016 at 9:08 AM, Joaquin Alzola <joaquin.alz...@lebara.com> wrote:
> Hi List
>
> The transformation from a textfile table to a table stored as ORC takes quite a long time.
>
> Steps followed:
>
> 1. Create one normal table using textfile format
> 2. Load the data normally into this table
> 3. Create one table with the schema of the expected results of your normal Hive table, using STORED AS ORC
> 4. Run an INSERT OVERWRITE query to copy the data from the textfile table to the ORC table
>
> I have about 1.5 million records with about 550 fields in each row.
>
> Doing step 4 takes about 30 minutes (moving from one format to the other).
>
> I have Spark with only one worker (same for HDFS), so I am running a standalone server, but with 25 GB and 14 cores on that worker.
>
> BR
> Joaquin
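For reference, the four steps quoted above can be sketched in HiveQL. Table names, columns, the delimiter, and the HDFS path are all placeholders, not from the original mail:

```sql
-- 1. Staging table in plain text format (schema is illustrative)
CREATE TABLE events_txt (
  id      BIGINT,
  payload STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- 2. Load the raw file into the text table (no conversion happens here;
--    the file is just moved into the table's location)
LOAD DATA INPATH '/data/events.csv' INTO TABLE events_txt;

-- 3. Target table with the same schema, stored as ORC
CREATE TABLE events_orc (
  id      BIGINT,
  payload STRING
)
STORED AS ORC;

-- 4. Copy the data; this step performs the actual text-to-ORC conversion
--    and is where the reported 30 minutes are spent
INSERT OVERWRITE TABLE events_orc
SELECT * FROM events_txt;
```

Step 4 is the expensive one because every row is parsed, re-encoded column by column, and compressed into ORC stripes; with ~550 fields per row the serialization cost dominates, which is consistent with the timings reported in the thread.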