Dear Sir,

Problem 1: How does Hive 0.14 read files concurrently? In Hive 0.14 the file is read directly from HDFS. The following is the log record:
15/02/26 16:43:31 [main]: INFO orc.ReaderImpl: Reading ORC rows from hdfs://spark-jrdata-12.pekdc1.jdfin.local:9000/user/hive/warehouse/sku_01/end_dt=20150111/000000_0 with {include: [true, true, true, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false], offset: 0, length: 9223372036854775807}

Here is my question. In Hive 0.13 the file is read through MapReduce, which gives faster execution when the data volume is large. In Hive 0.14, how is the file read concurrently so as to improve query speed? I understand that Hive 0.14, through its packed data structure, reads only the columns a query needs instead of the whole row. Could you tell me which classes implement this in detail?

Problem 2: Which classes implement the merging of the data? I would like to know the detailed implementation classes here as well.

I hope you can answer. Thank you.