I need to figure out the output of every step in order to better understand the cube building process. Is there any way to decode the Hadoop MapReduce output files?

On Tue, Mar 3, 2015 at 2:41 PM, Luke Han <[email protected]> wrote:

> Kylin using dictionary to encode dimension values from String/Date to
> digital value only, which will reduce storage significantly.
> In query phase, when Kylin got result, it will decode and return
> actually value to the client.
>
> Yang could have more detail comments for this.
>
> BTW, the intermedia files only be used by Kylin application, why you
> need to decode it?
> Please feel free to let's know if you have more questions.
>
> Thanks.
> Luke
>
> 2015-03-03 17:01 GMT+08:00 Luke Han <[email protected]>:
>
>> Forward to mailing list for further support.
>>
>> ---------- Forwarded message ----------
>> From: Abhishek Sinha <[email protected]>
>> Date: 2015-02-22 20:20 GMT+08:00
>> Subject: Kylin code base help needed
>> To: [email protected]
>>
>> Hey,
>> I was looking at the Kylin code base(master) in order to understand the
>> flow and output of each of the steps in cube building process.
>>
>> The first step which is "Create Intermediate hive table" can easily be
>> understood as the table is being created in Hive. However, further down the
>> line, "Build base cuboid" or the "N dimension cuboid" has its output being
>> created in a "tmp" folder in HDFS. I tried opening the 'part-r-00000' but
>> it seems that the output is encoded in some format(possibly byte array or
>> something).
>>
>> Can you give me a little bit idea about the encoding technique that is
>> being used, and possibly how to decode and get the intermediate outputs.
>>
>> Thanks and regards,
>>
>> Abhishek Sinha
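
For anyone following the thread, the dictionary encoding Luke describes can be illustrated with a toy example. This is only a minimal sketch of the general idea (map each distinct dimension value to a small integer ID, store the IDs, and map them back at query time). It is not Kylin's actual dictionary implementation or file format, and the class name `ToyDictionary` and the sample values are made up for illustration.

```python
# Toy dictionary encoder: illustrates the general technique only.
# This is NOT Kylin's real dictionary class or on-disk format.
class ToyDictionary:
    def __init__(self, values):
        # Sort the distinct values so that smaller values get smaller IDs.
        self._id_to_value = sorted(set(values))
        self._value_to_id = {v: i for i, v in enumerate(self._id_to_value)}

    def encode(self, value):
        """Return the integer ID stored in place of the original string."""
        return self._value_to_id[value]

    def decode(self, value_id):
        """Return the original string for an integer ID (query phase)."""
        return self._id_to_value[value_id]


# Hypothetical dimension column with repeated string values.
cities = ["Shanghai", "Beijing", "Shanghai", "Seattle", "Beijing"]

dictionary = ToyDictionary(cities)
encoded = [dictionary.encode(c) for c in cities]   # small integers are stored
decoded = [dictionary.decode(i) for i in encoded]  # strings returned to the client

print(encoded)
print(decoded)
```

The point of the sketch is that the stored rows hold only the integer IDs, so they are unreadable without the dictionary that produced them, which would be consistent with the intermediate files looking like raw bytes when opened directly.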
