Hi, Hive experts,
I need to query list/map of thrift struct. (I'm using Qubole)
1) The .thrift file is like below:
*enum DataType {*
* ET_ONE = 1,*
* ET_TWO = 2,*
* ET_THREE = 3,*
* ...*
*}*
*struct Result {*
* 1: optional i64 dataId,*
* // more fields, but it doesn't matter*
* 2: optional string fieldOne,*
* 3: optional string fieldTwo,*
* // ...*
*}*
*// in DB*
*struct Entry {*
* 1: string queryId,*
* 2: list<Result> resultList,*
* 3: map<i64, DataType> filterMap,*
* // ...*
*}*
2) The data in DB is like:
queryId: "asdf",
resultList: { {dataId: 1, fieldOne: "aa", fieldTwo: "bb", ...}, {dataId:
2, fieldOne: "aa", fieldTwo: "bb", ...}, ... }
filterMap: { {11: ET_ONE}, {13: ET_THREE}, {22: ET_TWO}, ... }
3)
I'd like to query Hive to get some result like:
queryId, string-format-of-resultList, string-format-of-filterMap
However,
* select queryId, resultList, filterMap from Entry*
will return some binary format of Result and concat-ing k-v format of
filterMap.
One workaround is using explode:
*SELECT queryId, ll.dataId, k, vFROM ad_query*
* LATERAL VIEW explode(resultList) ltable as ll LATERAL VIEW
explode(filterMap) kvtable as k, vWHERE size(filterMap) > 0*
But one row in original result will be returned as multiple rows after
using explode, I have to merge them by queryId afterwards. Like below
asdf, 1, 11, ET_ONE
asdf, 1, 13, ET_THREE
asdf, 1, 22, ET_TWO
asdf, 2, 11, ET_ONE
asdf, 2, 13, ET_THREE
asdf, 2, 22, ET_TWO
Any better native Hive way to return the human readable format for ARRAY of
struct and map?
Thanks for your time!