I recently started using Drill to explore data on our Hadoop cluster, but I have a problem when running queries against our data source: the queries always fail due to malformed JSON data. The data itself is pretty raw, so it may contain malformed JSON records here and there. Cleaning it up beforehand is difficult because of the data size (hundreds of gigabytes of text files).
My question is: is there any way to exclude/skip the malformed records and collect them into a separate result, the way Spark/Shark does? How can I solve this problem elegantly? Thank you. Regards, Fritz
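To illustrate the behavior I'm after, here is a minimal plain-Python sketch (the sample records are hypothetical, and it assumes one JSON object per line) of the split I'd like the query engine to perform, routing malformed lines into a separate result instead of failing the whole job:

```python
import json

def split_records(lines):
    """Separate parseable JSON records from malformed ones."""
    good, bad = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            good.append(json.loads(line))
        except ValueError:  # json.JSONDecodeError subclasses ValueError
            bad.append(line)  # keep the raw line for a separate "rejects" output
    return good, bad

# Hypothetical sample: one record is malformed (missing closing brace).
sample = [
    '{"id": 1, "name": "alice"}',
    '{"id": 2, "name": "bob"',
    '{"id": 3, "name": "carol"}',
]
good, bad = split_records(sample)
```

Of course, a standalone pass like this over hundreds of gigabytes is exactly the cleaning step I'd like to avoid, which is why I'm hoping the engine can do it inline.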
