We found that query performance is very poor due to this issue:
https://issues.apache.org/jira/plugins/servlet/mobile#issue/SPARK-11153
We usually filter on the partition key, the date, which is a string; in
1.3.1 this works great, but in 1.5 it has to do a Parquet scan across all
partitions.
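For reference, the slowdown means the partition filter is no longer pruning directories before the Parquet scan. A toy illustration of Hive-style partition pruning in plain Python (not Spark itself; the table layout and file names are hypothetical):

```python
# Toy illustration of Hive-style partition pruning (not Spark's implementation).
# Partition directories encode the key as "date=YYYY-MM-DD" (hypothetical layout).
partitions = [
    "table/date=2015-10-29/part-00000.parquet",
    "table/date=2015-10-30/part-00000.parquet",
    "table/date=2015-10-31/part-00000.parquet",
]

def prune(paths, key, value):
    """Keep only files whose partition directory matches key=value,
    so the scan touches one partition instead of all of them."""
    token = "%s=%s" % (key, value)
    return [p for p in paths if token in p.split("/")]

print(prune(partitions, "date", "2015-10-30"))
# -> ['table/date=2015-10-30/part-00000.parquet']
```

With pruning, a filter on the partition key only lists and reads files under the matching directory; the SPARK-11153 behavior is equivalent to skipping this step and scanning every partition.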
On Oct 31, 2015 at 7:38 PM, "Rex Xiong" <bycha...@gmail.com> wrote:

> Adding this thread back to the mailing list; I forgot to reply-all.
> On Oct 31, 2015 at 7:23 PM, "Michael Armbrust" <mich...@databricks.com> wrote:
>
>> Not that I know of.
>>
>> On Sat, Oct 31, 2015 at 12:22 PM, Rex Xiong <bycha...@gmail.com> wrote:
>>
>>> Good to know that, will have a try.
>>> So is there no easy way to achieve this with a pure Hive approach?
>>> On Oct 31, 2015 at 7:17 PM, "Michael Armbrust" <mich...@databricks.com> wrote:
>>>
>>>> Yeah, this was rewritten to be faster in Spark 1.5.  We use it with
>>>> 10,000s of partitions.
>>>>
>>>> On Sat, Oct 31, 2015 at 7:17 AM, Rex Xiong <bycha...@gmail.com> wrote:
>>>>
>>>>> 1.3.1
>>>>> Is there a lot of improvement in 1.5+?
>>>>>
>>>>> 2015-10-30 19:23 GMT+08:00 Michael Armbrust <mich...@databricks.com>:
>>>>>
>>>>>> We have tried the schema merging feature, but it's too slow; there are
>>>>>>> hundreds of partitions.
>>>>>>>
>>>>>> Which version of Spark?
>>>>>>
>>>>>
>>>>>
>>>>
>>
