RDDs are immutable. Calling .repartition does not modify the RDD in place; it
returns *a new RDD* with the requested number of partitions, which you need to
capture in a new variable. (Similarly, .coalesce can only reduce the number of
partitions unless you pass shuffle = true, which is why file.coalesce(100) had
no effect.)
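A minimal sketch of the fix, reusing the path and partition count from the
thread below (this assumes a Spark 1.x sqlContext and a SchemaRDD-style result
where .partitions is available directly):

```scala
// parquetFile creates one partition per underlying Parquet file.
val file = sqlContext.parquetFile("hdfs://node1/user/hive/warehouse/file.parquet")

// repartition does not mutate `file`; capture the returned RDD instead.
val repartitioned = file.repartition(127)

println(file.partitions.size)          // unchanged: one partition per file
println(repartitioned.partitions.size) // 127
```

The same applies to coalesce, filter, map, etc.: every transformation returns a
new RDD, so the result must be assigned or chained.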
On Tue, Apr 14, 2015 at 3:59 AM, Masf wrote:
> Hi.
>
> It doesn't work.
>
> val file = sqlContext.parquetFile("hdfs://node1/user/hive/warehouse/file.parquet")
> file.repartition(127)
>
> println(file.partitions.size.toString()) <-- Returns 27!
>
> Regards
>
>
> On Fri, Apr 10, 2015 at 4:50 PM, Felix C
> wrote:
>
>> RDD.repartition(1000)?
>>
>> --- Original Message ---
>>
>> From: "Masf"
>> Sent: April 9, 2015 11:45 PM
>> To: user@spark.apache.org
>> Subject: Increase partitions reading Parquet File
>>
>> Hi
>>
>> I have this statement:
>>
>> val file =
>> sqlContext.parquetFile("hdfs://node1/user/hive/warehouse/file.parquet")
>>
>> This code generates as many partitions as there are files, so I want to
>> increase the number of partitions.
>> I've tested coalesce (file.coalesce(100)) but the number of partitions
>> doesn't change.
>>
>> How can I increase the number of partitions?
>>
>> Thanks
>>
>> --
>> Regards.
>> Miguel Ángel
>>
>
>
>
> --
> Regards.
> Miguel Ángel
>