[Problem Solved] Re: Spark partition size tuning

2016-01-27 Thread Jia Zou
Hi all, the problem has been solved.
I mistakenly used tachyon.user.block.size.bytes instead of
tachyon.user.block.size.bytes.default. It works now. Sorry for the
confusion, and thanks again to Gene!

Best Regards,
Jia
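For readers finding this thread in the archives, a minimal sketch of the fix, assuming a default Tachyon install with TACHYON_HOME set (path and layout may differ on your deployment):

```shell
# Sketch: set the Tachyon client block size to 128 MB cluster-wide.
# Property name is the one confirmed in this thread; TACHYON_HOME is assumed.
cat >> "$TACHYON_HOME/conf/tachyon-site.properties" <<'EOF'
tachyon.user.block.size.bytes.default=134217728
EOF
```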



Re: Spark partition size tuning

2016-01-27 Thread Jia Zou
Hi, Gene,

Thanks for your suggestion.
However, even after I set tachyon.user.block.size.bytes=134217728 (and I can
see that value in the web console), the files I load into Tachyon via
copyToLocal still have a 512MB block size.
Do you have any other suggestions?

Best Regards,
Jia



Re: Spark partition size tuning

2016-01-26 Thread Gene Pang
Hi Jia,

If you want to change the Tachyon block size, you can set the
tachyon.user.block.size.bytes.default parameter (
http://tachyon-project.org/documentation/Configuration-Settings.html). You
can set it via extraJavaOptions per job, or by adding it to
tachyon-site.properties.

I hope that helps,
Gene
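Gene's per-job option might look like the following spark-submit sketch; the block-size value, application class, and jar name are placeholders:

```shell
# Sketch: pass the Tachyon block size to a single Spark job via JVM options.
# Property name is from Gene's reply; com.example.MyApp and myapp.jar are
# hypothetical stand-ins for your own application.
spark-submit \
  --conf "spark.driver.extraJavaOptions=-Dtachyon.user.block.size.bytes.default=134217728" \
  --conf "spark.executor.extraJavaOptions=-Dtachyon.user.block.size.bytes.default=134217728" \
  --class com.example.MyApp \
  myapp.jar
```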



Re: Spark partition size tuning

2016-01-26 Thread Pavel Plotnikov
Hi,
Maybe sc.hadoopConfiguration.setInt("dfs.blocksize", blockSize) helps
you.

Best Regards,
Pavel
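Since Spark forwards any spark.hadoop.* conf entries into the Hadoop Configuration, Pavel's suggestion can also be sketched as a submit-time flag instead of a code change (values, class, and jar are illustrative):

```shell
# Sketch: equivalent of sc.hadoopConfiguration.setInt("dfs.blocksize", ...),
# set at submit time via the spark.hadoop.* prefix.
spark-submit \
  --conf "spark.hadoop.dfs.blocksize=134217728" \
  --class com.example.MyApp \
  myapp.jar
```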



Fwd: Spark partition size tuning

2016-01-25 Thread Jia Zou
Dear all,

First, an update: the local file system data partition size can be tuned
by:
sc.hadoopConfiguration().setLong("fs.local.block.size", blocksize)

However, I also need to tune the Spark data partition size for input data
that is stored in Tachyon (the default is 512MB), but the above method
doesn't work for Tachyon data.

Do you have any suggestions? Thanks very much!

Best Regards,
Jia




Spark partition size tuning

2016-01-21 Thread Jia Zou
Dear all!

When using Spark to read from the local file system, the default partition
size is 32MB. How can I increase the partition size to 128MB to reduce the
number of tasks?

Thank you very much!

Best Regards,
Jia
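The link between block size and task count can be sketched directly: Spark typically creates about one partition (and hence one task) per input block, so for a hypothetical 1 GiB input file:

```shell
# Roughly one Spark task per file-system block (ceiling division).
FILE_BYTES=$((1024 * 1024 * 1024))   # 1 GiB input file (assumed size)
BLOCK_32=$((32 * 1024 * 1024))       # 32 MB default local block size
BLOCK_128=$((128 * 1024 * 1024))     # desired 128 MB block size
TASKS_32=$(( (FILE_BYTES + BLOCK_32 - 1) / BLOCK_32 ))
TASKS_128=$(( (FILE_BYTES + BLOCK_128 - 1) / BLOCK_128 ))
echo "32MB blocks -> $TASKS_32 tasks; 128MB blocks -> $TASKS_128 tasks"
```

So quadrupling the block size cuts the task count by a factor of four (32 tasks down to 8 for this file size).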