Re: how to make a spark cluster ?

2015-04-21 Thread haihar nahak
I did a performance check on soc-LiveJournal PageRank between my local
machine (8 cores, 16 GB) in local mode and my small cluster (4 nodes, 12
cores, 40 GB), and I found that cluster mode is way faster than local mode,
so I'm confused.
no. of iterations | Local mode (mins) | Cluster mode (mins)
 1                |  20               | 1
 2                |  31.3             | 1.2
 3                |  39.5             | 1.3
 5                |  56.4             | 1.6
10                | 117.26            | 2.6
Based on this, I think I might install a Spark standalone cluster on the
same machine and, instead of setting the master to local[no. of cores], set
it to spark://host:7077.
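
A minimal sketch of that change in Scala, assuming Spark 1.x and a
standalone master already started on the machine (e.g. via
sbin/start-master.sh; 7077 is the standalone master's default port, and the
host name is a placeholder):

  import org.apache.spark.{SparkConf, SparkContext}

  // point the application at the standalone master instead of local mode
  val conf = new SparkConf()
    .setAppName("PageRank")
    .setMaster("spark://host:7077") // instead of .setMaster("local[<cores>]")
  val sc = new SparkContext(conf)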

Please let me know if I'm wrong somewhere.


On Tue, Apr 21, 2015 at 6:27 PM, Reynold Xin  wrote:

> Actually if you only have one machine, just use the Spark local mode.
>
> Just download the Spark tarball, untar it, and set the master to local[N],
> where N = number of cores. You are good to go; there is no job tracker or
> Hadoop setup involved.
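
A minimal sketch of that in Scala, assuming Spark 1.x on an 8-core machine
(the app name is arbitrary):

  import org.apache.spark.{SparkConf, SparkContext}

  // local[8] runs the driver and executors in one JVM with 8 worker threads
  val conf = new SparkConf().setAppName("example").setMaster("local[8]")
  val sc = new SparkContext(conf)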
>
>
> On Mon, Apr 20, 2015 at 3:21 PM, haihar nahak 
> wrote:
>
>> Thank you :)
>>
>> On Mon, Apr 20, 2015 at 4:46 PM, Jörn Franke 
>> wrote:
>>
>>> Hi, if you have just one physical machine, then I would try out Docker
>>> instead of a full VM (a full VM would waste memory and CPU).
>>>
>>> Best regards
>>> On 20 Apr 2015 00:11, "hnahak"  wrote:
>>>
>>>> Hi All,
>>>>
>>>> I have a big physical machine with 16 CPUs, 256 GB RAM, and a 20 TB hard
>>>> disk. I just need to know the best way to set up a Spark cluster.
>>>>
>>>> If I need to process TBs of data, should I:
>>>> 1. Use only one machine, which contains everything: driver, executor, job
>>>> tracker and task tracker.
>>>> 2. Create 4 VMs, each with 4 CPUs and 64 GB RAM.
>>>> 3. Create 8 VMs, each with 2 CPUs and 32 GB RAM.
>>>>
>>>> Please give me your views/suggestions.
>>>>
>>>>
>>>>
>>>> --
>>>> View this message in context:
>>>> http://apache-spark-user-list.1001560.n3.nabble.com/how-to-make-a-spark-cluster-tp22563.html
>>>>
>>
>>
>> --
>> {{{H2N}}}-(@:
>>
>
>


-- 
{{{H2N}}}-(@:


Re: how to make a spark cluster ?

2015-04-20 Thread haihar nahak
Thank you :)

On Mon, Apr 20, 2015 at 4:46 PM, Jörn Franke  wrote:

> Hi, if you have just one physical machine, then I would try out Docker
> instead of a full VM (a full VM would waste memory and CPU).
>
> Best regards
> On 20 Apr 2015 00:11, "hnahak"  wrote:
>
>> Hi All,
>>
>> I have a big physical machine with 16 CPUs, 256 GB RAM, and a 20 TB hard
>> disk. I just need to know the best way to set up a Spark cluster.
>>
>> If I need to process TBs of data, should I:
>> 1. Use only one machine, which contains everything: driver, executor, job
>> tracker and task tracker.
>> 2. Create 4 VMs, each with 4 CPUs and 64 GB RAM.
>> 3. Create 8 VMs, each with 2 CPUs and 32 GB RAM.
>>
>> Please give me your views/suggestions.
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/how-to-make-a-spark-cluster-tp22563.html
>>
>>


-- 
{{{H2N}}}-(@:


Re: How to send user variables from Spark client to custom InputFormat or RecordReader ?

2015-02-22 Thread haihar nahak
Thanks. I extracted the Hadoop configuration, set my arbitrary variable on
it, and was able to read it inside the InputFormat from
JobContext.getConfiguration().
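
A minimal sketch of that approach in Scala, assuming Spark 1.x and the
new-API (mapreduce) InputFormat from the question quoted below:

  import org.apache.spark.{SparkConf, SparkContext}

  val sc = new SparkContext(
    new SparkConf().setAppName("example").setMaster("local[2]"))
  // variables set on the SparkContext's Hadoop configuration are visible
  // to the InputFormat through JobContext.getConfiguration()
  sc.hadoopConfiguration.set("developer", "MyName")
  // inside the custom InputFormat:
  //   context.getConfiguration().get("developer") // returns "MyName"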

On Mon, Feb 23, 2015 at 12:04 PM, Tom Vacek  wrote:

> Variables set on the SparkConf aren't propagated to the Hadoop
> configuration, so your InputFormat can't see them. You can use
> SparkContext's hadoopRDD and create a JobConf (with whatever variables you
> want), and then grab them out of the JobConf in your RecordReader.
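
A minimal sketch of that alternative, assuming the old (mapred) API, an
existing SparkContext sc, and a placeholder input path:

  import org.apache.hadoop.io.{LongWritable, Text}
  import org.apache.hadoop.mapred.{FileInputFormat, JobConf, TextInputFormat}

  // carry the user variable in a JobConf and hand it to hadoopRDD; the
  // InputFormat/RecordReader can read it back from that JobConf
  val jobConf = new JobConf(sc.hadoopConfiguration)
  jobConf.set("developer", "MyName")
  FileInputFormat.setInputPaths(jobConf, "/path/to/input")
  val rdd = sc.hadoopRDD(jobConf, classOf[TextInputFormat],
    classOf[LongWritable], classOf[Text])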
>
> On Sun, Feb 22, 2015 at 4:28 PM, hnahak  wrote:
>
>> Hi,
>>
>> I have written a custom InputFormat and RecordReader for Spark, and I need
>> to use user variables from the Spark client program.
>>
>> I added them in the SparkConf:
>>
>>   val sparkConf = new SparkConf()
>>     .setAppName(args(0))
>>     .set("developer", "MyName")
>>
>> *and in the InputFormat class*
>>
>>   protected boolean isSplitable(JobContext context, Path filename) {
>>     System.out.println("# Developer "
>>         + context.getConfiguration().get("developer"));
>>     return false;
>>   }
>>
>> but it returns *null*. Is there any way I can pass user variables to my
>> custom code?
>>
>> Thanks !!
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/How-to-send-user-variables-from-Spark-client-to-custom-InputFormat-or-RecordReader-tp21755.html
>>
>>
>


-- 
{{{H2N}}}-(@:


Re: Posting to the list

2015-02-22 Thread haihar nahak
I checked it, but I didn't see any mail from the user list. Let me try one
more time.


--Harihar

On Mon, Feb 23, 2015 at 11:50 AM, Ted Yu  wrote:

> bq. I didn't get any new subscription mail in my inbox.
>
> Have you checked your Spam folder ?
>
> Cheers
>
> On Sun, Feb 22, 2015 at 2:36 PM, hnahak  wrote:
>
>> I'm also facing the same issue. This is the third time: whenever I post
>> anything, it is never accepted by the community, and at the same time I
>> get a failure mail at my registered mail ID.
>>
>> And when I click the "subscribe to this mailing list" link, I don't get
>> any new subscription mail in my inbox.
>>
>> Can anyone please suggest the best way to subscribe my email ID?
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Posting-to-the-list-tp21750p21756.html
>>
>>
>


-- 
{{{H2N}}}-(@: