Regards
Bala
On 14-Jul-2016 2:37 pm, "Sun Rui" wrote:
> Where is argsList defined? is Launcher.main() thread-safe? Note that if
> multiple folders are processed in a node, multiple threads may concurrently
> run in the executor, each processing a folder.
>
> On Jul 14, 2016,
>
> Hello Ted,
>
> Thanks for the response. Here is the additional information.
> I am using spark 1.6.1 (spark-1.6.1-bin-hadoop2.6)
>
>
>
> Here is the code snippet
>
>
>
>
>
> JavaRDD<String> add = jsc.parallelize(listFolders, listFolders.size());
>
> JavaRDD<Integer> test = add.map(new Function<String, Integer>()
Hello
In one of my use cases, I need to process a list of folders in parallel. I
used
sc.parallelize(list, list.size).map(/* logic to process the folder */).
I have a six-node cluster and there are six folders to process. Ideally I
expect each of my nodes to process one folder. But I see that a no
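On the question of how `list.size` partitions relate to folders: Spark slices a parallelized collection positionally, so `parallelize(list, list.size)` yields exactly one element per partition. A plain-Java sketch of that slicing (a simplification of what Spark's `ParallelCollectionRDD` does internally, not Spark source):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SliceSketch {
    // Plain-Java sketch of how sc.parallelize(list, numSlices) splits a
    // collection into partitions: positional slicing, simplified from
    // Spark's ParallelCollectionRDD (an approximation, not Spark source).
    static <T> List<List<T>> slice(List<T> seq, int numSlices) {
        List<List<T>> slices = new ArrayList<>();
        for (int i = 0; i < numSlices; i++) {
            int start = (i * seq.size()) / numSlices;
            int end = ((i + 1) * seq.size()) / numSlices;
            slices.add(seq.subList(start, end));
        }
        return slices;
    }

    public static void main(String[] args) {
        List<String> folders = Arrays.asList("f1", "f2", "f3", "f4", "f5", "f6");
        List<List<String>> parts = slice(folders, folders.size());
        for (List<String> p : parts) {
            System.out.print(p.size() + " ");  // prints "1 1 1 1 1 1 "
        }
    }
}
```

Note that one partition per folder does not by itself put one partition on each node: task placement depends on how many executors and cores per executor are available, so six single-element partitions can still land on fewer than six machines.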
Hello,
I have one Apache Spark based simple use case that processes two datasets.
Each dataset takes about 5-7 min to process. I am doing this processing
inside the sc.parallelize(datasets) { } block. While the first dataset is
processed successfully, the processing of the second dataset is not started by Spark
> val folder = iter.next
> val status: Int =
> Seq(status).toIterator
> }
>
> On Jun 30, 2016, at 16:42, Balachandar R.A.
> wrote:
>
> Hello,
>
> I have some 100 folders. Each folder contains 5 files. I have an
> executable that processes one folder. The executable is a black box
Hello,
I have some 100 folders. Each folder contains 5 files. I have an executable
that processes one folder. The executable is a black box and hence it cannot
be modified. I would like to process the 100 folders in parallel using Apache
Spark so that I should be able to spawn a map task per folder. Can a
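A runnable sketch of the per-folder step such a map task would perform: launch the black-box executable as an external process on one folder path and keep its exit code. The executable name here is a placeholder (`true` is used so the sketch runs anywhere); the real path would be supplied by the use case:

```java
public class RunFolderSketch {
    // Sketch of the body a map task could run for one folder: launch the
    // black-box executable as an external process and capture its exit
    // code. "true" is a stand-in for the real executable path.
    static int processFolder(String folder) throws Exception {
        Process p = new ProcessBuilder("true", folder).start();
        return p.waitFor();  // exit status of the executable
    }

    public static void main(String[] args) throws Exception {
        int status = processFolder("/data/folder1");
        System.out.println(status);  // prints "0" (`true` always exits 0)
    }
}
```

Inside `mapPartitions`, this is essentially what the quoted `val status: Int = ...; Seq(status).toIterator` fragment does: one external invocation per folder, with the exit code returned as the partition's result.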
I am new to GraphX and exploring the example flight data analysis found
online:
http://www.sparktutorials.net/analyzing-flight-data:-a-gentle-introduction-to-graphx-in-spark
I tried calculating inDegrees (to understand how many incoming flights an
airport has), but I see a value which corresponds to uniqu
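One thing worth checking here: `graph.inDegrees` counts incoming edges, and parallel edges (the same route flown more than once) each count, so the result need not equal the number of unique origin airports. A plain-Java sketch of the same computation over an edge list (airport codes are illustrative):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class InDegreeSketch {
    // Plain-Java sketch of what graph.inDegrees computes over flight
    // edges: an airport's in-degree is its number of incoming edges, and
    // duplicate (parallel) edges each count toward the total.
    static Map<String, Integer> inDegrees(List<String[]> flights) {
        Map<String, Integer> degrees = new HashMap<>();
        for (String[] edge : flights) {  // edge = {origin, destination}
            degrees.merge(edge[1], 1, Integer::sum);
        }
        return degrees;
    }

    public static void main(String[] args) {
        List<String[]> flights = List.of(
            new String[]{"SFO", "JFK"}, new String[]{"LAX", "JFK"},
            new String[]{"SFO", "JFK"}, new String[]{"JFK", "SFO"});
        System.out.println(inDegrees(flights).get("JFK"));  // prints "3"
    }
}
```

To count unique origin airports instead, deduplicate the edge list first (here JFK would then have an in-degree of 2, since the duplicate SFO->JFK edge collapses).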
Thanks... Will look into that
- Bala
On 28 January 2016 at 15:36, Sahil Sareen wrote:
> Try Neo4j for visualization; GraphX does a pretty good job at distributed
> graph processing.
>
> On Thu, Jan 28, 2016 at 12:42 PM, Balachandar R.A. <
> balachandar...@gmail.com> wrot
Hi
I am new to GraphX. I have a simple CSV file which I could load and compute a
few graph statistics from. However, I am not sure whether it is possible to
create and show a graph (for visualization purposes) using GraphX. Any pointer
to a tutorial or information connected to this will be really helpful.
Thanks
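GraphX itself has no rendering; a common workaround is to export vertices and edges into a format a desktop tool such as Gephi can open. A minimal sketch (plain collections stand in for `graph.vertices` / `graph.edges` collected to the driver, and the GEXF layout is a hand-written approximation, not a library call):

```java
import java.util.List;
import java.util.Map;

public class GexfSketch {
    // Minimal GEXF export sketch: write vertices and edges as XML that a
    // visualization tool (e.g. Gephi) can open. Plain collections stand
    // in for graph.vertices / graph.edges collected to the driver.
    static String toGexf(Map<Long, String> vertices, List<long[]> edges) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n")
          .append("<gexf xmlns=\"http://www.gexf.net/1.2draft\" version=\"1.2\">\n")
          .append("<graph defaultedgetype=\"directed\">\n<nodes>\n");
        for (Map.Entry<Long, String> v : vertices.entrySet()) {
            sb.append("<node id=\"").append(v.getKey())
              .append("\" label=\"").append(v.getValue()).append("\"/>\n");
        }
        sb.append("</nodes>\n<edges>\n");
        for (int i = 0; i < edges.size(); i++) {
            sb.append("<edge id=\"").append(i)
              .append("\" source=\"").append(edges.get(i)[0])
              .append("\" target=\"").append(edges.get(i)[1]).append("\"/>\n");
        }
        return sb.append("</edges>\n</graph>\n</gexf>\n").toString();
    }

    public static void main(String[] args) {
        String gexf = toGexf(Map.of(1L, "A", 2L, "B"), List.of(new long[]{1L, 2L}));
        System.out.println(gexf.contains("<edge id=\"0\" source=\"1\" target=\"2\"/>"));  // prints "true"
    }
}
```

The resulting string would be written to a file and opened in the visualization tool; this keeps GraphX for the distributed computation and hands only the (much smaller) result to the renderer.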
Hello users,
In one of my use cases, I need to launch a Spark job from spark-shell. My
input file is in HDFS, and I am using NewHadoopRDD to construct an RDD out of
this input file as it uses a custom input format.
val hConf = sc.hadoopConfiguration
val job = new Job(hConf)
FileInputForm
> ...dictionary and map those values to numbers?
>
> Cheers
> Guillaume
>
> On 5 November 2015 at 09:54, Balachandar R.A.
> wrote:
>
>> Hi
>>
>>
>> I am new to Spark MLlib and machine learning. I have a csv file that
>> consists of around 100 thousand
Hi
I am new to Spark MLlib and machine learning. I have a CSV file that
consists of around 100 thousand rows and 20 columns. Of these 20 columns,
10 contain string values. The values in these columns are not necessarily
unique. They are kind of categorical; that is, the values could be one
among
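The usual first step here is exactly the dictionary idea from the reply: map each distinct string in a column to a numeric index, most frequent value first (the ordering Spark ML's later `StringIndexer` also uses). A plain-Java sketch of that encoding for one column; column values here are illustrative:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class IndexColumnSketch {
    // Dictionary-encoding sketch for one categorical string column: map
    // each distinct value to a numeric index, most frequent value -> 0.0,
    // ties broken alphabetically. This mirrors the dictionary suggestion
    // in the reply; it is not Spark API code.
    static Map<String, Double> buildDict(List<String> values) {
        Map<String, Long> counts = new HashMap<>();
        for (String v : values) counts.merge(v, 1L, Long::sum);
        List<String> ordered = new ArrayList<>(counts.keySet());
        ordered.sort(Comparator.comparing((String v) -> -counts.get(v))
                               .thenComparing(v -> v));
        Map<String, Double> dict = new LinkedHashMap<>();
        for (int i = 0; i < ordered.size(); i++) dict.put(ordered.get(i), (double) i);
        return dict;
    }

    public static void main(String[] args) {
        List<String> column = List.of("red", "blue", "red", "green");
        Map<String, Double> dict = buildDict(column);
        for (String v : column) System.out.print(dict.get(v) + " ");
        // prints "0.0 1.0 0.0 2.0 "
    }
}
```

The encoded column can then feed MLlib's vector-based APIs; for tree-based models the indices can be used directly, while for linear models a further one-hot expansion of the indices is the standard refinement.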
I made a stupid mistake, it seems. I supplied the --master option with the
spark URL in my launch command, and this error is gone.
Thanks for pointing out possible places for troubleshooting.
Regards
Bala
On 02-Nov-2015 3:15 pm, "Balachandar R.A." wrote:
> No.. I am not using yarn
No, I am not using YARN. YARN is not running in my cluster, so it is a
standalone one.
Regards
Bala
On 02-Nov-2015 3:11 pm, "Jean-Baptiste Onofré" wrote:
> Just to be sure: you use yarn cluster (not standalone), right ?
>
> Regards
> JB
>
> On 11/02/2015 10:3
-- Forwarded message --
From: "Balachandar R.A."
Date: 02-Nov-2015 12:53 pm
Subject: Re: Error : - No filesystem for scheme: spark
To: "Jean-Baptiste Onofré"
Cc:
> HI JB,
> Thanks for the response,
> Here is the content of my spark-defaults.conf
>
Can someone tell me at what point this error could come?
In one of my use cases, I am trying to use a Hadoop custom input format. Here
is my code.
val hConf: Configuration = sc.hadoopConfiguration
hConf.set("fs.hdfs.impl",
classOf[org.apache.hadoop.hdfs.DistributedFileSystem].getName)
hConf.set("fs
Hello,
I have developed a Hadoop based solution that processes a binary file. This
uses the classic Hadoop MR technique. The binary file is about 10GB and
divided into 73 HDFS blocks, and the business logic, written as a map
process, operates on each of these 73 blocks. We have developed a custom InputForma