Try --executor-memory 5g, since you have 8 GB of RAM on each machine.
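For example (a sketch only; the class and jar names below are placeholders, and the right number depends on what else runs on the node):

```shell
# Illustrative: a 5g executor heap on an 8 GB node leaves headroom for
# the OS and the cluster manager's per-executor memory overhead.
spark-submit \
  --master yarn \
  --executor-memory 5g \
  --class com.example.MyApp \
  my-app.jar
```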
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Performance-on-Yarn-tp21729p22603.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
-
At the record-reader level you can pass the file name as the key or the value:

sc.newAPIHadoopRDD(job.getConfiguration,
  classOf[AvroKeyInputFormat[myObject]],
  classOf[AvroKey[myObject]],
  classOf[Text])  // the Text value can carry your file name

AvroKeyInputFormat extends InputFormat and implements createRecordReader, which is where the file name can be emitted.
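Concretely, one way to surface the file name as the value is to wrap Avro's stock reader. This is a sketch, assuming avro-mapred and Hadoop on the classpath; AvroKeyFileNameReader is a hypothetical name, and you would also need a matching InputFormat whose createRecordReader returns it:

```scala
// Sketch: a RecordReader that keeps Avro keys but replaces the value
// with the name of the file the split came from.
import org.apache.avro.Schema
import org.apache.avro.mapred.AvroKey
import org.apache.avro.mapreduce.AvroKeyRecordReader
import org.apache.hadoop.io.Text
import org.apache.hadoop.mapreduce.{InputSplit, RecordReader, TaskAttemptContext}
import org.apache.hadoop.mapreduce.lib.input.FileSplit

class AvroKeyFileNameReader[T](readerSchema: Schema)
    extends RecordReader[AvroKey[T], Text] {

  private val delegate = new AvroKeyRecordReader[T](readerSchema)
  private var fileName: Text = new Text()

  override def initialize(split: InputSplit, ctx: TaskAttemptContext): Unit = {
    // FileSplit tells us which file this split was cut from.
    fileName = new Text(split.asInstanceOf[FileSplit].getPath.toString)
    delegate.initialize(split, ctx)
  }

  override def nextKeyValue(): Boolean = delegate.nextKeyValue()
  override def getCurrentKey: AvroKey[T] = delegate.getCurrentKey
  override def getCurrentValue: Text = fileName // value carries the file name
  override def getProgress: Float = delegate.getProgress
  override def close(): Unit = delegate.close()
}
```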
Hi Steve,
I did Spark 1.3.0 PageRank benchmarking on soc-LiveJournal1 on a 4-node
cluster with 16, 16, 8, and 8 GB of RAM respectively. The cluster has 4 workers,
including the master, with 4, 4, 2, and 2 CPUs.
I set executor memory to 3g and driver memory to 5g.
No. of iterations --> GraphX (mins)
1 --> 1
2
To Spark-admin,
I like the DataFrames in version 1.3. Is there any plan to integrate them
with GraphX in 1.4 or later?
Currently I have a lot of information in the vertex properties; if I could use
DataFrames to hold the properties instead of a VertexRDD, that would help me a lot.
Hi All,
I have a big physical machine with 16 CPUs, 256 GB RAM, and a 20 TB hard disk. I just
need to know what the best way to set up a Spark cluster on it would be.
If I need to process TBs of data, then:
1. Only one machine, which contains the driver, executor, job tracker and task
tracker -- everything.
2. crea
I am able to run it without any issue both standalone and on the cluster:
spark-submit --class org.graphx.test.GraphFromVerteXEdgeArray \
  --executor-memory 1g --driver-memory 6g --master spark://VM-Master:7077 \
  spark-graphx.jar
The code is exactly the same as above.
Instead of setting it in SparkConf, set it on the SparkContext's Hadoop configuration:
sc.hadoopConfiguration.set(key, value)
and extract the same key from the JobContext.
--Harihar
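That pattern can be sketched end to end as follows ("developer"/"MyName" are example values, and Spark plus Hadoop must be on the classpath):

```scala
// Sketch: ship a user variable to a custom InputFormat via the Hadoop
// configuration instead of SparkConf.
import org.apache.hadoop.mapreduce.JobContext
import org.apache.spark.{SparkConf, SparkContext}

object ConfigPassing {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("demo"))

    // Driver side: keys set on sc.hadoopConfiguration travel with the
    // configuration handed to newAPIHadoopRDD, so tasks can see them.
    sc.hadoopConfiguration.set("developer", "MyName")
    // ... build your newAPIHadoopRDD from sc.hadoopConfiguration ...
  }

  // InputFormat/RecordReader side: read it back from the JobContext.
  def readDeveloper(context: JobContext): String =
    context.getConfiguration.get("developer")
}
```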
I'm also facing the same issue. This is the third time: whenever I post anything
it is never accepted by the community, and at the same time I get a failure mail at
my registered mail id.
And when I click the "subscribe to this mailing list" link, I don't get any new
subscription mail in my inbox.
Please, anyone
Hi,
I have written a custom InputFormat and RecordReader for Spark, and I need to use
user variables from the Spark client program.
I added them in SparkConf:
val sparkConf = new
  SparkConf().setAppName(args(0)).set("developer", "MyName")
*and in the InputFormat class*
protected boolean isSplitable
Spark doesn't support it, but this connector is open source; you can get it
from GitHub.
The difference between these two DBs depends on what type of solution
you are looking for. Please refer to this link:
http://blog.nahurst.com/visual-guide-to-nosql-systems
FYI, from the list of NoSQL in
Do set executor memory as well. You have RAM in each node, and storage; set it
to 6 GB or more, and if required, increase driver memory from 10 GB.
--Harihar
Hi,
How do I set a preferred location for an InputSplit in Spark standalone?
I have data on a specific machine and I want to read it using splits
created for that node only, by assigning some property that helps Spark
create the split on that node only.
I have written a custom InputSplit and I want to pin it to the specific node
where my data is stored, but currently the split can start at any node and pick
up data from a different node in the cluster. Any suggestion on how to set the host in
Spark?
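One way to express this with Hadoop's newer API is to report the host from the split's getLocations. A sketch, with a hypothetical class name; note that Spark treats these locations as locality *preferences*, not guarantees, so the task can still run elsewhere once spark.locality.wait expires:

```scala
// Sketch: a split that advertises a single preferred host.
import org.apache.hadoop.mapreduce.InputSplit

class HostPinnedSplit(val host: String, val splitLength: Long)
    extends InputSplit with java.io.Serializable {
  override def getLength: Long = splitLength
  // Spark standalone reads this as a locality hint when scheduling
  // the task for this split.
  override def getLocations: Array[String] = Array(host)
}
```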