LARGE COLLECT

2015-10-26 Thread shahid qadri
Hi Folks, this might not sound appropriate, but I want to collect a large dataset (approx. 15 GB), do some processing on the driver, and then broadcast the result back to each node. Is there any option to collect data off-heap, the way an RDD can be stored off-heap, i.e. to collect the data onto the Tachyon FS?
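For context, in the Spark 1.x line the OFF_HEAP storage level for RDDs could be backed by an external block store such as Tachyon. A sketch of the relevant spark-defaults.conf entries (the master host and directory are placeholders, and the property names should be checked against your Spark release, since they changed between 1.x versions):

```properties
# Assumed Spark 1.5-era settings; verify names against your release.
spark.externalBlockStore.url      tachyon://tachyon-master:19998
spark.externalBlockStore.baseDir  /spark_offheap
```

Note that collect() itself still materializes the result in the driver's JVM heap; StorageLevel.OFF_HEAP only affects cached RDD blocks, so for a ~15 GB driver-side structure a join or map-side lookup may work out better than collect-and-broadcast.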

repartition vs partitionBy

2015-10-17 Thread shahid qadri
Hi folks, I need to repartition a large dataset (around 300 GB). Some partitions hold far more data than others (data skew). I have pair RDDs of the form [({},{}),({},{}),({},{})]. What is the best way to solve this problem?
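One common way to tame skewed pair RDDs is key salting: append a random suffix to the hottest keys so their records spread across several partitions, then aggregate in two rounds. A minimal pure-Python sketch of the salting step (the hot-key set and salt count are illustrative; in PySpark the salted pairs would then go through partitionBy/reduceByKey):

```python
import random

def salt_key(key, hot_keys, n_salts=10):
    """Return a composite key that spreads known-hot keys over n_salts buckets.

    Cold keys keep a fixed salt of 0, so only skewed keys fan out.
    """
    if key in hot_keys:
        return (key, random.randrange(n_salts))
    return (key, 0)

# Example: records for the hot key land in up to 10 different buckets,
# while the cold key stays in a single bucket.
salted = [salt_key(k, hot_keys={"hot"}) for k in ["hot"] * 5 + ["cold"]]
```

After a first reduceByKey on the salted keys, strip the salt and reduce a second time to get the final per-key results.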

Build Failure

2015-10-08 Thread shahid qadri
Hi, I tried to build the latest master branch of Spark with build/mvn -DskipTests clean package. Reactor summary:
[INFO] Spark Project Parent POM ... SUCCESS [03:46 min]
[INFO] Spark Project Test Tags ... SUCCESS [01:02 min]
[INFO] Spark Project Laun

Re: API to run spark Jobs

2015-10-06 Thread shahid qadri
For example, EMR in AWS has a job-submit UI.

> Spark submit just calls a REST API; you could build any UI you want on top of that.
>
> On Tue, Oct 6, 2015 at 9:37 AM, shahid qadri <shahidashr...@icloud.com> wrote:
>> Hi Folks
>>
>> How i can

API to run spark Jobs

2015-10-06 Thread shahid qadri
Hi Folks, how can I submit my Spark app (Python) to the cluster without using spark-submit? I actually need to invoke jobs from a UI.
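For what it's worth, the standalone master exposes a REST submission endpoint (typically http://&lt;master&gt;:6066/v1/submissions/create) that spark-submit itself uses in cluster mode. A hedged sketch of assembling such a request body with only the standard library — the versions, paths, and property values below are placeholders, and Python apps may be better served by a gateway such as Livy or spark-jobserver:

```python
import json

def build_submission(app_resource, master_url):
    """Assemble a CreateSubmissionRequest body for the standalone REST API.

    All concrete values used here are illustrative placeholders.
    """
    return {
        "action": "CreateSubmissionRequest",
        "clientSparkVersion": "1.5.1",
        "appResource": app_resource,
        "appArgs": [app_resource],
        "sparkProperties": {
            "spark.master": master_url,
            "spark.app.name": "ui-submitted-job",
        },
        "environmentVariables": {"SPARK_ENV_LOADED": "1"},
    }

# Serialized body that a UI backend would POST to the create endpoint.
body = json.dumps(build_submission("hdfs:///apps/myjob.py", "spark://master:7077"))
```

The UI backend would POST this body to the create endpoint and can then poll .../v1/submissions/status/&lt;submissionId&gt; for progress.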

Custom Partitioner

2015-09-01 Thread shahid qadri
Hi Sparkians, how can we create a custom partitioner in PySpark?
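In PySpark a "custom partitioner" is just a Python function from key to partition index, passed as the second argument of RDD.partitionBy. A small sketch (the routing rule here is made up purely for illustration):

```python
def first_letter_partitioner(key):
    """Route keys starting with a-l to partition 0, everything else to 1."""
    return 0 if str(key)[:1].lower() < "m" else 1

# With a live SparkContext this would be applied as (not run here):
#   pairs.partitionBy(2, first_letter_partitioner)
```

The function must return an int in [0, numPartitions); records whose keys map to the same index end up in the same partition.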

Re: How to efficiently write sorted neighborhood in pyspark

2015-09-01 Thread shahid qadri
> On Aug 25, 2015, at 10:43 PM, shahid qadri wrote:
>
> Any resources on this
>
>> On Aug 25, 2015, at 3:15 PM, shahid qadri wrote:
>>
>> I would like to implement the sorted neighborhood approach in Spark; what is the

Re: How to efficiently write sorted neighborhood in pyspark

2015-08-25 Thread shahid qadri
Any resources on this?
> On Aug 25, 2015, at 3:15 PM, shahid qadri wrote:
>
> I would like to implement the sorted neighborhood approach in Spark; what is the best way to write that in PySpark?

How to efficiently write sorted neighborhood in pyspark

2015-08-25 Thread shahid qadri
I would like to implement the sorted neighborhood approach in Spark; what is the best way to write that in PySpark?
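The core of sorted neighborhood is: sort records by a blocking key, then compare only records that fall within a fixed sliding window. A pure-Python sketch of that windowing (the function and parameter names are mine; in PySpark one would sortBy the blocking key and emit window pairs from mapPartitions, taking extra care with pairs that straddle partition boundaries):

```python
def sorted_neighborhood_pairs(records, key, window=3):
    """Sort by the blocking key and emit candidate pairs within the window."""
    ordered = sorted(records, key=key)
    pairs = []
    for i, rec in enumerate(ordered):
        # Each record is paired with the next (window - 1) records only.
        for other in ordered[i + 1 : i + window]:
            pairs.append((rec, other))
    return pairs

# With window=2 each record is paired only with its immediate neighbor.
cands = sorted_neighborhood_pairs(["ben", "abe", "bob", "ann"], key=lambda r: r, window=2)
```

This keeps the candidate set linear in the number of records instead of quadratic, at the cost of possibly missing matches that sort far apart.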

Re: disabling dynamic date time formatting in python api or globally

2015-02-15 Thread Shahid Qadri
;-)
> Twitter: @dadoonet / @elasticsearchfr / @scrutmydocs
>
> On 15 Feb 2015, at 14:44, Shahid Qadri wrote:
>
>> Guys, I am getting this error:
>>
>> raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
>> RequestError: Tr

disabling dynamic date time formatting in python api or globally

2015-02-15 Thread Shahid Qadri
Guys, I am getting this error:

raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
RequestError: TransportError(400, u'MapperParsingException[failed to parse [SOURCES.DATE_COMP]]; nested: MapperParsingException[failed to parse date field [--], tried
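That MapperParsingException usually means Elasticsearch's dynamic date detection guessed a date type for a field that later receives non-date values (here the literal "--"). Date detection can be disabled in the index mapping; a sketch of such a mapping body (the _default_ mapping applies it to all types in the 1.x API; index and type layout here are placeholders):

```python
import json

# Mapping fragment that disables dynamic date detection for all types.
mapping = {
    "mappings": {
        "_default_": {
            "date_detection": False,
        }
    }
}
body = json.dumps(mapping)
```

With elasticsearch-py this would be passed when (re)creating the index, e.g. es.indices.create(index=..., body=mapping); an existing index needs a reindex, since the mapping of an already-parsed field cannot be changed in place.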