Thanks, Evan and Andy:
Here is a very functional version. I need to improve the syntax, but this works
very well: the initial version takes around 36 hours on 9 machines with 8
cores, and this version takes 36 minutes on a cluster of 7 machines with 8
cores:
object SimpleApp {
def
Hi Evan,
here is an improved version, thanks for your advice. But you know, the last step,
the saveAsTextFile, is very slow. :(
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
import java.net.URL
import java.text.SimpleDateFormat
import
:
Connection refused: axaxaxa-cloudera-s05.xxxnetworks.com/10.5.96.42:43942
On Mon, Sep 15, 2014 at 1:30 PM, Abel Coronado Iruegas
acoronadoirue...@gmail.com wrote:
Here an example of a working code that takes a csv with lat lon points and
intersects with polygons of municipalities of Mexico, generating a new
version of the file with new attributes.
Do you think it could be improved?
Thanks.
The Code:
import org.apache.spark.SparkContext
import
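The archived code above is cut off, but the core step the message describes, checking whether a lat/lon point falls inside a municipality polygon, can be sketched in plain Scala with a standard ray-casting test. This is a hypothetical helper for illustration, not the original job:

```scala
// Ray-casting point-in-polygon test: count how many polygon edges a
// horizontal ray from the point crosses; an odd count means "inside".
def pointInPolygon(lon: Double, lat: Double, poly: Seq[(Double, Double)]): Boolean = {
  var inside = false
  var j = poly.length - 1
  for (i <- poly.indices) {
    val (xi, yi) = poly(i)
    val (xj, yj) = poly(j)
    val crosses = (yi > lat) != (yj > lat) &&
      lon < (xj - xi) * (lat - yi) / (yj - yi) + xi
    if (crosses) inside = !inside
    j = i
  }
  inside
}

// Municipality boundaries would come from real polygon data; here a unit
// square stands in as a toy polygon.
val square = Seq((0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0))
val inside = pointInPolygon(0.5, 0.5, square)   // true
val outside = pointInPolygon(2.0, 2.0, square)  // false
```

In the job described, each CSV row's coordinates would be tested against the municipality polygons (ideally via a spatial index rather than a linear scan) to attach the new attributes.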
Hi Yifan
This works for me:
export SPARK_JAVA_OPTS="-Xms10g -Xmx40g -XX:MaxPermSize=10g"
export ADD_JARS=/home/abel/spark/MLI/target/MLI-assembly-1.0.jar
export SPARK_MEM=40g
./spark-shell
Regards
On Mon, Jul 21, 2014 at 7:48 AM, Yifan LI iamyifa...@gmail.com wrote:
Hi,
I am trying to load
Hi everybody
We have a Hortonworks cluster with many nodes, and we want to test a
deployment of Spark. What's the recommended path to follow?
I mean, we can compile the sources on the NameNode, but I don't really
understand how to pass the executable jar and configuration to the rest of
the nodes.
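One common path (a sketch, assuming a Spark 1.x build with YARN support; the paths and class name are hypothetical) is to build a single assembly jar and let spark-submit ship it to the worker nodes through YARN, so nothing has to be copied to each node by hand:

```shell
# Point Spark at the cluster's Hadoop/YARN configuration (hypothetical path).
export HADOOP_CONF_DIR=/etc/hadoop/conf

# YARN distributes the application jar to the executors automatically.
./bin/spark-submit \
  --master yarn-cluster \
  --class SimpleApp \
  --num-executors 8 \
  --executor-memory 4g \
  target/simple-app-assembly.jar
```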
Hi everybody
Can someone tell me if it is possible to read and filter a 60 GB file of
tweets (JSON docs) in a standalone Spark deployment that runs on a single
machine with 40 GB RAM and 8 cores?
I mean, is it possible to configure Spark to work with some amount of
memory (20 GB) and the rest
The question about using a combination of resources (memory
processing and disk processing) still remains.
Thanks !!
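Whatever the memory configuration, the per-record filtering step itself is simple; here is a minimal plain-Scala sketch of it (no Spark dependency; the keyword and sample docs are hypothetical):

```scala
// One JSON tweet per line; a crude case-insensitive substring match
// stands in for real JSON parsing in this sketch.
def filterTweets(lines: Iterator[String], keyword: String): Iterator[String] =
  lines.filter(_.toLowerCase.contains(keyword.toLowerCase))

val sample = Iterator(
  """{"text": "Spark makes this fast"}""",
  """{"text": "unrelated tweet"}"""
)
val kept = filterTweets(sample, "spark").toList  // keeps only the first doc
```

Because a filter is applied record by record, the whole file never needs to fit in memory at once, which is what makes the 60 GB file / 40 GB RAM case plausible in principle.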
On Fri, Jul 4, 2014 at 9:49 AM, Abel Coronado Iruegas
acoronadoirue...@gmail.com wrote:
Hi everybody
Can someone tell me if it is possible to read and filter a 60 GB file of
tweets (JSON docs
Thank you, Databricks rules!
On Fri, Jul 4, 2014 at 1:58 PM, Michael Armbrust mich...@databricks.com
wrote:
Is sqlContext.jsonFile("data.json") already available in the
master branch?
Yes, and it will be available in the soon-to-come 1.0.1 release.
But the question about